Extended Abstract accepted at CHI 2026: Teaching Cobots What to Do by Watching an Expert


DELEGACT: Let the Robot Watch, Then Decide Who Does What

Our extended abstract "Learning to Delegate and Act with DELEGACT: Multimodal Language Models for Task-Level Human–Cobot Planning in Industrial Assembly" has been accepted at CHI 2026 in Barcelona. This is work by Bram Verstappen together with Dries Cardinaels, Danny Leen, Kris Luyten, and Raf Ramakers at the Digital Future Lab (UHasselt - Flanders Make).

The DELEGACT interface showing a proposed human–cobot task split for an assembly workflow

The Problem: Setting Up Human–Robot Teamwork Is Hard

In modern manufacturing, humans and collaborative robots (cobots) work side by side — each doing what they’re best at. People bring situational awareness, dexterity, and judgement; cobots bring consistency, precision on well-defined tasks, and the ability to handle physically demanding or repetitive work without fatigue.

But before any of that collaboration can happen, someone has to figure out the task split: who does what, and in what order? This is called the Robotic Assembly Line Balancing Problem, and solving it is currently a slow, expert-driven process. You need to manually describe every assembly step, encode constraints about safety, weight, and ergonomics, and then reason over all of it — for every new product or process change.

Our Idea: Just Show It

DELEGACT takes a different starting point. Instead of requiring a formal task model, you simply record an expert doing the assembly while talking through what they’re doing. From that narrated video, the system produces a proposed human–cobot task split, ready for the operator to review and adjust.

The pipeline has three stages:

  1. Task extraction — A Vision-Language Model (VLM) watches the demonstration video and, combined with the speech transcript, breaks the assembly down into atomic, self-contained actions. Steps are split until no action can be meaningfully divided further.

  2. Task delegation — An LLM reasons over each extracted task and assigns it to either the human or the cobot, taking into account robot specifications (payload, end effectors, reach), operator competencies, and product information from the bill of materials (part weights, materials, tolerances).

  3. Execution support — For cobot tasks, the system generates step-by-step robot instructions from a predefined behavior library, which the operator can trigger directly from the interface.
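To make the delegation stage concrete, here is a minimal sketch in Python. It is not the paper's implementation: the real system reasons over tasks with an LLM, while this stand-in encodes the same kinds of constraints (robot payload, precision-critical steps, repetitive work) as simple rules. All names, fields, and the payload figure are illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical data model for one extracted atomic action
# (illustrative fields, not the paper's actual schema).
@dataclass
class Task:
    description: str
    part_weight_kg: float      # from the bill of materials
    repetitive: bool           # e.g. bolt tightening
    precision_critical: bool   # e.g. fine alignment or insertion

# Assumed payload spec for the cobot; the real system reads this
# from the robot specification it is given.
COBOT_PAYLOAD_KG = 2.0

def delegate(task: Task) -> str:
    """Rule-based stand-in for the LLM delegation stage."""
    if task.part_weight_kg > COBOT_PAYLOAD_KG:
        return "human"   # part exceeds robot payload
    if task.precision_critical:
        return "human"   # fine manipulation favors human dexterity
    if task.repetitive:
        return "cobot"   # consistency without fatigue
    return "human"       # default to operator judgement

tasks = [
    Task("Tighten bolts on housing", 0.1, True, False),
    Task("Seat bearing with precise alignment", 0.3, False, True),
    Task("Lift compressor tank onto frame", 8.0, False, False),
]
plan = {t.description: delegate(t) for t in tasks}
# e.g. repetitive bolt tightening goes to the cobot, the precision
# step and the heavy lift stay with the human
```

The point of the sketch is only the shape of the decision: each task carries the product and robot facts the LLM conditions on, and the output is a per-task assignment that the interface can then explain and let the operator revise.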

Human Always in the Loop

Crucially, DELEGACT doesn’t hand control to the AI. The interface shows the system’s reasoning for every delegation decision — why a task was assigned to the human or the cobot. Operators can override any assignment, and the system updates related tasks to keep the plan consistent. Tasks can also be “pinned” to preserve assignments while the rest of the plan is regenerated.
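The override-and-pin behavior can be sketched as a small function: pinned tasks keep their current assignment across a regeneration, while everything else is open to reassignment by the planner. This is an assumed simplification for illustration, not the system's actual update logic.

```python
from typing import Callable

def regenerate(plan: dict[str, str],
               pinned: set[str],
               propose: Callable[[str], str]) -> dict[str, str]:
    """Rebuild the plan: pinned tasks keep their assignment,
    all other tasks get a fresh proposal from the planner."""
    return {
        task: plan[task] if task in pinned else propose(task)
        for task in plan
    }

plan = {"pick part": "cobot", "align insert": "human", "tighten bolts": "cobot"}
pinned = {"align insert"}

# Stand-in planner that would reassign everything to the cobot;
# the pinned alignment step stays with the human regardless.
new_plan = regenerate(plan, pinned, propose=lambda task: "cobot")
```

Keeping the pin set separate from the plan itself means a regeneration can be as aggressive as it likes without silently undoing an operator's explicit decisions.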

Two Illustrative Cases

We tested the system on two assembly scenarios with an Igus ReBeL 6-DOF cobot:

Assembling a laser-cut VR headset — A simpler case that stress-tested precise alignment. The system correctly identified insertion steps as human tasks (requiring finer manipulation than the cobot can reliably perform) while finding opportunities for the robot to hand parts to the operator and act as a “third hand” to hold components steady during tricky alignment steps.

Assembling an air compressor — A more complex workflow with bolt tightening, bearing seating, and heavy parts. DELEGACT differentiated between repetitive actions (assigned to the cobot) and precision-critical steps (assigned to the human), and used robot constraints to keep heavy components with the human. It also proposed preparatory steps like kitting-tray organization to make subsequent cobot actions feasible.

Why This Matters

The broader goal is to lower the barrier to human–robot collaboration in industry, making it accessible without requiring specialized robotics or programming expertise. DELEGACT positions AI as an assistive planning tool — one that accelerates the design of collaboration workflows while keeping the operator firmly in control of the final decisions.

This work aligns with the Industry 5.0 vision: automation that supports and amplifies human workers rather than replacing them.

Citation

@inproceedings{verstappen2026delegact,
  author = {Verstappen, Bram and Cardinaels, Dries and Leen, Danny and Luyten, Kris and Ramakers, Raf},
  title = {Learning to Delegate and Act with {DELEGACT}: Multimodal Language Models for Task-Level Human--Cobot Planning in Industrial Assembly},
  booktitle = {Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems},
  series = {CHI EA '26},
  year = {2026},
  location = {Barcelona, Spain},
  publisher = {ACM},
  address = {New York, NY, USA},
  doi = {10.1145/3772363.3798803}
}

We look forward to presenting this work in Barcelona in April!