Discovering task-relevant objects from egocentric video sequences of multiple users, using appearance, position, motion and attention features.
Distinguishing different ways in which a task-relevant object has been used.
Automatically extracting usage snippets that can serve as video-based guidance.
Tested on a variety of daily tasks, such as initialising a printer, preparing a coffee, and setting up a gym machine.
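The first contribution combines several per-object cues (appearance, position, motion, attention) into a relevance judgement. As a minimal sketch of that idea, the snippet below scores each candidate object with a weighted sum of four cues and keeps those above a threshold; the `ObjectTrack` class, the cue values, the equal weights, and the threshold are all illustrative assumptions, not the method actually used in the work.

```python
# Hypothetical sketch: combine per-object cues into a task-relevance score.
# All names, weights, and thresholds here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class ObjectTrack:
    name: str
    appearance: float  # appearance consistency across users, in [0, 1]
    position: float    # consistency of the object's location, in [0, 1]
    motion: float      # how often the object is moved or manipulated, in [0, 1]
    attention: float   # estimated gaze/attention overlap, in [0, 1]

def relevance(obj: ObjectTrack, weights=(0.25, 0.25, 0.25, 0.25)) -> float:
    """Weighted sum of the four cues; higher means more task-relevant."""
    cues = (obj.appearance, obj.position, obj.motion, obj.attention)
    return sum(w * c for w, c in zip(weights, cues))

def discover_relevant(tracks, threshold=0.5):
    """Keep the names of objects whose combined score clears the threshold."""
    return [t.name for t in tracks if relevance(t) >= threshold]

tracks = [
    ObjectTrack("coffee machine", 0.9, 0.8, 0.6, 0.9),  # score 0.80
    ObjectTrack("wall poster",    0.7, 0.9, 0.0, 0.1),  # score 0.425
]
print(discover_relevant(tracks))  # → ['coffee machine']
```

In practice the cue weights would be learned rather than fixed, and the cues themselves would come from trackers and gaze estimators rather than hand-set numbers; the sketch only shows the combination step.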