Just what did I want Flexible Frank to do? Answer: any work a human being does around a house. He didn't have to play cards, make love, eat or sleep, but he did have to clean up after the card game, cook, make beds, and tend babies - at least he had to keep track of a baby's breathing and call someone if it changed.
...We need to train them. Hence, VirtualHome, the activity simulator for teaching household tasks to robots.
(VirtualHome video: Teaching Robots To Do Chores)
In this paper, we are interested in modeling complex activities that occur in a typical household. We propose to use programs, i.e., sequences of atomic actions and interactions, as a high level representation of complex tasks. Programs are interesting because they provide a non-ambiguous representation of a task, and allow agents to execute them. However, nowadays, there is no database providing this type of information. Towards this goal, we first crowd-source programs for a variety of activities that happen in people’s homes, via a game-like interface used for teaching kids how to code. Using the collected dataset, we show how we can learn to extract programs directly from natural language descriptions or from videos. We then implement the most common atomic (inter)actions in the Unity3D game engine, and use our programs to “drive” an artificial agent to execute tasks in a simulated household environment. Our VirtualHome simulator allows us to create a large activity video dataset with rich ground-truth, enabling training and testing of video understanding models. We further showcase examples of our agent performing tasks in our VirtualHome based on language descriptions.
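
To make the idea of a "program" as a sequence of atomic actions and interactions more concrete, here is a small Python sketch. Everything in it is an assumption made for illustration: the `Step` type, the action and object names, the bracketed text form, and the toy `execute` function are hypothetical and are not taken from the paper or from the VirtualHome simulator.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    """One atomic action applied to zero or more objects (hypothetical type)."""
    action: str
    objects: Tuple[str, ...] = ()

    def render(self) -> str:
        # Illustrative text form only; not claimed to be the paper's syntax.
        args = " ".join(f"<{name}>" for name in self.objects)
        return f"[{self.action}] {args}".strip()


# A made-up program for the activity "watch TV": an ordered list of steps.
watch_tv: List[Step] = [
    Step("walk", ("living_room",)),
    Step("switch_on", ("television",)),
    Step("sit", ("sofa",)),
    Step("watch", ("television",)),
]


def execute(program: List[Step]) -> List[str]:
    """Toy stand-in for an agent: walks the steps in order and logs each one."""
    log = []
    for index, step in enumerate(program, start=1):
        log.append(f"{index}. {step.render()}")
    return log


if __name__ == "__main__":
    print("\n".join(execute(watch_tv)))
```

Because each step names exactly one action and its target objects, the list leaves no room for interpretation about what happens or in which order, which is the property the abstract highlights. A real agent would replace the logging in `execute` with animation and state changes in the simulated home.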