The team's AI agent can execute 1,000 of these interactions in the Sims-style world, which spans eight different scenes, including a living room, kitchen, dining room, bedroom, and home office.
"Describing actions as computer programmes has the advantage of providing clear and unambiguous descriptions of all the steps needed to complete a task," said Xavier Puig, a PhD student at MIT.
"These programmes can instruct a robot or a virtual character, and can also serve as a representation of complex tasks built up from simpler actions," said Puig.
Unlike humans, robots need more explicit instructions to complete even easy tasks - they cannot simply infer the missing steps and reason with ease.
For example, one might tell a human to "switch on the TV and watch it from the sofa." Here, actions like "grab the remote control" and "sit/lie on sofa" have been omitted, since they're part of the commonsense knowledge that humans have.
To better demonstrate these kinds of tasks to robots, the descriptions for actions needed to be much more detailed. To do so, the team first collected verbal descriptions of household activities, and then translated them into simple code.
A programme like this might include steps like: walk to the television, switch on the television, walk to the sofa, sit on the sofa, and watch television.
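Such a programme can be thought of as an ordered list of (action, object) steps. The sketch below is purely illustrative - the action names and data structure are assumptions for this example, not VirtualHome's actual script format:

```python
# Hypothetical sketch: the "watch TV" task written out as an explicit
# step-by-step programme of (action, object) pairs. Every commonsense
# step a human would omit is spelled out for the agent.
watch_tv_programme = [
    ("walk", "television"),
    ("switch_on", "television"),
    ("walk", "sofa"),
    ("sit", "sofa"),
    ("watch", "television"),
]

# Print the programme in a readable form.
for action, target in watch_tv_programme:
    print(f"{action} -> {target}")
```

Because each step is explicit and unambiguous, the same list can drive a robot, animate a virtual character, or serve as training data.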
Once the programmes were created, the team fed them to the VirtualHome 3D simulator to be turned into videos. A virtual agent would then execute the tasks defined by the programmes, whether that meant watching television, placing a pot on the stove, or turning a toaster on and off.
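A minimal way to picture how an agent executes such a programme is a dispatcher that maps each action name to a handler that updates simple world state. The class and API below are invented for illustration and are not the real VirtualHome interface:

```python
# Hypothetical sketch of a virtual agent executing a programme:
# each (action, object) step is dispatched to a matching method.
class VirtualAgent:
    def __init__(self):
        self.location = "hallway"   # where the agent currently stands
        self.device_on = {}         # per-device on/off state

    def walk(self, target):
        self.location = target

    def switch_on(self, target):
        self.device_on[target] = True

    def switch_off(self, target):
        self.device_on[target] = False

    def execute(self, programme):
        for action, target in programme:
            handler = getattr(self, action, None)
            if handler is None:
                raise ValueError(f"unknown action: {action}")
            handler(target)

# Example: the toaster task mentioned above, fully spelled out.
agent = VirtualAgent()
agent.execute([
    ("walk", "toaster"),
    ("switch_on", "toaster"),
    ("switch_off", "toaster"),
])
print(agent.location)               # toaster
print(agent.device_on["toaster"])   # False
```

A real simulator would also render each step as video, but the core loop - read a step, dispatch it, update state - is the same idea.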
The end result is not just a system for training robots to do chores, but also a large database of household tasks described using natural language.
Companies like Amazon that are working to develop Alexa-like robotic systems at home could eventually use data like this to train their models to do more complex tasks.
(This story has not been edited by Business Standard staff and is auto-generated from a syndicated feed.)