DARPA Programs Robots to Learn by Watching YouTube Videos

CS professor Yiannis Aloimonos (center) led a team that programmed robots to learn by watching YouTube videos. (PHOTO CREDIT: Univ. Maryland/John T. Consoli)

The Pentagon’s leading-edge research agency has developed a mathematical language sophisticated enough to allow robots to learn by watching YouTube videos.

The Defense Advanced Research Projects Agency (DARPA) issued a series of grants in 2011 to fund research into a mathematical language that would let the military combine data from drone video, cell-phone intercepts, targeting radar and any other available method of sensing the outside world into a single stream of data. That, however, was only the initial goal.

The real intention was to create a mathematical model that would allow advanced sensors to figure out which of the things they see or hear are important and filter out those that are trivial before passing them along to humans. Sensors designed only to see what’s happening, not decide whether it’s important, “process their signals as if they were seeing the world anew at every instant,” according to the 2011 solicitation for proposals under the Mathematics of Sensing, Exploitation, and Execution (MSEE) project.

“The MSEE program initially focused on sensing, which involves perception and understanding of what’s happening in a visual scene, not simply recognizing and identifying objects,” according to Reza Ghanadan, a program manager in DARPA’s Defense Sciences Office.

“We’ve now taken the next step to execution, where a robot processes visual cues through a manipulation action-grammar module and translates them into actions,” Ghanadan said.
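DARPA hasn’t published the grammar itself, but the concept is easy to sketch: cues recognized in video are parsed into subject-action-object phrases, and each phrase is mapped onto a motor primitive the robot already knows how to execute. The short Python sketch below illustrates that flow under those assumptions; the names it uses (ActionPhrase, CLOSE_GRIPPER and so on) are hypothetical stand-ins, not the actual MSEE interface.

```python
# Hypothetical sketch of a manipulation action-grammar step: parsed visual
# cues become subject-action-object phrases, each mapped to a motor
# primitive. All names here are illustrative, not DARPA's or Maryland's API.
from dataclasses import dataclass

@dataclass
class ActionPhrase:
    """One parsed clause from a video, e.g. 'left hand cuts cucumber'."""
    hand: str    # which manipulator performs the action
    action: str  # recognized action label
    obj: str     # object the action is applied to

# Assumed mapping from recognized actions to executable motor primitives.
PRIMITIVES = {
    "grasp": "CLOSE_GRIPPER",
    "cut": "SAW_MOTION",
    "pour": "TILT_WRIST",
}

def translate(phrase: ActionPhrase) -> str:
    """Translate one parsed phrase into a command string for the robot."""
    return f"{phrase.hand}:{PRIMITIVES.get(phrase.action, 'NO_OP')}({phrase.obj})"

# Phrases parsed from a cooking video, flattened into an executable plan.
parsed = [
    ActionPhrase("left", "grasp", "knife"),
    ActionPhrase("left", "cut", "cucumber"),
]
print([translate(p) for p in parsed])
# ['left:CLOSE_GRIPPER(knife)', 'left:SAW_MOTION(cucumber)']
```

The point of such a grammar is that the same small set of primitives can be recombined to carry out actions the robot was never explicitly programmed to perform.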

Developing an algorithm that can reliably identify objects and actions, and then decide which are important and which to ignore – something even the human brain does only imperfectly and inconsistently – requires that machines not only learn, but learn “in an unsupervised or semi-supervised fashion” and process data in ways that mimic some aspects of human judgment, according to the original requirement.

The first result of that effort is a robot, programmed by researchers at the University of Maryland, that taught itself to use kitchen tools by watching humans do it in YouTube videos, according to a release yesterday from DARPA, an announcement from the university and a research paper presented the same day at the Association for the Advancement of Artificial Intelligence conference in Austin, Texas.

The project, led by computer scientist Yiannis Aloimonos, modified several semi-humanoid Baxter Research Robots by adding a pair of data-processing modules built as convolutional neural networks (CNNs) – a design that also powers voice-recognition systems in smartphones and the facial-recognition software used in security biometrics.

One of the two CNN modules was designed to recognize objects. The other was programmed to track movements – not only following objects in motion, but also building an abstracted mathematical model of how each part of a movement relates to the others and of how, eventually, the robot could reproduce the movement itself.
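The announcement doesn’t spell out either network’s architecture, so the sketch below (assuming PyTorch) only illustrates the shape of such a dual-module setup: one small convolutional encoder, instanced twice, once scoring object labels and once scoring grasp or action labels. The layer sizes and class counts are invented for the example.

```python
# Minimal PyTorch sketch of the two-CNN arrangement described above; the
# class name, layer sizes and label counts are assumptions for illustration.
import torch
import torch.nn as nn

class FrameEncoder(nn.Module):
    """Tiny CNN mapping an RGB frame to class scores; a stand-in for the
    much larger networks a system like this would actually use."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool to one feature vector per frame
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

# Two instances play the two roles the article describes: one scores
# object labels, the other scores movement/grasp labels.
object_net = FrameEncoder(num_classes=48)  # e.g. kitchen-object categories
action_net = FrameEncoder(num_classes=6)   # e.g. grasp/action types

frame = torch.randn(1, 3, 224, 224)        # one dummy RGB frame
print(object_net(frame).shape, action_net(frame).shape)
```

In the full system, the outputs of the two modules would presumably feed the action-grammar stage described earlier, which assembles them into a plan the robot can execute.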


SOURCE: Computerworld | Kevin Fogarty