September 8th, 2021, 16:00 – CGI Keynote 3

Prof. Yiannis Aloimonos – Computer Vision Laboratory, University of Maryland
Title : The action grammar: grounding of cognition
Abstract : Context-free grammars have been in fashion in linguistics because they provide a simple and precise mechanism for describing how phrases in a natural language are built from smaller blocks. They also describe exactly the basic recursive structure of natural languages: the way in which clauses nest inside other clauses, and the way in which lists of adjectives and adverbs are followed by nouns and verbs. Similarly, for manipulation actions, every complex activity is built from smaller blocks involving hands and their movements, as well as objects, tools and the monitoring of their state. Thus, interpreting a “seen” action is like understanding language, and executing an action from knowledge in memory is like producing language. Several experiments will be shown that interpret human actions in the arts and crafts or assembly domain by parsing the visual input on the basis of the manipulation grammar. To be realized, this parsing requires a network of visual processes that attend to objects and tools, segment and recognize them, track the moving objects and hands, and monitor the state of objects to determine goal completion. These processes will also be explained, and we will conclude with demonstrations of robots learning how to perform tasks by watching videos of relevant human activities. Due to the interest of the audience in Graphics and Augmented Reality, a significant part of the talk will be devoted to Topology Aware Point-Cloud Registration, a set of processes that allow an observer to understand topological changes in the scene, such as contact, the opening or closing of a drawer or door, the cutting of a vegetable with a knife, or the assembly of two pieces into one.
Bio : Yiannis Aloimonos is Professor of Computational Vision and Intelligence at the Department of Computer Science, University of Maryland, College Park, and the Director of the Computer Vision Laboratory at the Institute for Advanced Computer Studies (UMIACS). He is also affiliated with the Institute for Systems Research, Electrical and Computer Engineering, and the Neural and Cognitive Science Program. He was born in Sparta, Greece, and studied Mathematics in Athens and Computer Science at the University of Rochester, NY (PhD 1990). He is interested in Active Perception and the modeling of vision as an active, dynamic process for real-time robotic systems. Recently he has been developing active vision solutions for tiny flying agents. For the past five years he has been working on bridging signals and symbols, specifically on the relationship of vision to reasoning, action and language.