
Highlight

What the Human Eye Can Tell the Computer's Brain

Achievement/Results

Enabling computers to represent and understand visual “shape” is a central problem in computer vision, but modern methods still fall far short of the human eye and brain. Researchers in Rutgers University’s NSF-funded Integrative Graduate Education and Research Traineeship (IGERT) program in Perceptual Science are combining decades of insight into how the human mind understands shapes with modern advances in image-processing algorithms. A key to the problem is how the human mind decomposes shapes into parts – visual “pieces” such as the limbs of an animal or the fingers of a hand. The parts of a shape are the keys not only to recognizing it, but also to understanding how it moves in real-life action, as when an animal runs by moving its limbs relative to its body.

Several lines of interconnected research under the Perceptual Science IGERT, involving both visual psychologists and computer scientists, have produced breakthroughs in understanding how visual shapes can be decomposed into perceptually natural component parts. Rutgers researchers Jacob Feldman and Manish Singh have developed a comprehensive new approach to the problem of representing visual shape, a central problem in both human and computer perception. The approach is based on Bayesian estimation of the shape “skeleton”, a representation of the structural configuration of a shape’s component parts that corresponds closely to how the human visual system intuitively parses it. An example of a shape, along with its computed skeleton, is shown in the accompanying figure. This new approach unifies computational and human perceptual perspectives and, as developed by IGERT trainees Erica Briscoe and John Wilder, solves key outstanding problems, including the automatic computational decomposition of shapes into natural parts, the estimation of 3D surface structure from a 2D outline, the determination of intuitive similarity between distinct shapes, and the formation of meaningful categories of related shapes. Briscoe’s experiments show that the skeleton-based similarity metric closely approximates how humans recognize and classify shapes, even complex multi-part shapes, a previously unattainable result. Wilder is extending the work on classification to show how the skeleton predicts the categorization of shapes into natural classes, such as animals and plants.
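To make the idea of a shape skeleton concrete, the sketch below computes a skeleton of a binary silhouette and flags its branch points as rough candidates for junctions between parts. This is only an illustration under simplifying assumptions: the highlight does not describe Feldman and Singh’s Bayesian estimation procedure, so an off-the-shelf medial-axis transform from scikit-image stands in for it, and the toy “hand” silhouette is invented for the example.

```python
# Illustrative sketch only: a standard medial-axis skeleton of a binary
# silhouette, with skeletal branch points used as a crude proxy for the
# junctions where limb-like parts attach to a main body. This is NOT the
# Bayesian skeleton estimation of Feldman and Singh.
import numpy as np
from skimage.morphology import medial_axis
from scipy.ndimage import convolve

def skeleton_and_branch_points(mask: np.ndarray):
    """mask: 2D boolean array, True inside the shape."""
    # Medial axis: points with more than one closest boundary point.
    skel, dist = medial_axis(mask, return_distance=True)

    # Count skeletal neighbours of each skeletal pixel (8-connectivity).
    neighbours = convolve(skel.astype(int), np.ones((3, 3), int),
                          mode="constant") - skel.astype(int)

    # Pixels with 3+ skeletal neighbours are branch points: candidate
    # boundaries between component parts.
    branch_points = skel & (neighbours >= 3)
    return skel, dist * skel, branch_points

if __name__ == "__main__":
    # Toy hand-like shape (hypothetical data): a block with protruding fingers.
    mask = np.zeros((80, 80), dtype=bool)
    mask[40:70, 10:70] = True                 # palm
    for col in range(12, 68, 14):
        mask[15:40, col:col + 6] = True       # fingers
    skel, radius, branches = skeleton_and_branch_points(mask)
    print("skeleton pixels:", skel.sum(), "branch points:", branches.sum())
```

On the toy silhouette, each finger contributes its own skeletal branch that meets the palm’s main axis at a branch point, which is the intuition behind treating the skeleton’s structure as a description of a shape’s part configuration.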

The segmentation of a shape into meaningful parts is also central to the development of interactive computer applications, such as computer interfaces that people can use to manipulate shapes. Inspired by the importance of part structure in both human perception and computer applications, Rutgers researcher Doug DeCarlo and his students have developed a new model for decomposing a shape into parts, using methods from differential geometry to analyze the curvature of the shape’s contour and skeleton. The key to the approach is geometric relatability, a measure of how easily two separate curve segments join into a single smooth curve. In contrast to earlier approaches to shape segmentation, DeCarlo’s model does not simply cut a shape into pieces; it decomposes the shape into perceptually meaningful components. As a result, when a part is removed no trace of it remains in the shape, and the removed part itself takes on the characteristics of an interpretable shape. This approach to shape segmentation is part of DeCarlo’s broader research program of developing meaningful graphical computer interfaces: representations that convey the meaning of a visual array in ways that people can readily understand and use in interactive applications.
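The general idea of relatability can be illustrated with a small geometric test: do the extensions of two contour segments head toward each other, and how much total turning would a smooth join between them require? The sketch below implements that test under assumptions chosen for illustration (endpoints, outward tangents, and a 90-degree threshold); it is not DeCarlo’s model, whose details the highlight does not give.

```python
# Rough, self-contained illustration of a relatability-style test between two
# contour segment ends. Assumed inputs: each segment is summarized by an
# endpoint and a unit tangent pointing outward, toward the gap to be bridged.
import numpy as np

def relatability(p1, t1, p2, t2):
    """Return (relatable, turn_angle_rad) for two contour segment ends."""
    p1, t1, p2, t2 = (np.asarray(v, float) for v in (p1, t1, p2, t2))
    t1, t2 = t1 / np.linalg.norm(t1), t2 / np.linalg.norm(t2)

    gap = p2 - p1
    # Each extension must head toward the other endpoint (extensions meet).
    heads_toward = np.dot(t1, gap) > 0 and np.dot(t2, -gap) > 0

    # Angular mismatch between the tangents: how much turning a smooth
    # connecting curve would need in total.
    turn = np.arccos(np.clip(np.dot(t1, -t2), -1.0, 1.0))

    # Relatable if the extensions meet and the required turn is <= 90 degrees.
    return heads_toward and turn <= np.pi / 2, turn

if __name__ == "__main__":
    # Nearly collinear segment ends: easily relatable (small turn angle).
    print(relatability((0, 0), (1, 0), (5, 0.5), (-1, 0)))
    # Ends pointing away from each other: not relatable.
    print(relatability((0, 0), (-1, 0), (5, 0), (1, 0)))
```

A segmentation scheme built on such a measure prefers to cut a contour where the pieces on either side of the cut would not join smoothly, which is one way to make the resulting parts read as interpretable shapes in their own right.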

Address Goals

The past decade has seen a proliferation of perceptual technologies, from automated systems that recognize faces, voices, or 3D scenes, to dynamic multimodal interfaces and realistic virtual environments that enrich our communication with computers and with each other. It is now widely recognized that for these developments to succeed, they must be grounded in firm knowledge and theories of human perceptual function. This activity shows the importance and value of developing novel perceptual graphical interfaces based on such theories. Often the most effective computer-generated images are not literal depictions of a scene but meaningful abstractions that convey essential information in a readily understood way. A developing field within computer graphics aims to create usable images – “visual explanations” – that are easy to understand because their structure and design are inherently compatible with the way the human perceptual system perceives and extracts meaning from a scene over space and time. Examples are found in the generation of assembly instructions, line drawings of 3-dimensional objects, and artistic renderings of natural scenes. These efforts depend on incorporating models of human perceptual analysis into the rules governing the design. The particular activity described in this highlight shows that advances in understanding how people perceive, recognize, and classify shapes – in particular, the importance of a shape’s part structure – have motivated and guided new geometric and computational methods for creating meaningful depictions of shapes that human perceivers can understand and manipulate.