Computers That Recognize What They See

It is easy to take the sense of sight for granted. We simply open our eyes and see. But, in fact, the process of capturing the light that falls onto the retina of each eye and transforming it into an understanding of the physical world in front of us is nothing short of amazing.

Typically, when we walk into a room and look around, we identify virtually every object in our area of vision within just a few hundred milliseconds, regardless of the lighting or our angle of sight. Yet, despite decades of advancements in sensor technology and artificial intelligence, the visual ability that seems so easy for any human who is not blind is still well beyond the capability of computers.

The hurdle has not been in designing computers to see, that is, to capture light and translate the photons into an electronic pattern. Any $20 webcam can do that. The challenge has been in creating computers that can recognize and understand what it is that they are seeing. In fact, the processing that we do so quickly is an incredible feat for a machine and requires an enormous amount of computing compressed into a tiny sliver of time.

Since computers excel at pattern recognition and iterative processing, scientists once believed that machines were perfectly suited to this task. But the fit never materialized, because there is far more to it than simply looking for patterns in the pixels.

For example, beyond simply crunching numbers, machine vision requires the computer to make sense of ambiguous visual data about objects, while separating any movement of those objects from the movement of the observer, or the movement of the other items in the room.

Another problem is the enormous degree of variation in images. That variation has become the Achilles' heel of every optical recognition algorithm. Why? Because a computer algorithm looks at an object as a pixel outline. When the object or the observer moves even just a bit, the computer code "sees" it as a totally new thing.
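To make that brittleness concrete, here is a toy sketch (our illustration, not any production vision system): a naive "recognizer" that scores an object by comparing raw pixel values against a stored template. Shifting the object by a single pixel collapses the score, even though a human would see the same shape.

```python
# Toy illustration of pixel-level template matching and why it is brittle.
# All grids and values here are made up for the example.

def match_score(image, template):
    """Fraction of pixels that agree between two equal-sized grids."""
    total = sum(len(row) for row in image)
    agree = sum(
        1
        for img_row, tpl_row in zip(image, template)
        for a, b in zip(img_row, tpl_row)
        if a == b
    )
    return agree / total

# A 5x5 "image": a vertical bar of 1s in column 1.
template = [[1 if c == 1 else 0 for c in range(5)] for _ in range(5)]
# The same bar, shifted one pixel to the right (column 2).
shifted = [[1 if c == 2 else 0 for c in range(5)] for _ in range(5)]

print(match_score(template, template))  # 1.0 -- a perfect match
print(match_score(shifted, template))   # 0.6 -- one pixel of motion, and the
                                        # "object" pixels no longer line up
```

The same shape, moved one pixel, now disagrees with the template at every pixel the bar occupies, which is exactly why naive pixel comparison treats it as a new object.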

A human can recognize a desktop keyboard, for instance, at any angle and in virtually any light. We also recognize other versions of keyboards, such as those on smartphones and laptops. A computer, on the other hand, struggles just to recognize a phone's keypad as a type of keyboard when viewed head-on. Turn the phone to a side view, and the new angle will invariably stump it.

This is a significant challenge, because for computers to be useful as stand-alone visual tools or as part of a robotic system, they'll need to recognize objects in a wide variety of lighting conditions, and from the many different angles they will encounter in the real world...
