Friday, May 18, 2012

Object Recognition Methods

Object recognition refers to the process by which computers and robotic devices identify and interpret visual cues in their environment. Different approaches have their strengths and weaknesses, but the goal is the same: to provide the computer or robotic device with the data it needs in order for the software to make an informed decision, or at least an educated guess based on a predetermined set of variables.


Computer Interpretation of Visual Data


Computers interpret visual data using several different techniques, each of which is expressed as a set of mathematical equations that can be lengthy and complex. These equations differ considerably depending on the method of visual interpretation. One of the primary techniques is the use of a large model base. At the core of this method is simple geometry stored in a large database; the computer compares what it sees against these base templates, so it can recognize any collection of geometric shapes and patterns that the database contains. The obvious weakness of this approach is that if a shape is not in the database, the computer will be unable to interpret the data.
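As a rough illustration of the model-base idea, the sketch below stores a few shapes as tiny binary grids and picks the stored shape that disagrees with the input in the fewest cells. The names `MODEL_BASE`, `mismatch`, `recognize`, and `max_mismatch` are made up for this example, and a real model base would hold far richer geometric descriptions.

```python
# Hypothetical model base: each known shape is a tiny binary grid
# (1 = part of the shape, 0 = background).
MODEL_BASE = {
    "square": [
        [1, 1, 1],
        [1, 1, 1],
        [1, 1, 1],
    ],
    "cross": [
        [0, 1, 0],
        [1, 1, 1],
        [0, 1, 0],
    ],
    "diagonal": [
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
    ],
}


def mismatch(a, b):
    """Count the cells where two equally sized grids disagree."""
    return sum(
        1
        for row_a, row_b in zip(a, b)
        for cell_a, cell_b in zip(row_a, row_b)
        if cell_a != cell_b
    )


def recognize(image, model_base=MODEL_BASE, max_mismatch=1):
    """Return the name of the closest stored shape, or None if nothing is close enough."""
    best_name, best_score = None, None
    for name, template in model_base.items():
        score = mismatch(image, template)
        if best_score is None or score < best_score:
            best_name, best_score = name, score
    if best_score is not None and best_score <= max_mismatch:
        return best_name
    return None  # the shape is not in the model base
```

A slightly noisy cross is still reported as "cross", while a shape that resembles none of the stored grids comes back as `None`, which is the weakness described above.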


Another common but entirely different approach is the edge matching technique, which detects edges in both a template and an image and then compares the two sets of edges, allowing for a predetermined range of variation between them. Edge matching is more adaptive than simple model-based comparison because it tolerates a wider range of variation in the input.
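A minimal sketch of edge matching, under simplifying assumptions: images are lists of rows of gray levels (0 to 255), an "edge" is any pixel that differs sharply from its right or lower neighbor, and the allowed variation is a small pixel `tolerance`. The function names and the default `threshold` are illustrative choices, not taken from any particular library.

```python
def edge_map(image, threshold=50):
    """Mark a pixel as an edge when it differs strongly from its right or lower neighbour."""
    height, width = len(image), len(image[0])
    edges = set()
    for y in range(height):
        for x in range(width):
            right = abs(image[y][x] - image[y][x + 1]) if x + 1 < width else 0
            down = abs(image[y][x] - image[y + 1][x]) if y + 1 < height else 0
            if max(right, down) >= threshold:
                edges.add((y, x))
    return edges


def edge_match_score(template, image, tolerance=1, threshold=50):
    """Fraction of template edge pixels that have an image edge pixel within `tolerance` pixels."""
    template_edges = edge_map(template, threshold)
    image_edges = edge_map(image, threshold)
    if not template_edges:
        return 0.0
    matched = 0
    for ty, tx in template_edges:
        if any(
            abs(ty - iy) <= tolerance and abs(tx - ix) <= tolerance
            for iy, ix in image_edges
        ):
            matched += 1
    return matched / len(template_edges)
```

Because only the edge positions are compared, a square that is slightly shifted or lit differently can still score as a full match when `tolerance` is 1, whereas an exact pixel-for-pixel comparison would reject it.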


A different approach to visual data interpretation is gray scale matching, which compares pixel intensities rather than edges and so uses pattern maps, as opposed to geometric maps, as its basis for comparison. Because it works on raw intensities, it is sensitive to variations in illumination. Gray scale matching has found a lot of use in advertising campaigns built on webcam recognition software: magazine readers show a printed image to their webcams while visiting a specific website, and the website software triggers special content when it recognizes the image from the advertisement.
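One simple way to picture gray scale matching is to slide a small template over the image and add up the intensity differences at each position. The sketch below assumes images are lists of rows of gray levels and uses the sum of absolute differences as the score; the function names are hypothetical, and production software would typically use a more robust measure.

```python
def sad(image, template, top, left):
    """Sum of absolute intensity differences between the template and one image window."""
    return sum(
        abs(image[top + y][left + x] - template[y][x])
        for y in range(len(template))
        for x in range(len(template[0]))
    )


def best_grayscale_match(image, template):
    """Slide the template over the image and return (top, left, score) of the closest window."""
    t_h, t_w = len(template), len(template[0])
    best = None
    for top in range(len(image) - t_h + 1):
        for left in range(len(image[0]) - t_w + 1):
            score = sad(image, template, top, left)
            if best is None or score < best[2]:
                best = (top, left, score)
    return best
```

A score of 0 means a pixel-perfect match. Brightening the whole image raises the score even at the correct position, which is why this kind of matching reacts to lighting changes.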


Other Visual Correlation Methods


Though edge matching and large model bases are the most popular methods of interpreting visual data, a number of less common approaches to the problem exist. A divide and conquer search splits the image data into cells and uses a score computed for each cell to discard cells that cannot contain a good match, so that confusing or irrelevant data is ignored and only promising cells are subdivided and searched further. A divide and conquer search would be helpful in quality control, where defects and other errors do not follow a uniform searchable pattern (for example, noticing flaws in the manufacture of textiles or fractures in ceramic objects). Another method is gradient matching, in essence a variation on gray scale matching that compares changes in light level across the image instead of absolute intensity values to determine the rough shape of an object. The primary advantage of gradient matching is that it tolerates changes in illumination while keeping more of the collected data available for the correlation process than edge matching does. In practical applications, gradient matching can therefore give a more accurate image interpretation than the more basic gray scale matching.
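The difference between comparing raw intensities and comparing gradients can be sketched in a few lines. This example is deliberately simplified: it assumes images are lists of rows of gray levels and only looks at horizontal differences, whereas a fuller version would also use vertical ones. All function names are illustrative.

```python
def horizontal_gradient(image):
    """Difference between each pixel and its right-hand neighbour, row by row."""
    return [
        [row[x + 1] - row[x] for x in range(len(row) - 1)]
        for row in image
    ]


def gradient_distance(a, b):
    """Sum of absolute differences between the gradients of two equally sized images."""
    grad_a, grad_b = horizontal_gradient(a), horizontal_gradient(b)
    return sum(
        abs(ga - gb)
        for row_a, row_b in zip(grad_a, grad_b)
        for ga, gb in zip(row_a, row_b)
    )


def intensity_distance(a, b):
    """Sum of absolute differences between raw pixel intensities, for comparison."""
    return sum(
        abs(pa - pb)
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )
```

If every pixel in an image is brightened by the same amount, `intensity_distance` grows while `gradient_distance` stays at 0, because a uniform change in lighting cancels out when neighboring pixels are subtracted.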


Overall Image Analysis


There is more to visual data interpretation than the various methods and their equations and databases. The "brain" of the interpreting computer must also be taken into account. Software forms the basis for the rudimentary decision-making skills a device will use upon receiving visual stimuli. Programmers determine which choices the device will make based on a predetermined series of scenarios. A small robot, for example, needs instructions that prevent it from colliding with objects. The most basic method of command is an "if, then" programming sequence: "if" the computer is presented with a specific set of variables (visual data indicating a nearby obstacle, for example), "then" it takes a predetermined course of action (such as changing direction). An "if, then" statement can also be used in industrial production situations where a supervising technician must be notified if an anomaly is detected.
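Both "if, then" scenarios described above might look something like the following sketch. The distances, thresholds, function names, and the `notify` callback are assumptions made for the example; a real device would take these values from its sensors and its own alerting system.

```python
def choose_action(obstacle_distance_cm, safe_distance_cm=30):
    """Rule: if an obstacle is closer than the safe distance, then change direction."""
    if obstacle_distance_cm < safe_distance_cm:
        return "turn"
    return "forward"


def inspect_part(defect_score, alert_threshold=0.8, notify=print):
    """Rule: if the defect score reaches the threshold, then notify the technician."""
    if defect_score >= alert_threshold:
        notify(f"Anomaly detected (score {defect_score:.2f}); technician review needed.")
        return True
    return False
```

Here `choose_action(10)` returns `"turn"`, and `inspect_part(0.9)` prints an alert and returns `True`. Passing a different `notify` function would let the same rule send an email or log an entry instead of printing.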

Tags: visual data, scale matching, data interpretation, gradient matching, method visual, visual data interpretation