The new face of security
- By Michelle Speir
- Mar 03, 2002
Disguised with fake glasses, a moustache and a wig, and armed with excellent forged identification, a terrorist is boarding a flight in San Francisco headed for Washington, D.C. What he doesn't know is that the face of each person in line is being scanned by a video camera.
The image is then being analyzed and compared to a database of suspects and, by the time he reaches the counter, a light flashes on the ticket agent's screen. The police are summoned by silent alarm, and the terrorist is arrested before he takes his seat.
No, not all the pieces of such a system are in place yet, but they will be soon. Increased concern over airline safety since Sept. 11 has brought a surge of interest in facial-recognition technology as a way to prevent dangerous people from boarding airplanes or gaining access to sensitive facilities. Indeed, a number of airports from Oakland, Calif., to Boston have recently announced plans to use the technology to screen for terrorists.
It's turning out to be a renaissance for a technology that has existed for years, but has been used only in small circles within government. So although many people are aware that the technology exists, few have had any experience with it. That could change in the months ahead.
Largely because it is less intrusive than other biometric tools, such as iris scanners and fingerprint readers, facial recognition is expected to be one of the fastest-growing segments of the biometric market during the next two to three years, according to the International Biometric Group LLC (IBG), a New York City-based integration and consulting firm.
Possible government uses for facial-recognition technology extend beyond airport screening to controlling access to computer systems, restricted areas, entitlement benefits and the nation's borders.
Although the specifics have not yet been determined, the government is considering using biometric technology for several projects, according to the Biometric Consortium, a group of government representatives that helps coordinate the evaluation of biometrics for possible government applications.
For example, the State Department is considering using biometrics to aid in passport processing, and the Defense Department is researching its possible use for computer network security.
This article, intended as a basic guide to facial-recognition technology, examines the underlying principles of the technology, explains the various technical approaches and highlights factors that can determine how useful the technology ultimately is. We also take a look at a representative sampling of solutions now on the market (see "About face," at left).
Facial recognition — also known as facial scan or face verification — is a biometric technology that identifies people based on their facial features. Facial-scan systems can recognize a person by using one of several possible methods, all of which emphasize parts of the face that are not easy to alter, such as the areas around the cheekbones, the upper outlines of the eye sockets and the sides of the mouth.
Systems generally work by comparing the facial scan of an individual to facial scans stored in a database. In anti-terrorism applications, the system attempts to match the scan made at a security checkpoint, for example, against the scans of suspected terrorists to see if there's a match — what's known as a one-to-many check.
Facial scans can also be used to control entry to buildings or computer networks by comparing the image of a person seeking access against the scan taken of that person at an earlier date — a one-to-one check.
Facial-recognition solutions employ the same four-step process that all biometric technologies do: sample capture, feature extraction, template comparison and matching.
The sample capture takes place in the enrollment process, during which the system takes multiple pictures of the face, usually from slightly different angles, to increase the system's ability to recognize the face.
After enrollment, certain facial features are extracted and used to create a template. The specific features extracted vary depending on the type of facial-recognition technology used. No images of faces are stored; instead, the templates consist of numeric codes that are usually encrypted. And many templates can be stored on one system because each is less than 1K in size, compared to between 150K and 300K for a facial image.
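To put those sizes in perspective, here is a rough back-of-the-envelope calculation in Python. The 1K and 150K-to-300K figures come from the paragraph above; the 1G storage budget is an arbitrary assumption chosen only for illustration.

```python
# Rough storage comparison: templates vs. raw facial images.
# Sizes are in kilobytes. The 1G budget is an illustrative assumption.
TEMPLATE_KB = 1              # upper bound for a template, per the article
IMAGE_KB_RANGE = (150, 300)  # typical facial image size, per the article
BUDGET_KB = 1024 * 1024      # hypothetical 1G of storage


def items_that_fit(item_kb, budget_kb=BUDGET_KB):
    """Return how many items of item_kb kilobytes fit in budget_kb."""
    return budget_kb // item_kb


templates = items_that_fit(TEMPLATE_KB)
images_low = items_that_fit(IMAGE_KB_RANGE[1])   # larger images: fewer fit
images_high = items_that_fit(IMAGE_KB_RANGE[0])  # smaller images: more fit

print(f"Templates that fit: at least {templates:,}")
print(f"Images that fit: between {images_low:,} and {images_high:,}")
```

Under these assumptions, the same disk holds roughly 150 to 300 times as many templates as images.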
When someone logs in using a facial-scan system, the template created upon attempted log-in is compared to a stored template for that person (one-to-one matching) or to a database of stored templates (one-to-many matching).
One-to-one matching is also known as verification. It requires a claimed identity and answers the question, "Is this the subject?" When using a verification system, a person logs in with a personal identification number or other user ID, and the system confirms or denies the identity claim by comparing the scan results. One-to-many matching, on the other hand, is known as identification. It does not require a claimed identity. Instead, a face template is compared to a database of stored templates, and the system attempts to answer the question, "Who is the subject?" Most facial-recognition systems are designed to be robust enough to locate a single template out of thousands or more.
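The difference between the two modes can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's method: it assumes templates are short lists of numbers, uses plain Euclidean distance as the comparison and relies on a made-up threshold. The names `verify`, `identify` and `distance` are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length numeric templates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(live, claimed_id, database, threshold):
    """One-to-one: is the live template close enough to the claimed identity?"""
    stored = database.get(claimed_id)
    if stored is None:
        return False
    return distance(live, stored) <= threshold


def identify(live, database, threshold):
    """One-to-many: return the closest enrolled ID, or None if nothing is close."""
    best_id, best_dist = None, float("inf")
    for person_id, stored in database.items():
        d = distance(live, stored)
        if d < best_dist:
            best_id, best_dist = person_id, d
    return best_id if best_dist <= threshold else None


# Toy database of hypothetical three-number templates.
database = {"alice": [0.1, 0.9, 0.3], "bob": [0.8, 0.2, 0.5]}
live = [0.12, 0.88, 0.31]

print(verify(live, "alice", database, threshold=0.1))  # "Is this the subject?"
print(identify(live, database, threshold=0.1))          # "Who is the subject?"
```

Note that verification compares against a single stored template, while identification has to scan the whole database, which is why search speed matters far more for one-to-many systems.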
People often find biometric technology — especially facial recognition — disconcerting, if not downright invasive.
Privacy advocates worry about facial recognition because it is the only commonly used biometric that does not require the subject's cooperation. The system can scan a crowd, for example, and search for matches to database images.
Even putting privacy concerns aside, the cooperation of the subject — or lack thereof — can have an impact on performance. IBG points out that verification assumes a cooperative audience, meaning that subjects are motivated to use the system correctly. Noncooperative subjects are unaware — or don't care — that a biometric system is in place, and they make no effort either to be recognized or to avoid recognition.
However, uncooperative subjects actively avoid recognition by, for example, using disguises or taking evasive measures. It's important to keep in mind that facial-recognition systems today are almost entirely incapable of identifying uncooperative subjects.
As with fingerprint recognition and iris scanning, users tend to find facial-scan technology fairly intrusive. According to an IBG study, users rate it slightly more intrusive than hand geometry and signature recognition and significantly more intrusive than voice recognition. The only major biometric tool that users rate more intrusive than facial recognition is retinal scanning.
One problem is that many people dislike having their picture taken. Another is that the face is not traditionally interpreted as an authentication mechanism; it's almost too personal to be scanned and broken down into grids and components. And the fact that a facial-recognition system can take pictures of subjects without their knowledge or consent makes many people, especially privacy advocates, uncomfortable.
Most vendors use one of four primary types of facial-recognition technology. These include feature analysis, eigenface, neural network mapping and automatic face processing.
Visionics Corp., a leading facial-recognition vendor, uses Local Feature Analysis (LFA). LFA extracts images of dozens of features from different regions of the face and uses them as building blocks. The types of blocks and their arrangement are used to identify the face. LFA anticipates that the slight movement of a feature located near the mouth, for example, will be accompanied by a relatively similar movement of adjacent features.
Eigenface technology, meanwhile, analyzes two-dimensional, grayscale images to generate a representation of the distinctive characteristics of a face. During the enrollment phase, a subject's face is mapped to a series of numbers, or coefficients. When attempting to verify or identify a face, the current, live template is compared to the stored template to determine coefficient variation. The degree of variance determines whether the system accepts or rejects the individual. Variations of this technology are frequently used as the basis for other facial-recognition methods, including LFA. LFA, however, is better than eigenface at accommodating changes in appearance or expression (smiling vs. frowning, for example).
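The coefficient comparison described above can be sketched as follows. This is a simplified illustration under stated assumptions: real eigenface systems derive the mean face and the eigenfaces from a large training set using principal component analysis, a step omitted here. The four-pixel "images," the two basis vectors and the acceptance distance are all made up for the example.

```python
import math


def project(face, mean_face, eigenfaces):
    """Map a flattened grayscale face to one coefficient per eigenface."""
    centered = [p - m for p, m in zip(face, mean_face)]
    return [sum(c * e for c, e in zip(centered, ef)) for ef in eigenfaces]


def coefficient_distance(a, b):
    """Euclidean distance between two coefficient lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def accept(live_face, stored_coeffs, mean_face, eigenfaces, max_distance):
    """Accept when the live coefficients vary little from the stored ones."""
    live_coeffs = project(live_face, mean_face, eigenfaces)
    return coefficient_distance(live_coeffs, stored_coeffs) <= max_distance


# Hypothetical precomputed model for tiny four-pixel images.
mean_face = [100, 100, 100, 100]
eigenfaces = [[0.5, 0.5, -0.5, -0.5], [0.5, -0.5, 0.5, -0.5]]

# Enrollment: the face is reduced to a short series of coefficients.
enrolled = project([120, 110, 90, 80], mean_face, eigenfaces)

print(enrolled)
print(accept([118, 112, 91, 79], enrolled, mean_face, eigenfaces, max_distance=5))
print(accept([80, 90, 110, 120], enrolled, mean_face, eigenfaces, max_distance=5))
```

Only the coefficients are kept after enrollment, which is one reason templates can be so much smaller than the images they come from.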
Neural network mapping uses an algorithm to determine the similarity of the unique global features of a live facial image to a stored image, using as much of the image as possible instead of focusing on specific features. A false match prompts the algorithm to modify the importance (or weight) it gives to certain facial features. Theoretically, this results in an increased ability to identify faces in difficult conditions such as low light levels.
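Vendors do not publish their network designs, so the following is only a toy illustration of the weight-adjustment idea. It uses a perceptron, a single artificial neuron and the simplest relative of a neural network, and it assumes the input is a list of per-feature similarity scores between the live and stored images. The starting weights, bias and learning rate are hypothetical.

```python
def score(weights, bias, similarities):
    """Weighted sum of per-feature similarity scores plus a bias term."""
    return sum(w * s for w, s in zip(weights, similarities)) + bias


def predict_match(weights, bias, similarities):
    """Declare a match when the weighted score is positive."""
    return score(weights, bias, similarities) > 0


def update_on_error(weights, bias, similarities, is_true_match, rate=0.5):
    """Perceptron rule: change the weights only when the prediction was wrong."""
    if predict_match(weights, bias, similarities) == is_true_match:
        return weights, bias
    target = 1 if is_true_match else -1
    new_weights = [w + rate * target * s for w, s in zip(weights, similarities)]
    return new_weights, bias + rate * target


# Two different people who happen to look alike on the first and third features.
weights, bias = [1.0, 1.0, 1.0], -1.5
lookalike = [0.9, 0.2, 0.8]

print(predict_match(weights, bias, lookalike))   # the false match
weights, bias = update_on_error(weights, bias, lookalike, is_true_match=False)
print(predict_match(weights, bias, lookalike))   # after the correction
print(weights)
```

After the false match, the features that misled the system carry less weight, which is the behavior the paragraph above describes.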
Automatic face processing is not as robust as the other three technologies, but it may be more effective in dimly lit situations. It uses distances and distance ratios between features such as the eyes, end of the nose and corners of the mouth to capture and analyze facial images.
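A minimal sketch of the distance-ratio idea follows. It assumes five landmark points have already been located in the image; the landmark names, the coordinates and the particular pair of ratios are illustrative choices, not a description of any product.

```python
import math


def dist(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def ratio_features(landmarks):
    """Return two scale-independent ratios from five landmark points."""
    left, right = landmarks["left_eye"], landmarks["right_eye"]
    eye_span = dist(left, right)
    mouth_width = dist(landmarks["mouth_left"], landmarks["mouth_right"])
    eye_midpoint = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    nose_drop = dist(eye_midpoint, landmarks["nose_tip"])
    return (mouth_width / eye_span, nose_drop / eye_span)


face = {
    "left_eye": (30, 40), "right_eye": (70, 40), "nose_tip": (50, 65),
    "mouth_left": (35, 85), "mouth_right": (65, 85),
}
# The same face photographed at twice the size.
closer = {name: (2 * x, 2 * y) for name, (x, y) in face.items()}

print(ratio_features(face))
print(ratio_features(closer))
```

Because ratios are used rather than raw distances, the two printed results are identical even though the second face is twice as large in the frame.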
It is extremely important to remember that facial recognition is heavily influenced by environmental factors, most notably lighting conditions. When subjects attempt to log in, the lighting conditions must be similar to the conditions under which their faces were originally scanned. Also, the angle of the face usually must vary only slightly from the stored image, and the facial image cannot be too large or too small.
Agencies that are considering buying a facial-recognition system should consider several factors. One is the type of installation: Will the system be used for surveillance, physical access control, computer access or something else?
Another factor is your existing infrastructure. You should research the system requirements for each product and assess the cost and time involved in implementation. Some systems may integrate well with an existing network, while others require more powerful servers, certain video boards, special cameras or other equipment.
Some vendors have demonstration applications that allow you to see the system in action and test it for yourself. Be sure to ask if this option is available.
For access control, layering facial recognition with a PIN, a password or another biometric tool is always a good idea in case the face system does not work for some reason — especially given its sensitivity to environmental factors.
Below is an overview of the facial-recognition products available from three vendors.
Face ID: An Investigative Tool
Face ID, ImageWare Systems Inc.'s facial-recognition software, is tailored to law enforcement applications. It is an extension of the company's Crime Capture System, which stores and helps retrieve information about criminals and suspects.
Face ID uses Visionics' FaceIt Identification software developer's kit at the kit's lowest level, which allows ImageWare to customize Face ID for specialized applications. Because it uses the Visionics engine, the system recognizes faces using the LFA method described earlier.
Face ID creates two facial templates. The small template, at 88 bytes, is used for fast searches, while a larger, 3,518-byte image is used in conjunction with the small template for more thorough searches.
Face ID is sold as an investigative tool, not an identifying application. Instead of positively identifying an individual from a database of hundreds or thousands of suspects, the system helps winnow the pool of suspects to a number that a person can easily review.
The system can search for possible face matches using a mug shot, a surveillance photo or even a composite image created by an artist based on descriptive information from witnesses. Those types of images can also be stored in the system so they become part of the suspect database.
Face ID compensates for changes in a person's appearance, such as facial hair differences, by weighting areas of the face differently. For example, the eyes, nose and jaw line are heavily weighted, while facial hair is not. This means that the system assigns increased importance to structural areas so that superficial changes do not play a large role in determining a face match.
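The effect of weighting can be shown with a small, hypothetical example. The region names follow the paragraph above, but the weights and similarity scores are invented for illustration and are not ImageWare's values.

```python
# Hypothetical weights: structural regions count far more than facial hair.
REGION_WEIGHTS = {"eyes": 0.3, "nose": 0.3, "jaw_line": 0.3, "facial_hair": 0.1}


def weighted_score(region_similarity, weights=REGION_WEIGHTS):
    """Weighted average of per-region similarity scores (each 0.0 to 1.0)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[r] * region_similarity[r] for r in weights)
    return weighted_sum / total_weight


def plain_score(region_similarity):
    """Unweighted average, shown for comparison."""
    return sum(region_similarity.values()) / len(region_similarity)


# The same person, photographed before and after growing a beard.
new_beard = {"eyes": 0.95, "nose": 0.9, "jaw_line": 0.9, "facial_hair": 0.1}

print(f"weighted: {weighted_score(new_beard):.3f}")
print(f"unweighted: {plain_score(new_beard):.3f}")
```

With the structural regions weighted heavily, the poor facial-hair score costs the match far less than it would in a simple average.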
Search results are displayed like a mug book, with rows of thumbnail photos on the screen. The thumbnail display allows for maximum performance over slower networks.
Users can double click on an image to pull up a record on the suspect. This view displays basic information, such as the person's name, height and weight, along with a larger, higher-resolution color photograph.
Face ID uses the process of elimination to aid in identification. For example, the system might perform a query using a composite image created by an artist. It would then return the closest face matches in the database. Of those matches, the witness would choose the person most resembling the suspected criminal. That image could then be averaged with the composite image to produce yet another set of search results. If necessary, a photo from those results could then be averaged with a previous photo and so on.
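The article does not say how Face ID averages two images. The simplest interpretation is a pixel-by-pixel mean of two aligned grayscale images of the same size, sketched below. The function name and the tiny two-by-two "images" are illustrative assumptions.

```python
def average_images(a, b):
    """Pixel-by-pixel mean of two same-sized grayscale images (lists of rows)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    return [[(pa + pb) // 2 for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


composite = [[100, 200], [50, 0]]     # artist's composite (toy 2x2 image)
chosen = [[120, 180], [70, 40]]       # photo the witness picked

blended = average_images(composite, chosen)
print(blended)

# A further round would blend the result with the next photo chosen.
next_choice = [[110, 190], [60, 20]]
print(average_images(blended, next_choice))
```

Each round pulls the query image closer to the faces the witness has selected, which is what lets repeated searches converge.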
When a final image is chosen, it can be altered to add or subtract aspects of appearance, such as adding a hat or changing a hairstyle, which can greatly enhance the identification process.
Face ID uses a client/server architecture that runs on standard PC-based hardware with any Microsoft Corp. 32-bit Windows operating system such as Windows NT, 2000 or XP. It can scale from a single computer to very large installations. According to ImageWare, the system can match 6.5 million faces per minute on a single 1.2 GHz server. Because of its support for wide-area networks, Face ID can be deployed over networks as slow as 56 kilobits/sec.
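The throughput claim is easier to judge when converted into search times. The short calculation below takes ImageWare's figure at face value and assumes every record in the database is compared once; the database sizes are arbitrary examples.

```python
FACES_PER_MINUTE = 6_500_000  # vendor claim quoted above, single 1.2 GHz server

faces_per_second = FACES_PER_MINUTE / 60


def seconds_to_search(database_size, rate=faces_per_second):
    """Time to compare one probe against every record at the given rate."""
    return database_size / rate


print(f"{faces_per_second:,.0f} faces per second")
for size in (10_000, 1_000_000, 10_000_000):
    print(f"{size:>12,} records: {seconds_to_search(size):.2f} seconds")
```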
Current users include the Los Angeles County, Calif., Sheriff's Department, the Indianapolis Police Department and the Arizona Department of Public Safety.
ID-2000: Ready for Action
Like Face ID, ID-2000 by Imagis Technologies Inc. is used primarily for law enforcement and security applications. It can be integrated with other Imagis systems, such as the Computerized Arrest and Booking System.
Imagis uses a combination of 3-D modeling and spectral analysis — a method of analyzing the light signals from the face — to locate a face and normalize it. Normalizing, also called contrast stretching, is an enhancement technique that attempts to improve the contrast in an image. The system then finds values for the pitch, yaw and roll of the head.
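Contrast stretching in its basic linear form is simple enough to show in a few lines. The sketch below is the textbook version of the technique, not Imagis' proprietary implementation, and it assumes the image is a flat list of grayscale pixel values.

```python
def contrast_stretch(pixels, out_min=0, out_max=255):
    """Linearly rescale pixel values so they span out_min to out_max."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return list(pixels)  # flat image: there is no contrast to stretch
    scale = (out_max - out_min) / (hi - lo)
    return [round(out_min + (p - lo) * scale) for p in pixels]


# A washed-out image that uses only a narrow band of gray levels.
dull = [100, 110, 120, 150]
print(contrast_stretch(dull))
```

The darkest pixel is mapped to black and the brightest to white, so images taken under different lighting end up on a more comparable footing.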
Next, ID-2000 uses proprietary algorithms to transform the detection information into an encode array, or template. The template size is 532 bytes — small enough to put on a smart card — and does not take into account sex, race, hair color, skin tone or facial hair.
Like other facial-recognition systems, ID-2000 can search the database using various input images, including scanned photographs, saved electronic image files, live or recorded video and digital video files. It can also use artist-rendered images.
ID-2000 uses the eyes and tip of the nose as "anchor points" when encoding a face. Upon enrollment, the system automatically identifies these three points and marks them on the image. If the software has trouble identifying the anchor points for any reason — for instance, because the subject wears eyeglasses — you can enter a manual encoding mode and place the anchor points on the image yourself by clicking on them.
A search function can link to a database and use text filters to search on known information about the person such as eye color, race, gender, associates or vehicles driven. This information is linked to the subject's enrolled facial template.
ID-2000 can search databases containing a million records in less than five seconds. The images of faces that are possible matches are displayed in a grid on the screen. Users can set a threshold for how closely the retrieved image must match, and results are returned with a confidence number between zero and 100. A confidence number of 100 means the system is 100 percent confident that the image is a match.
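The threshold behaves like a simple filter on the result list. In this sketch, the record IDs and confidence numbers are invented, and the function name is hypothetical.

```python
def filter_matches(results, threshold):
    """Keep matches at or above the threshold, highest confidence first."""
    kept = [r for r in results if r[1] >= threshold]
    return sorted(kept, key=lambda r: r[1], reverse=True)


# Hypothetical (record ID, confidence) pairs on the zero-to-100 scale.
results = [("rec-104", 62), ("rec-007", 91), ("rec-233", 78), ("rec-051", 40)]

print(filter_matches(results, threshold=75))
print(filter_matches(results, threshold=95))
```

Raising the threshold shortens the grid an operator must review but increases the chance that a true match is left out.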
The latest version of ID-2000 comes with a new feature called Watch Mode that allows you to connect a camera that can grab frames from a live streaming video and incorporate them into the database or use them to perform a search. You can also set up multiple cameras and direct the system to monitor different cameras at certain times. Any camera is compatible as long as it accommodates the Microsoft Video for Windows file format.
Another feature, called Collate, allows you to set a frame from a live video as a standard, and if the frame changes, an alarm sounds. For example, you could train a camera on an empty room and set that image as the frame; if a person enters the room, an alarm would be activated.
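A feature of this kind is typically built on frame differencing: each new frame is compared with the reference frame, and an alarm is raised when the difference is too large. The sketch below shows that general technique, not Imagis' code. Frames are assumed to be flat lists of grayscale values, and the tolerance is an arbitrary figure meant to absorb camera noise.

```python
def mean_abs_diff(frame, reference):
    """Average absolute per-pixel difference between two frames."""
    if len(frame) != len(reference):
        raise ValueError("frames must be the same size")
    return sum(abs(p - q) for p, q in zip(frame, reference)) / len(reference)


def frame_changed(frame, reference, tolerance=10.0):
    """Return True when the frame differs from the reference beyond tolerance."""
    return mean_abs_diff(frame, reference) > tolerance


empty_room = [50] * 16                 # reference frame (toy 4x4 image)
sensor_noise = [51] * 16               # same scene, slight flicker
person_enters = [50] * 8 + [200] * 8   # half the frame is now much brighter

print(frame_changed(sensor_noise, empty_room))
print(frame_changed(person_enters, empty_room))
```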
Impressively, ID-2000 can also encode a face from a live video stream. The system finds the anchor points and keeps them trained on the facial image even as the person moves. This surveillance feature is called Identify.
ID-2000 is part of a software developer's kit that contains Microsoft ActiveX controls so a developer can insert ID-2000 into an existing application or build a new application around it. It supports client/server, application server and stand-alone architectures and is compatible with Windows 98, NT, 2000 and XP. The system requires a Video for Windows capture board and 1G of disk space for every 50,000 images.
Because it uses a TCP/IP connection between the client and the server, ID-2000 can be used for wireless applications, such as laptop computers in field units or patrol cars. Imagis says that its proprietary, binary face encode array is unpublished and cannot be hacked.
For access control applications, Imagis recommends using facial recognition as part of a layered biometric approach.
It's Me: A Different Approach
It's Me by VisionSphere Technologies Inc. is a facial-recognition solution designed for computer network access. The product includes VisionSphere's facial-recognition software, called UnMask, and a proprietary USB camera, called MapleSight, that is specially designed for facial recognition. UnMask comes in two flavors: a one-to-one verification system and a one-to-many identification system called UnMask Plus.
UnMask recognizes faces using a proprietary method instead of one of the four technologies described earlier in this article. First, it performs a "liveliness test" that detects eye blinks to determine that the image is that of a live person instead of, say, a photograph. Once a person passes this test, UnMask automatically locates the face and eyes using proprietary search and processing algorithms. The software then normalizes and crops the image to make it more resistant to changes in ambient lighting, size, head rotation, facial expressions and eyeglasses.
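VisionSphere's algorithms are proprietary, so the following shows only the general idea behind a blink check. It assumes some earlier step has already produced an "eye openness" value between 0 and 1 for each video frame; that measure, the two cutoffs and the function name are all hypothetical.

```python
def blink_detected(openness, closed_below=0.2, open_above=0.6):
    """Return True if the eyes go from open to closed and back to open."""
    state = "waiting_for_open"
    for value in openness:
        if state == "waiting_for_open" and value >= open_above:
            state = "open"
        elif state == "open" and value <= closed_below:
            state = "closed"
        elif state == "closed" and value >= open_above:
            return True
    return False


live_person = [0.8, 0.8, 0.5, 0.1, 0.1, 0.7, 0.8]  # eyes close, then reopen
photograph = [0.8] * 7                              # eyes never change

print(blink_detected(live_person))
print(blink_detected(photograph))
```

A printed photograph held up to the camera never produces the open-closed-open sequence, so it fails the check.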
Next, UnMask creates what VisionSphere calls a Holistic Feature Code that is unique to each face and based on the way a human brain processes the face. Instead of measuring facial geometry, it captures a holistic view, or overall impression of the face, and converts it into a template.
Different features — such as the eyes, nose and mouth — are not specifically located or measured. However, those features affect the overall image pattern and the resulting template, also called a face code. UnMask's facial templates are less than 1K in size.
The MapleSight camera is a unique feature of this solution and sets It's Me apart from all other facial-recognition systems we've seen. According to VisionSphere, It's Me is the only system on the market that features a camera and software made by the same company and designed specifically for facial recognition.
The patent-pending camera comes with a CMOS sensor and software that compensates, in real time, for changes in ambient lighting conditions, so the image of the face remains stable.
VisionSphere sent us a demonstration version of It's Me so we could test the system ourselves. We received the MapleSight USB camera and a CD containing a sample version of the UnMask software.
Enrollment takes about a minute to complete. The subject should sit between 18 inches and 5 feet from the camera. The system takes 10 pictures of the face at different angles. A voice prompt directs users to look directly at the camera first, and then at nine different areas of the screen, turning the head slightly for each one. The number of enrollment images required can be set to any number between one and 30, but VisionSphere recommends at least 10.
When we tried to verify our identity, It's Me recognized us on the first try with an 89 percent confidence level and on two subsequent attempts with 100 percent confidence levels. This percentage is a measure of how closely the verification image matches the stored image.
Administrators can adjust the threshold that determines the confidence level that must be reached in order to verify a face and can also set a time limit, ranging from 1 second to 15 seconds, for the system to attempt to authenticate a user before timing out.
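The interplay between the two settings can be sketched as a retry loop. This is an illustration of the logic only: `capture_confidence` stands in for whatever call captures a frame and returns a confidence score, and the function names, polling interval and sample scores are assumptions.

```python
import time


def authenticate(capture_confidence, threshold, timeout_seconds,
                 clock=time.monotonic, sleep=time.sleep):
    """Retry until a capture meets the threshold or the time limit expires."""
    if not 1 <= timeout_seconds <= 15:
        raise ValueError("time limit must be between 1 and 15 seconds")
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        if capture_confidence() >= threshold:
            return True
        sleep(0.05)  # brief pause before the next capture attempt
    return False


# Simulated confidence scores from three successive captures.
readings = iter([60, 70, 89])
print(authenticate(lambda: next(readings), threshold=85, timeout_seconds=5))
```

A stricter threshold makes false acceptances less likely, but it also means a legitimate user may need more attempts, or run out of time, before being verified.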
When we tried to verify our identity wearing clear, frameless glasses, the system authenticated us with confidence levels at 90 percent and above, even though we had not recorded a set of images with glasses. But when we tried it with sunglasses on, the system could not verify us.
Overall, It's Me did a good job of recognizing us when we tried different facial expressions and various head positions. It also did well under slightly different lighting conditions. In fact, the same person can enroll multiple sets of images to accommodate situations in which, for example, the person sits near a window and the lighting changes from day to night.
If the system fails to authenticate a legitimate user, the administrator can click a button to access a Match Result window, which displays the captured image along with an enrolled image and the closest matches stored in the database. The images in the database include previously captured verification images, and in this way, the system "learns" to recognize a person's face over time, similar to the way that speech-recognition software must be "trained."
It's Me can run as a stand-alone system, but VisionSphere is gearing it toward enterprise, client/server applications. The system integrates with Windows 98, 2000 and higher, and Novell Inc.'s Modular Authentication Service and NetWare 6. It is not compatible with Windows NT because that operating system lacks USB support.
PC requirements include an Intel Corp. Pentium II processor or faster, 128M of memory, a 512K cache, a 6M hard drive or larger, and a VGA card.
The bottom line: VisionSphere has created a facial-recognition system that stands out because of its unique features, including the proprietary camera, the Holistic Feature Code and the "trainable" aspect of the software. We were impressed with the ease of setup and use, and also with the system's ability to recognize faces under slightly different lighting conditions, with different facial expressions and at slightly different angles.
Most facial-recognition systems are not out-of-the-box solutions. Instead, they are sold as "engines" that vendors integrate into custom installations or as software developer's kits (SDK) that the customers use to develop applications based on facial recognition.
Either the vendor works with the customer to develop a system designed for that customer's unique needs, or the customer's engineers integrate the engine themselves or build a system around the SDK.
Therefore, we were not able to test a full-fledged facial-recognition solution from the ground up. Instead, VisionSphere Technologies Inc. provided us with a demonstration version of the software and a camera so we could perform hands-on testing.
The other two participating vendors, ImageWare Systems Inc. and Imagis Technologies Inc., walked us through online tutorial demonstrations so we could see how their systems worked.
The three other vendors we contacted — Visionics Corp., Viisage Inc. and BioID AG — told us they were unable to participate. However, ImageWare's Face ID system uses Visionics' FaceIt Identification SDK, giving us a glimpse of the Visionics product.