Law-abiding citizens may be dragged into criminal investigations due to the database’s alarmingly high levels of false positives

Continuing the slow creep towards the ubiquitous “Big Brother” style surveillance of George Orwell’s 1984, the U.S. Federal Bureau of Investigation (FBI) revealed on Monday that its Next Generation Identification (NGI) system had achieved “full operational capability”.

I. Fully Operational

The new effort ties together multiple sources of biometric information, most notably mugshots for facial recognition and fingerprints.  The FBI brags in a press release:

As part of NGI’s full operational capability, the NGI team is introducing two new services: Rap Back and the Interstate Photo System (IPS). Rap Back is a functionality that enables authorized entities the ability to receive ongoing status notifications of any criminal history reported on individuals holding positions of trust, such as school teachers.

Law enforcement agencies, probation and parole offices, and other criminal justice entities will also greatly improve their effectiveness by being advised of subsequent criminal activity of persons under investigation or supervision. The IPS facial recognition service will provide the nation’s law enforcement community with an investigative tool that provides an image-searching capability of photographs associated with criminal identities. This effort is a significant step forward for the criminal justice community in utilizing biometrics as an investigative enabler.

It’s taken the FBI over half a decade to construct its system.  Work on the NGI began in 2006, with a Phase I pilot version launching in early 2011.  In announcing the project, the FBI wrote:

Next Generation Identification is not… A tool to expand the categories of individuals from who the fingerprints and biometric data may be collected, nor will it change existing legal authorities.

It doesn’t threaten individual privacy. As required with any federal system, the FBI is doing Privacy Impact Assessments on what information will be collected, how it will be shared, how it will be accessed, and how the data will be securely stored…all in an effort to protect privacy.

The NGI was an expensive project costing taxpayers billions, much of which went to a variety of high-profile contractors, including International Business Machines Corp. (IBM), BAE Systems plc (LON:BA), and Lockheed Martin Corp. (LMT).  The lucrative payday for military-espionage corporate special interests might be justified, but the question is whether this program is a more limited effort aimed at criminals, or whether it might be the second coming of the U.S. National Security Agency’s (NSA) Orwellian PRISM program.

FBI NGI architecture
The NGI’s backend is driven by IBM supercomputers.

Some aspects of the NGI are certainly praiseworthy and draw little controversy.  For example, it has reduced the time to process high priority criminal ten-fingerprint submissions from 2 hours down to 10 minutes — an order of magnitude speedup.

FBI next-gen fingerprinting
The NGI is paired with the agency’s next-generation fingerprinting technologies.

The FBI’s full legacy criminal fingerprint database has as many as 100 million fingerprints in it.  But only roughly 2 million are stored in this special high-speed database, designed to identify “dangerous” suspects, such as known terrorism affiliates, sex offenders, and fugitives.


The database may also be expanded to include palmprints, an emerging form of biometrics.  However, as with the high-priority database, the palm database would likely be reserved for select groups of suspects.

II. Poor Quality Images of Criminals May Lead to False Flagging of Law-Abiding Citizens

The more contentious aspects of the next-generation biometric criminal database are the facial recognition and advanced biometrics components.  In addition to facial images, the FBI is also reportedly storing iris images and images of identifying marks (scars and tattoos) to help identify persons of interest, both law-abiding and otherwise.

It’s hard to deny that there may be some benefits to the FBI’s increased ability to identify faces.  The FBI’s database of roughly 100 million fingerprints and its large collection of criminals’ DNA have offered key breaks in many cases over the years.

But groups such as the Electronic Frontier Foundation (EFF) are already voicing concern over a number of aspects of the NGI’s facial recognition components.  One concern is that while most of the database’s photos are of current and former criminals, a small but increasing minority of its images are of law-abiding citizens.  Because these two collections (criminal suspects and citizens with clean records) are run through the same identification algorithms, innocent citizens could be unnecessarily implicated in criminal investigations.

Writes the EFF:

NGI will allow law enforcement at all levels to search non-criminal and criminal face records at the same time. This means you could become a suspect in a criminal case merely because you applied for a job that required you to submit a photo with your background check.

While mistaken identification is of course a common problem in a non-digital context, the NGI could greatly increase it by offering up faulty tools.  But how are the tools faulty and who’s to blame?  The answer arguably lies in the states.

FBI NGI detection
The size of the database in records has skyrocketed, but poor data quality may lead to false positives.

So far twenty-six states, a little over half the states in the Union, have signed on to participate in the facial recognition program. The other states haven’t, likely out of concern over civil liberties.  The FBI issued a series of guidelines to participating states, but it essentially accepted images in whatever form each state deemed fit.

A hint at how bad the data quality may be comes from the “Face Report Card”, which the FBI published as part of a more in-depth review conducted with the state of Oregon.

In this publication, it reports that Oregon provided it with 14,408 photos over the review period in 2011.  Of these, most were deemed unacceptable for a variety of reasons.  First, the photos were of too low a resolution.  The program requests that images be at least 0.75 megapixels (less than a smartphone photo).  But most of the photos submitted by the state of Oregon were even lower resolution than that — perhaps VGA quality images.  Further, many were deemed problematic due to non-ideal lighting, background, and interference.
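To put the resolution gap in numbers, here is a minimal sketch in Python.  It assumes “VGA quality” means the standard 640x480 frame; the article only says “perhaps VGA”, so treat that figure as illustrative.  The function names are made up for this example.

```python
# Minimum resolution the program requests, per the report card discussion above.
MIN_MEGAPIXELS = 0.75


def megapixels(width: int, height: int) -> float:
    """Return the pixel count of an image in megapixels."""
    return width * height / 1_000_000


def meets_minimum(width: int, height: int) -> bool:
    """True if the image reaches the requested 0.75-megapixel floor."""
    return megapixels(width, height) >= MIN_MEGAPIXELS


# Assumption: a "VGA quality" photo is 640x480 pixels.
vga = megapixels(640, 480)
print(f"VGA: {vga:.2f} MP, meets minimum: {meets_minimum(640, 480)}")
```

Under that assumption a VGA frame works out to about 0.31 megapixels, well under half of what the program asks for.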

It’s unclear just how many of the NGI’s images are this kind of poor-quality shot.  In 2012 the database housed 13.6 million images of 7 to 8 million individuals.  By 2013 the database grew to 15 million images and by 2015 it’s expected to further expand to 52 million facial images.  The latest metrics indicate that on a daily basis roughly 55,000 new facial images are added to the database and “tens of thousands” of searches are conducted by the FBI and the “18,000 law enforcement agencies and other authorized criminal justice partners” (mostly state, local, and tribal police) on the growing database of images.

III. Civilian Contractors are in for a Headache

A particularly glaring concern is that many of the best images may actually come from non-criminals.  The FBI says it expects to have 46 million criminal images by 2015, but also 4.3 million “civilian” images, meaning pictures of law-abiding citizens.

FBI NGI by states
Roughly half of states are giving the FBI’s facial recognition efforts a helping hand. [Image Source: EFF]

Technically the FBI appears to be keeping its promise not to expand biometrics collection to new groups, as the “civilian” images largely come from groups like federal employees or contractors who already were required to submit fingerprints to the government.  But what is concerning is that in some cases the high-quality face shots of these law-abiding citizens may be compared to millions of low-quality images of criminals.  Such a system is almost guaranteed to create false positives.

But the FBI tries to obfuscate the issue with doublespeak, saying in effect that the system doesn’t make determinations, so it can’t have false positives.  The EFF explains:

Because the system is designed to provide a ranked list of candidates, the FBI states NGI never actually makes a “positive identification,” and “therefore, there is no false positive rate.” In fact, the FBI only ensures that “the candidate will be returned in the top 50 candidates” 85 percent of the time “when the true candidate exists in the gallery.”

It is unclear what happens when the “true candidate” does not exist in the gallery—does NGI still return possible matches? Could those people then be subject to criminal investigation for no other reason than that a computer thought their face was mathematically similar to a suspect’s? This doesn’t seem to matter much to the FBI—the Bureau notes that because “this is an investigative search and caveats will be prevalent on the return detailing that the [non-FBI] agency is responsible for determining the identity of the subject, there should be NO legal issues.”

The question becomes this: if the tool only places the true candidate in its top 50 results 85 percent of the time, and is at its least accurate when it comes to criminal photos (which reviews indicated were unacceptably low-quality images for a variety of reasons), is the database going to violate due process by leading to the harassment of law-abiding citizens?

The EFF doesn’t have a very favorable view of the tool, writing:

Even though FBI claims that its ranked candidate list prevents the problem of false positives (someone being falsely identified), this is not the case. A system that only purports to provide the true candidate in the top 50 candidates 85 percent of the time will return a lot of images of the wrong people.
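The arithmetic behind the EFF’s point can be sketched in a few lines of Python.  This is a back-of-the-envelope model, not a description of how the NGI works internally: it assumes every search returns a full list of 50 candidates (the EFF notes it is unclear what happens when the true candidate is not in the gallery), and the function and parameter names are invented for illustration.

```python
def expected_candidates(searches_in_gallery, searches_not_in_gallery,
                        hit_rate=0.85, list_size=50):
    """Estimate how many returned candidates are right versus wrong.

    hit_rate:  chance the true candidate appears in the list when that
               person is in the gallery (the FBI's stated 85 percent).
    list_size: candidates returned per search (assumed always 50).
    """
    total_searches = searches_in_gallery + searches_not_in_gallery
    total_returned = total_searches * list_size
    # At most one entry per list can be the true candidate.
    true_hits = searches_in_gallery * hit_rate
    wrong_people = total_returned - true_hits
    return true_hits, wrong_people


# Best case: the suspect is in the gallery for all 1,000 searches.
hits, wrong = expected_candidates(1000, 0)
print(f"{hits:.0f} correct candidates, {wrong:.0f} wrong ones")
```

Even in that best case, roughly 98 percent of the faces handed back to investigators belong to someone other than the suspect.  Every search where the suspect is not in the gallery at all adds another 50 wrong faces under this model.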

Is the database more trouble than it’s worth?

IV. What the FBI Isn’t Telling Us

That question grows tougher to answer amid accusations that the FBI is not being forthright about how many civilian records are in its dataset.  If the EFF is correct, it is very possible that you are in the search space even if you’ve never applied for credentials at a federal agency or undergone any other work-related background screening that would place you in the FBI’s dataset.

The first place you might find yourself is in the vaguely defined categories in the FBI set itself.

Close to a million additional facial images of law-abiding civilians could also be in the database by 2015, under the “Special Population Cognizant” (SPC) (750,000 images) and “New Repositories” (215,000 images) categories.  The FBI has been vague about exactly who falls under these groups, but a 2007-era agency document [PDF] unearthed by the EFF seems to indicate that the SPC group will be used as an arbitrary grab-bag which federal partner agencies can use to create groups of civilian or criminal images they feel are relevant to their investigations.  For example, a federal agency might include civilian pictures from their contractors’ keycards as part of their submission.

Because of these poorly defined groups the percentage of non-criminal (civilian) images in the database could be as high as 10 percent or as low as 8 percent — in the set the FBI is acknowledging, at least.  Either way, some may be surprised to find themselves in the database and potentially unnecessarily ensnared in FBI investigations due to erroneous matches.
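Those two percentages follow directly from the figures quoted above.  Here is a quick check in Python; it assumes the SPC and “New Repositories” images are counted inside the projected 52 million total, which is how the article reads but is not spelled out by the FBI.

```python
# Projected 2015 figures quoted in the article.
TOTAL_IMAGES = 52_000_000
CIVILIAN_IMAGES = 4_300_000
SPC_IMAGES = 750_000            # "Special Population Cognizant"
NEW_REPOSITORY_IMAGES = 215_000  # "New Repositories"


def civilian_share(include_vague_categories: bool) -> float:
    """Percentage of the database made up of non-criminal images."""
    civilian = CIVILIAN_IMAGES
    if include_vague_categories:
        # Assumption: every image in these two categories is of a civilian.
        civilian += SPC_IMAGES + NEW_REPOSITORY_IMAGES
    return 100 * civilian / TOTAL_IMAGES


print(f"low estimate:  {civilian_share(False):.1f}%")
print(f"high estimate: {civilian_share(True):.1f}%")
```

The low estimate comes out at about 8.3 percent and the high estimate at about 10.1 percent, matching the range given above.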

But there’s more.  There’s a second set you may belong to.  And this set may be much bigger.

The EFF also warns that the contractor responsible for the facial recognition algorithm — MorphoTrust (formerly L-1 Identity Solutions) — may also effectively search other large federal and state databases in addition to those detailed by the FBI.  MorphoTrust is responsible for the driver’s license databases at 35 of the 50 state Departments of Motor Vehicles (DMVs).  It also provides a facial recognition database for the U.S. Department of Defense (DoD) and yet another database to the U.S. State Department.  The State Department database is the largest officially disclosed government facial recognition database in the world, with 244 million images of over 100 million people.

NGI datasets

It is known that [PDF] the DoD shares its facial recognition data with the FBI, and this data is not believed to be included in the 52 million image total.  Similar sharing may occur with the state DMVs and with the State Department.  The EFF complains:

The FBI failed to release records discussing whether MorphoTrust uses a standard (likely proprietary) algorithm for its face templates. If it does, it is quite possible that the face templates at each of these disparate agencies could be shared across agencies—raising again the issue that the photograph you thought you were taking just to get a passport or driver’s license is then searched every time the government is investigating a crime.

The FBI seems to be leaning in this direction: an FBI employee email notes that the “best requirements for sending an image in the FR system” include “obtain[ing] DMV version of photo whenever possible.”

In other words, the database of faces used by the FBI may only be the tip of the iceberg, a criminal subset of the greater search space.  The true searchable dataset of faces may be primarily civilians, which raises serious questions about why the FBI is accessing that data or, if it’s not accessing it, why it isn’t making that clear to the public.


There’s strong evidence that the NGI is tied to the U.S. Department of Homeland Security’s (DHS) BOSS project, whose goal is to be able to identify every American in public via facial recognition.

And due process issues aside, this influx of civilian records would seemingly make the job of picking out criminals in the already poor state-submitted photo database even harder.

V. Database May Cover Over 100 Million Americans

It’s possible these datasets are not searchable by the FBI, but the lack of transparency, at the bare minimum, is glaring.  The FBI was supposed to conduct regular “Privacy Impact Assessments” to discuss and brainstorm solutions to such issues.  But its last Privacy Impact Assessment was filed in 2008, more than half a decade ago.  As a result of this blackout, it’s unclear what exactly the FBI’s “fully operational” database truly represents.

FBI NGI bigger and better
Bigger, as in “Big Brother”?

The EFF states that the worst-case scenario may indeed not be too far off the mark.  Its initial investigation indicates that the database may hold records on more than 100 million people, or as much as a third of the U.S. population, with civilian facial images assigned a searchable “Universal Control Number” just like photos of criminals.  The EFF writes:

EFF received these records in response to our Freedom of Information Act lawsuit for information on Next Generation Identification (NGI)—the FBI’s massive biometric database that may hold records on as much as one third of the U.S. population. The facial recognition component of this database poses real threats to privacy for all Americans.

But threat or no threat, Americans have little recourse unless they can convince the courts that the program is unconstitutional (good luck with that) or, more likely, convince Congress to more clearly and narrowly define its scope.  At present Congress has failed to adopt any sort of legislation restricting what kinds of civilian biometrics can be collected and whether those biometrics can be searched in a criminal investigation.

Boston bomber
The FBI tried to use facial recognition to ID the Boston bombing suspects, but the system failed.  Will it be more useful for harassing the populace? [Image Source: FBI/Salon]

As a result, if you are an American, you might find yourself pulled in for questioning by police in the near future simply because your photo looked vaguely like a blurry VGA photo of a known criminal.  And as the number of such innocent mistakes grows, so too does the potential for abuse, as law enforcement receives a convenient excuse to pull in and harass whomever they want, be it a political rival or an ex-lover.

Moreover, your taxpayer money will be spent on these mistakes, be they innocent or malicious.  You may ultimately be paying taxes to falsely implicate yourself in a criminal investigation.  It’s easy to see why the EFF believes that it’s cause for concern.
