What can the government do about big data fairness?

Big data analytics are used in making determinations for credit, employment, and even criminal sentencing. Should the government make sure they are used fairly?

Big data has a fairness problem.

Data and software-driven analysis of that data drive decisions in housing, credit and even criminal justice. At a Ford Foundation conference dubbed Fairness by Design, officials, academics and advocates discussed how to address the problem of encoding human bias in algorithmic analysis. The White House recently issued a report on the topic to accelerate research into the issue.

"This is a conversation we can't afford to have at a regular pace. It's a conversation that is moving at the clip and pace of technology, which is unbelievable, so we have to match that pace," said U.S. Chief Data Scientist DJ Patil.

Algorithmic decision-making tools often segment and make assumptions about people based on what they buy or which ads they are most likely to click on.

The FTC released two studies on how big data is used to segment consumers into profiles and interests.

"Data brokers compile thousands, tens of thousands, and hundreds of thousands of bits of information about each and every one of us," said Julie Brill, a former Federal Trade Commissioner who is now a partner at Hogan Lovells. "Most are rather benign," she said, but some segments had troubling names like "urban scramblers" and "ethnic second-city dwellers."

While those names have likely been changed, Brill said that their presence suggests that software designers are encoding categories of race and class into their systems.

There's a flip side, however. Brill said big data can be used to bring more people into the lending system through alternative scoring. If someone doesn't have a credit card, other factors like rent or utility payments could be considered.

U.S. CTO Megan Smith said the government has been "creating a seat for these techies," but that training future generations of data scientists to tackle these issues depends on what we do today. "It's how did we teach our children?" she said. "Why don't we teach math and science the way we teach P.E. and art and music and make it fun?"

Patil said when we think about what training looks like for big data programmers, ethics should be an integral part.

"Ethics is not just an elective, but some portion of the main core curriculum."

Law enforcement applications

Federal officials also noted a need for a data quality push in law enforcement.

"We just have really, really bad crime data right now," said Roy L. Austin, Jr., a senior domestic policy adviser to President Obama. "In this day and age where I have better fantasy football numbers than I have crime numbers, it's a big problem."

When police and community relations in Ferguson, Mo., came to a boiling point in 2014 after the deadly officer-involved shooting, the Obama administration formed a task force to convince police departments to voluntarily release at least three data sets of their crime statistics.

"Data became a central part of the conversation," the U.S. Digital Service's Clarence Wardell III said.

Austin said it is up to police departments to voluntarily collect crime data and report it out; there is no mandate from Congress to do so. He said some agencies that made their data transparent noticed the extent of the diversity in their own agencies and saved money on FOIA requests. Fifty-six police agencies covering 40 million people have participated in the police data initiative, including seven of the 10 largest ones. However, there are 18,000 police departments around the country, so there is a long way to go before all that data is made public.

Additionally, the proliferation of body-worn cameras in police departments is creating new accountability, but also new problems, including data storage and privacy issues. To save time and resources within police departments, Austin said they are looking at using machine learning to sift through data from the videos. In terms of how to balance transparency and privacy issues, Austin said those standards should be up to communities, not the federal government.

"This work is not easy work. It's not easy for a lot of these departments," Wardell said. "It is a new vocabulary and a new language."