How AI-powered bots could drive the conversation on pending federal regs
- By Derek B. Johnson
- Dec 24, 2019
In 2017, bots flooded the Federal Communications Commission's online public comments system with millions of fake missives in support of ending Net Neutrality. Two years later, an experiment by a college senior found that simple Artificial Intelligence tools make it easier than ever to fool humans and computers alike, distorting what is supposed to be a critical avenue of public feedback into another tool of manipulation.
In a paper published this month in Technology Science, Harvard College senior Michael Weiss argues that emerging technology is now capable of developing "highly convincing natural language generation that ... makes it nearly impossible to distinguish whether online speech originated from a person or a computer program." Such speech, dubbed "Deepfake text," demonstrates how federal agencies' reliance on online feedback for their rules and regulations has "become highly vulnerable to automated manipulation by motivated actors."
The FCC incident involved fake comments generated using unsophisticated "search and replace" techniques, wherein the same base comment is repeated over and over again with several key words changed each time. Because of this, it was relatively easy for researchers to spot and separate them from the real, organic feedback.
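To illustrate why template-based comments are comparatively easy to catch, here is a minimal sketch of one common approach: comparing comments by their overlapping three-word sequences (Jaccard similarity over word shingles). This is not the method the FCC researchers are reported to have used; the function names, the 0.5 threshold and the sample comments are all made up for illustration.

```python
import re
from itertools import combinations

def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) found in a comment."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Share of shingles two comments have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def near_duplicate_pairs(comments, threshold=0.5, k=3):
    """Return index pairs of comments whose similarity meets the threshold.

    The threshold is an illustrative assumption, not a tuned value.
    """
    sets = [shingles(c, k) for c in comments]
    return [
        (i, j)
        for i, j in combinations(range(len(comments)), 2)
        if jaccard(sets[i], sets[j]) >= threshold
    ]

# Hypothetical comments: the first two differ by a single swapped word.
comments = [
    "I urge the commission to repeal the burdensome rules imposed on internet providers.",
    "I urge the commission to repeal the harmful rules imposed on internet providers.",
    "My family runs a small shop and we depend on an open internet every day.",
]

flagged = near_duplicate_pairs(comments)
print(flagged)
```

A comment produced by swapping a few key words shares most of its word sequences with its template, so it scores high against its siblings. Text in which every comment is worded differently would share few sequences and pass a check like this one.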
Out of the 22 million comments submitted during the Net Neutrality debate, 17 million (or about 77%) supported repealing the Obama-era rules. When researchers stripped out the likely fakes, only 800,000 authentic responses remained, and they told a dramatically different story about public sentiment. Researchers found the vast majority of the authentic comments supported net neutrality, as high as 99% in some cases.
Machine learning tools, by contrast, can generate nearly infinite variations of fake speech at scale. Using a natural language processing framework developed by OpenAI, a bot program and a proxy server, Weiss turned his experiment towards a public comment website on Medicaid.gov dealing with a proposed waiver for Idaho residents, where it spit out more than a thousand "highly relevant" and unique 75-100 word comments.
Most of the comments opposed the waiver, but others were supportive or neutral, helping to make the illusion more realistic. They ranged from sentiments that residents would be buried in red tape and assertions that the new rules would not meaningfully improve employment among recipients to arguments that the waiver was immoral and would create a new costly government program. There were even comments from bots using their own fake family experiences to make their case. Many displayed completely different writing styles, attitudes and insight.
In all, the fake text made up more than half of the 1,810 comments submitted for the proposal, at which point Weiss voluntarily withdrew them. At a certain point the site began blocking comments from one of the IP addresses, but the others were able to continue their submissions undetected.
In a related survey, 108 humans who had already passed a competency test for spotting bot speech were presented with 26 comments and charged with determining which were created by humans and which were bot generated. The results: their collective accuracy amounted to 49%, or about the same as a random guess.
"This project shows that deepfake comments can be submitted to and accepted by a federal public comment website at scale and, once submitted, cannot be distinguished from other comments by human inspection," Weiss wrote.
The findings point to a potentially ominous future for how the federal government processes feedback from the public online. As Weiss points out, most federal agencies design their comment systems for convenience, with few if any protections in place to prevent spamming by bots.
An investigation by the Senate Homeland Security and Governmental Affairs Committee looking at 14 agencies' processes for submitting public comments found many of the same problems, and it's unclear what policymakers can do to address the problem.
Blocking of IP addresses can be evaded relatively easily, and the more aggressive an agency is, the more it risks doing the very thing it's fighting against: drowning out legitimate voices from citizens. Both the Senate investigation and Weiss recommend implementing CAPTCHA technologies, while other options like two-factor authentication requirements could help, though they're far from foolproof.
Weiss argues that rather than a single permanent solution, it will likely be a long-term battle between technologies that detect and obfuscate such behavior. Still, it may be the best path with the least collateral blowback.
"One could imagine a smorgasbord of policy big sticks with threats and criminal penalties," Weiss wrote. "But society seems better off playing the technology cat-and-mouse game than risking draconian policies that may drive the ability to actually witness imbalances and fix them."
Derek B. Johnson is a senior staff writer at FCW, covering governmentwide IT policy, cybersecurity and a range of other federal technology issues.
Prior to joining FCW, Johnson was a freelance technology journalist. His work has appeared in The Washington Post, GoodCall News, Foreign Policy Journal, Washington Technology, Elevation DC, Connection Newspapers and The Maryland Gazette.
Johnson has a Bachelor's degree in journalism from Hofstra University and a Master's degree in public policy from George Mason University. Follow him on Twitter @derekdoestech.