Every few months or years, it seems, a few news cycles are devoted to how Artificial Intelligence is an existential threat to humanity.
Sometimes it’s a general warning from Stephen Hawking, or Elon Musk founding a new company. Pop culture also makes sapient AIs the villain in things like I, Robot; Terminator; Age of Ultron; or even media aimed at families and young children like WALL-E. These are existential threats that would be truly dangerous but are entirely theoretical. Much like an asteroid destroying life on Earth, we should continue researching and preparing, but we should allocate resources according to the likelihood of the event occurring. We spend more money and effort on healthcare and reducing pollution than we do on preventing asteroid strikes because those are present, widespread, and tractable problems. An asteroid strike is certainly possible, but the odds are slim, while healthcare and pollution are concerns we can address right now.
This fearmongering about a potential sapient AI distracts from the very real, present-day threats from non-sapient AI. Let me break down some of these terms as most experts use them. AI is a bit of a generic term; you rarely hear engineers use it when addressing other engineers. They tend to use phrases like Machine Learning (ML) or Natural Language Processing (NLP). The dangerous, sapient AI of movies and comics is usually referred to as Artificial General Intelligence (AGI): a single intelligence that could theoretically learn any general thought process or task the way a human could.
AGI would have sapience and thus a consciousness (or something much like it). Consciousness is not well understood by humanity; psychiatrists, psychologists, and philosophers can’t even agree on what it fundamentally is. Since we cannot consistently define consciousness, predicting its creation or emergence is difficult at best. While the resources of companies, think tanks, ethicists, and more are dedicated to digging into the issues an AGI would present to humanity, those resources are consequently not being devoted to the ethics of modern AI (ML, NLP, etc.). Those issues get some attention, but usually only to identify problems after the fact without providing solutions, as we saw in Frances Haugen’s Facebook leaks. Unless there’s a dramatic event like Haugen’s whistleblowing, there’s much less focus on this in the news and media, as the algorithm dictating the speed of your Instagram notifications is inherently less compelling than a James Spader-voiced megalomaniac robot.1
The premise behind the risks of modern AI is simple, even if the specifics are endlessly complicated. Modern AIs are variations on ML. ML works by having a program process an unbelievably huge amount of data until it can draw generalized conclusions. If the data contains inaccuracy, bias, or other issues at scale, those issues will be reflected in the AI. Not only reflected, but exacerbated. One of the benefits of ML is that it can learn unbelievably fast: AlphaGo went from “conception” to being the best Go player in the world in a handful of years. That’s a great example of the benefits of AI’s scale and speed. But the same capability meant it took less than a week for Microsoft’s AI chatbot “Tay” to become a Nazi. Facebook’s chatbot acted similarly.
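To make that mechanism concrete, here’s a minimal, hypothetical sketch (synthetic data, a toy scikit-learn model, not any real company’s pipeline) showing how bias baked into historical decisions passes straight through training and into a model’s predictions:

```python
# Toy illustration: a model trained on biased historical decisions learns the bias.
# Everything here is synthetic and hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

skill = rng.normal(size=n)          # a "legitimate" feature
group = rng.integers(0, 2, size=n)  # a protected attribute (0 or 1)

# Historical approvals depended on skill, but past decisions were also
# biased against group 1 -- that bias is now baked into the labels.
approved = (skill + rng.normal(scale=0.5, size=n) - 0.8 * group > 0).astype(int)

model = LogisticRegression().fit(np.column_stack([skill, group]), approved)

# At identical skill, the trained model now scores group 1 much lower.
for g in (0, 1):
    p = model.predict_proba([[0.0, g]])[0, 1]
    print(f"predicted approval probability at average skill, group {g}: {p:.2f}")
```

Nothing in that code “intends” to discriminate; the model simply generalizes from what it was given, at whatever scale you feed it.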
These instances are extra concerning when looked at in combination with other AI applications. A common application of ML is facial recognition. Facebook has had people tagging their friends in photos for over a decade, and for most of that time had users specify which face was whose. This gave the company an unbelievably large data set to train its AI with. Once the AI was advanced enough (i.e., it had trained on a large enough data set), it was able to begin tagging faces automatically. Many users likely didn’t know they were helping train an AI to identify themselves.
And it’s not just Facebook. Google, like many tech companies, is based in the Bay Area and sources a lot of its training data accordingly, which means the bias in its training is reflected in its products. For example, its photo recognition algorithm once classified two Black Americans as gorillas. Despite this, only a few years later Google’s recognition tech was being used by the Pentagon to analyze drone footage. (We can’t be sure it was the same software in both cases, but there seem to have been no fundamental changes to how Google approached AI during that time.) This example is specific to Google but is emblematic of modern tech company AIs: ethical considerations seem so far removed from applications as to be nonexistent.
Putting decision making in the hands of an AI is dangerous because these systems are difficult or impossible to audit. AI algorithms are already, and increasingly, being used in many facets of life: facial recognition, location tracking, sentiment evaluation, gait analysis, and assessments of financial responsibility.
These algorithms are used by banks when deciding whether or not to make a loan. You see this in practice in lots of situations, such as when women receive lower credit offers.
These algorithms are used by police and state surveillance organizations, as well as by corporate surveillance operations that collect data which is often sold to governments, and they enable wrongful arrests. They’re also used in hiring and retention decisions at companies, basing those decisions on AI-measured emotions.
These systems give products enormous scale and reach, but companies scale them without ensuring that auditing scales commensurately. An error rate of around 1% or even 0.1% is a big problem when the customer base measures in the hundreds of millions or billions (0.1% of Facebook’s users is 2.9 million people). Examples aren’t hard to find: even a quick dive into YouTube will turn up many creators who used music completely legally getting their content removed or demonetized due to erroneous automated copyright claims.2
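The back-of-the-envelope math is simple; the user count below is a rough, illustrative figure rather than an official number:

```python
# Even a "tiny" error rate is enormous at platform scale.
users = 2_900_000_000  # roughly Facebook-scale; illustrative, not an official count
for error_rate in (0.01, 0.001):
    print(f"{error_rate:.1%} errors -> {round(users * error_rate):,} affected users")
# 1.0% errors -> 29,000,000 affected users
# 0.1% errors -> 2,900,000 affected users
```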
It seems like an (alleged) slide from an IBM presentation in 1979 applies here: “A computer can never be held accountable, therefore a computer must never make a management decision.”
There are significant problems with how we use AI today. Even so, I really like AI and think it’s a cool technology I’m excited to see evolve! Photo tagging alone has helped me find so many old photos of people in my giant mess of a photo archive. In weightier areas, AI is enabling better cancer detection, despite that AI originally being created to sell pastries!
But its marketing has surpassed its abilities. Removing bias from data sets, or making an AI that can identify that bias without internalizing it, is a huge unsolved problem. Researchers are making progress, but it’s early days. While I’m excited to see how this area evolves, we should be cautious. We should allow wide-ranging research but strongly restrict applications until substantial testing proves their safety. Tech has a history of ignoring long tails of users, but for AIs that increasingly run society we can’t afford to be that lax.
First and foremost, we should require transparency about AI training data. Companies wouldn’t have to share their proprietary datasets, but they should have to share enough data and information about how they train their AI that independent researchers can validate their claims.
One significant consideration with AI is how it enables new uses of data that were difficult to foresee. Even a decade ago privacy advocates were warning against facial recognition and the like, but few warned against deepfakes, where a short video of you could be used to place you convincingly in situations that never happened. Whether it’s your face believably put on a body in a porn scene or the President giving a speech that never occurred, those possibilities were difficult to foresee. As such, we should be careful with Personally Identifiable Information (PII) going forward. Personal data should be treated like nuclear waste: we shouldn’t stop generating it, as the product is (generally) worth it, but we should make sure the handling of that data is highly controlled to avoid contaminating the environment.
An example from history we could adapt to AI is the Nutrition Facts label found on almost all food available for sale. While not required at your local farmers market, companies selling food at scale must include basic facts about the product you’re going to ingest. The overhead to the companies is minimal (they should already know what’s in the food they’re selling) and the presentation to consumers is simple and straightforward. Similar labels could be applied to AI. They might disclose how the AI drives engagement, how its training dataset was acquired and sanitized, what effects the algorithm has on the final product, and what data is gathered to continually train it.
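To show how lightweight such a label could be, here’s a purely hypothetical sketch of what a machine-readable version might contain; every field and value below is invented for illustration and isn’t drawn from any existing standard:

```python
# Hypothetical "AI Facts" label -- field names and values are invented examples.
AI_FACTS_LABEL = {
    "product": "ExampleFeed recommender",  # made-up product name
    "engagement_objective": "maximize time spent and interactions",
    "training_data_sources": ["user activity logs", "licensed third-party data"],
    "data_sanitization": "deduplicated; PII removed; no independent bias audit",
    "algorithmic_effects": "ranks, filters, and reorders every item shown in the feed",
    "ongoing_data_collection": ["clicks", "watch time", "coarse location"],
}
```

Like a Nutrition Facts panel, the point isn’t exhaustive detail; it’s a consistent, comparable summary that a consumer or regulator can read at a glance.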
Any company or app generating a feed via algorithm (Twitter, Facebook, YouTube, and TikTok, but also less obvious cases like LinkedIn and Reddit) should be held responsible for what it publishes. As I call out in my piece on free speech, these companies are presently more analogous to newspaper editors than they were when our current computer laws were conceived. With almost endless content to choose from, deciding what to amplify is inherently an editorial decision, and the company should be held accountable for the editorial decisions of its AI, the same as it would be for a human editor.3
Certain levels of AI-driven products should be age-gated the same as alcohol, nicotine, and R-rated movies. As a country we tend to let people make their own decisions about the content and substances they consume, but because the bodies and brains of those under 21 are still developing, we restrict things that are perfectly allowable once you turn 21 (or 18). When certain AI-driven products are highly correlated with, for example, increased eating disorders and depression in teenage girls, those products and ones like them should be age-gated to prevent this outsized harm.
AI is promising, but young and poorly understood. The fears around it driven by celebrities and media are far removed from the actual problems AI is already causing. While we wait worriedly for our phones to one day tell us “I’m afraid I can’t do that, Dave,” we hand over huge swaths of society to the much less sapient, but much more dangerous, ML algorithms we already have.
1. Yes, I know I’m the only person that really enjoys the Age of Ultron movie. Don’t @ me.
2. To be fair, the DMCA, overzealous takedown reports, and the legal fuzziness of “fair use” all play a part here, but the AI component is a significant contributor.
3. This would require substantial changes to Section 230 in the US. I get more into the weeds on that in my free speech piece.