Verified by Psychology Today


Racist Robots? Part of the Problem, Part of the Solution

Stereotypes and prejudices can be readily learned by both people and AI devices.

Key points

  • Social biases and prejudices are destructive. They shred the very fabric of this pluralistic society and fuel the fires of bigotry and disorder.
  • Both AI and humans can acquire biases against a variety of social groups.
  • Recent efforts have been made to remove bias from textual and visual information that might negatively impact public opinion.

The simplest form of associative learning is Pavlovian conditioning. By means of this basic behavioral process, humans and other organisms come to appreciate the reliable relationships that hold between the myriad events in the world—what has been called the "causal texture" of the environment. The key feature of Pavlovian conditioning is that the frequent conjunction of two stimuli can be sufficient to establish a durable connection between them.

We generally think of Pavlovian conditioning as involving rudimentary reflexes like the blinking of the eye or the secretion of saliva from glands in the mouth. However, the association between stimuli can also arouse emotional or affective reactions, some of them quite subtle and outwardly unobservable by others.

Many of these subtle affective responses—so-called "implicit biases"—reflect a range of societal stereotypes and prejudices. We would certainly never wish to admit to having such negative feelings toward others; indeed, we ourselves may not even be aware of having them. Nevertheless, highly effective behavioral techniques can reveal these otherwise latent biases. The Implicit Association Test (IAT) is one such technique.

The IAT asks people to press one of two computer keys in order to sort words from four categories: for example, names of women, names of men, words connected with the home, and words connected with the office. When the categories of woman and home are assigned to one key and the categories of man and office are assigned to a second key, people respond far faster than when woman and office share one key and man and home share the other. This marked disparity in response speed suggests the presence of an implicit gender bias or stereotype. Since 1998, the IAT has provided unique glimpses into many of the unpleasant beliefs that we may harbor.
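To make the idea of a "disparity in response speed" concrete, here is a minimal sketch in Python. It is only an illustration: the reaction times are invented, the function name `latency_gap` is hypothetical, and dividing the gap by the spread of all trials is a simplification rather than the scoring procedure IAT researchers actually use, which involves additional steps omitted here.

```python
from statistics import mean, stdev


def latency_gap(congruent_ms, incongruent_ms):
    """Compare response times between the two key assignments.

    congruent_ms:   times (ms) when stereotype-consistent categories
                    share a key (e.g., woman + home, man + office).
    incongruent_ms: times (ms) when stereotype-inconsistent categories
                    share a key (e.g., woman + office, man + home).

    Returns the raw gap in milliseconds and the same gap divided by the
    standard deviation of all trials combined (a simplified,
    assumption-laden stand-in for a standardized score).
    """
    raw_gap = mean(incongruent_ms) - mean(congruent_ms)
    spread = stdev(list(congruent_ms) + list(incongruent_ms))
    return raw_gap, raw_gap / spread


# Invented reaction times for one hypothetical participant.
congruent = [612, 580, 655, 601, 590, 634]
incongruent = [798, 842, 760, 815, 779, 830]

raw_gap, standardized_gap = latency_gap(congruent, incongruent)
print(f"Raw gap: {raw_gap} ms; standardized gap: {standardized_gap:.2f}")
```

A positive gap means the participant was slower when the pairing ran against the stereotype, which is the pattern the IAT treats as evidence of an implicit association.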

Of course, no one is born holding these beliefs and biases—they must be learned. One of the principal revelations of Pavlovian conditioning is that some profoundly maladaptive behaviors, such as phobias, may arise from uniquely personal experiences. Might many of society’s most harmful racist and sexist biases be similarly acquired?

The answer seems to be "no." These and other social biases—e.g., age, disability, weight, and wealth—are far too widespread to have been the result of so many different individuals sharing so many common experiences. So, where might such prevalent prejudices originate?

Research on AI and bias

One fascinating clue comes from an unlikely source. Consider these jarring headlines from a spate of recent news releases: “Even artificial intelligence can acquire biases against race and gender.” “Robots trained on AI became racist and sexist.” “Rise of the racist robots: How AI is learning all our worst impulses.” “Artificially intelligent robots perpetuate racist and sexist prejudice.” How could these cold, calculating machines possibly have become biased and prejudiced?

The answer turns out to be both straightforward and revealing. The "brains" of most AI devices use algorithms that have been programmed to detect covariations of the very sort that our own brains have evolved to detect amidst the noise of innumerable unrelated stimuli.

For computers, different tests for bias have been devised; one is the Word-Embedding Association Test (WEAT). When researchers examined hundreds of billions of English-language words on the internet, some pairs of words proved to be more strongly associated within a 10-word reading frame than chance alone would predict. The WEAT uses these associations to reveal implicit biases, much as the IAT uses human reaction times. Critically, the WEAT and the IAT have been found to strongly agree with one another on a wide variety of human biases, including racial prejudice.
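For readers curious about how such a test can be computed, the sketch below shows one common way of writing the WEAT effect size: each target word's average similarity to one set of attribute words minus its average similarity to the other set, compared across two groups of target words. Everything specific here is an assumption made for illustration: the two-dimensional vectors in `TOY_VECTORS` are invented (real word embeddings have hundreds of dimensions and are learned from text), the word lists are hypothetical, and the use of the sample standard deviation is one reasonable choice rather than a confirmed detail.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word, attrs_a, attrs_b, vec):
    """How much closer `word` sits to attribute set A than to set B."""
    to_a = mean([cosine(vec[word], vec[a]) for a in attrs_a])
    to_b = mean([cosine(vec[word], vec[b]) for b in attrs_b])
    return to_a - to_b


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b, vec):
    """Standardized difference in association between two target sets."""
    s_x = [association(x, attrs_a, attrs_b, vec) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b, vec) for y in targets_y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Invented 2-D "embeddings"; purely hypothetical values for illustration.
TOY_VECTORS = {
    "amy": (0.9, 0.1), "lisa": (0.8, 0.3),
    "john": (0.1, 0.9), "paul": (0.2, 0.8),
    "home": (1.0, 0.2), "family": (0.9, 0.3),
    "office": (0.2, 1.0), "salary": (0.3, 0.9),
}

effect = weat_effect_size(
    ["amy", "lisa"], ["john", "paul"],      # target word sets
    ["home", "family"], ["office", "salary"],  # attribute word sets
    TOY_VECTORS,
)
print(f"Toy WEAT effect size: {effect:.2f}")
```

In this toy setup, a positive value means the first group of names sits closer to the home-related words and the second group closer to the office-related words. The number itself carries no empirical meaning, because the vectors were made up.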

Such malignant stereotypes are not limited to verbal stimuli. Visual stimuli and their verbal captions have also been found to reveal strong racist and sexist biases.

In short, across a broad range of verbal and visual stimuli, the same biases can be detected—by both artificial devices and human adults. To riff off Shakespeare’s famous lines from his drama, Julius Caesar, the fault is not in ourselves, but in our stars—here, in the innumerable words we’ve read and pictures we’ve seen every day of our lives.


These results have important implications. First, no matter what our individual experiences may have been (good, bad, or innocuous), cultural forces most assuredly overwhelm us with stereotypic information. Given this fact, even the most promising training programs devised so far face an uphill task in moderating these biases; as soon as those efforts end, we are once again subjected to the same cultural forces that promoted the biases in the first place.

Second, and following from the first point, many biases have proven to be exceptionally durable. As measured by extensive textual analysis, some have endured for as long as 200 years.

Third, there are nonetheless encouraging signs that the severity of some of these biases may be declining; implicit attitudes measured at the population level by the IAT changed progressively and durably from 2007 to 2020, in the direction of decreasing prejudice. This research looked at anti-gay bias, race bias, skin tone bias, age bias, disability bias, and body weight bias. Interestingly, the bias that showed the greatest reduction was anti-gay bias. Race bias and skin tone bias also fell, but to a lesser degree. However, age bias, disability bias, and body weight bias did not decline. The cultural factors behind these results are doubtless complex and require further study.

Finally, recent efforts have been launched to more rapidly de-bias the textual and visual information that might most negatively impact public opinion. Several approaches are being deployed to achieve more equitable representations of individuals and groups in the output of AI programs. Among them are efforts to minimize the influence of sensitive social attributes on what learning algorithms produce, as well as to craft "bias-detecting" programs that root out and eliminate biases as soon as they are detected.

Our social systems have evolved over a very long period of time, making the task of overcoming existing biases exceptionally challenging. These biases shred the very fabric of our pluralistic society and fuel the fires of bigotry and disorder. Recognizing that AI may inadvertently reflect and perpetuate those biases, we can now more optimistically project that innovative efforts might more quickly help to minimize the toxic effects of bias and prejudice on our individual and cultural behaviors.

We must nonetheless be clear-eyed as to the possible effectiveness of many such AI reprogramming efforts. Implicit racial bias has now been detected in children as young as 4 years of age. This raises the possibility that implicit biases can be acquired by even younger children watching television or being read to—a prospect that has yet to be addressed by AI reprogramming projects.

Tony Greenwald and Mahzarin Banaji graciously helped me prepare this story.

More from Edward A. Wasserman Ph.D.
More from Psychology Today