
Verified by Psychology Today


Google’s April Fools’ Pigeon Prank Proves More Than a Joke

We gave a small flock of pigeons a different classification test in real life.

Google’s 2002 April Fools’ Day joke purportedly disclosed that its popular search engine was not actually powered by artificial intelligence, but instead by biological intelligence. Google had deployed bunches of birds, dubbed pigeon clusters, to calculate the relative value of web pages because they proved to be faster and more reliable than either human editors or digital computers.

The joke hinged on the silliness of the premise – but the scenario does have more than a bit of the factual mixed in with the fanciful.

A screenshot of Google's explanation of how PigeonRank supposedly worked.
Source: Google

The prank had taken a page out of 20th-century behaviorist B. F. Skinner’s operant conditioning playbook by allegedly teaching pigeons to peck for a food reward whenever the birds detected a relevant search result.

It also adapted Victorian polymath Francis Galton’s vox populi – or the voice of the people – principle by purportedly putting the web search task to something of a vote. The more the flocks of pigeons pecked at a particular website, the higher it rose on the user’s results page. This so-called PigeonRank system thus rank-ordered a user’s search results in accord with the pecking order of Google’s suitably schooled birds.

More than a decade later, we integrated elements of this spoof into our own serious research project using a real mini-flock of four pigeons. Our research team included a pathologist, a radiologist, and two experimental psychologists.

The test chamber provided pigeons with an image to classify for the reward of a food pellet.
Source: PLoS ONE 10(11): e0141357, CC BY

Exploiting the well-established visual and cognitive prowess of pigeons, we taught our birds to peck either a blue or a yellow button on a computerized touchscreen in order to categorize pathology slides that depicted either benign or cancerous human breast tissue samples.

In each training session, we showed pigeons several slides of each type in random order on the touchscreen. Pigeons first had to peck the pathology slide multiple times – this step encouraged the birds to study it. Then the two report buttons popped up, one on either side of the tissue sample. If the tissue sample was benign and the pigeons pecked the "benign" report button, or if the tissue sample was malignant and the pigeons pecked the "malignant" report button, then they received a food reward. However, if the pigeons chose the incorrect report button, then no food was given.

After two weeks of training, the pigeons attained accuracy levels ranging between 85 and 90 percent correct. Granted, this accomplishment falls short of their reading human text – although time will tell if that too is within the ken of pigeons – but the pigeons were quite able to make such highly accurate reports despite considerable variations in the magnification of the slide images.

We went on to test the pigeons with brand-new images to see if the birds could reliably transfer what they had learned; this is the key criterion for claiming that they'd learned a generalized concept of "benign/malignant tissue samples." Accuracy on the familiar training samples averaged around 85 percent correct, and accuracy on the novel testing samples was nearly as high, averaging around 80 percent correct. This high level of transfer indicates that rote memorization alone cannot explain the pigeons’ categorization proficiency.

Pigeons were able to generalize the skill of classifying tissue samples.
Source: PLoS ONE 10(11): e0141357, CC BY

Finally, we put Google’s PigeonRank proposal to the test. With an expanded set of breast tissue samples, we compared the accuracy of each of the four pigeons against the "wisdom of the flock," a technique we termed "flock-sourcing." To calculate these "flock" scores, we assigned each trial a score of 100 percent if three or four pigeons responded correctly, and a score of 50 percent if exactly two pigeons responded correctly. There was never a trial on which three or four pigeons responded incorrectly.

The accuracy scores of the four individual pigeons were 73, 79, 81 and 85 percent correct. However, the accuracy score of the "flock" was 93 percent, thereby exceeding that of every individual bird. Pigeons thus join people in evidencing better wisdom from crowds. Playing on Galton’s original term, you might call this vox columbae – or the voice-of-the-pigeons principle.
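The flock-scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, the five example trials are invented for demonstration, and the score of 0 for a trial with fewer than two correct birds is an assumption (the article reports that such trials never occurred).

```python
from typing import List, Sequence


def trial_score(correct: Sequence[bool]) -> float:
    """Score one trial for a flock of four birds.

    100 if three or four birds were correct, 50 if exactly two were.
    The 0 for fewer than two correct is an assumption; the article
    says that case never came up.
    """
    n_correct = sum(correct)
    if n_correct >= 3:
        return 100.0
    if n_correct == 2:
        return 50.0
    return 0.0


def flock_score(trials: Sequence[Sequence[bool]]) -> float:
    """Average the per-trial flock scores (percent correct)."""
    return sum(trial_score(t) for t in trials) / len(trials)


def individual_accuracies(trials: Sequence[Sequence[bool]]) -> List[float]:
    """Percent correct for each bird, taken column by column."""
    n_birds = len(trials[0])
    return [
        100.0 * sum(t[bird] for t in trials) / len(trials)
        for bird in range(n_birds)
    ]


# Invented data: each row is one trial, each column one pigeon
# (True means that bird pecked the correct report button).
example_trials = [
    [True, True, True, True],
    [True, True, True, False],
    [True, False, True, False],
    [True, True, False, True],
    [False, True, True, True],
]

print(individual_accuracies(example_trials))  # [80.0, 80.0, 80.0, 60.0]
print(flock_score(example_trials))            # 90.0
```

Even in this toy data set the flock score exceeds every individual bird's accuracy, because the birds' errors rarely fall on the same trial.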

Although all of this may seem to be a bit of feathery fluff, over the past several years our report has resonated across several fields, going beyond pathology and radiology to include the burgeoning realm of artificial intelligence. It has been recognized in several articles including one quoting Geoff Hinton, a key figure behind modern AI and Turing Award winner: “The role of radiologists will evolve from doing perceptual things that could probably be done by a highly trained pigeon to doing far more cognitive things.” In other words, machines may eventually be programmed to match what pigeons can do, leaving the more interesting and challenging tasks to humans.

What began as an elaborate April Fools’ prank has thus proved to be more than a joke. Never underestimate the brains of birds. They’re really brainy beasts.

This piece was originally published in The Conversation.

More from Edward A. Wasserman Ph.D.
More from Psychology Today