- Disclosing that advice stems from AI doesn't reduce its corrupting influence on human behavior.
- Even with transparency, users may embrace AI advice for self-interest, raising ethical concerns.
- Experiments help to test the behavioral impact of proposed AI legislation.
by Nils Köbis, Margarita Leib, Rainer Rilke, Marloes Hagens, and Bernd Irlenbusch
Artificial intelligence (AI) is integrated into many aspects of our lives.1 It plays a significant role in improving medical diagnoses,2,3 makes everyday tasks more efficient, and is being used in efforts to reduce inequalities in our society.4 However, AI also comes with new dangers. It reproduces existing biases,5 facilitates the spread of misinformation, and can be abused for corrupt purposes.6
The European Union's AI Act, the White House's AI Bill of Rights, and coordinated efforts by the G7 show that governments are working to address these risks.7 One of the most advanced of these initiatives is the European Union’s AI Act.8 The AI Act classifies AI systems into risk categories based on how likely they are to cause harm. It then proposes tailored regulations for each category to minimize these risks. For generative AI systems such as large language models (LLMs; think ChatGPT or Google’s Bard), the AI Act mandates transparency. Put simply, whenever people encounter or interact with these AI models, the system's output should be clearly identified as originating from AI, so that users are aware of its source.
Transparency to Ensure Ethical AI
Transparency, meaning the disclosure of relevant information to consumers and users, is a popular tool in the regulators' toolbox,9 also beyond the context of AI. Policymakers demand transparency to keep people from harming themselves. Consider the warnings on cigarette packs that highlight the dangers of smoking or the nutritional information on sugary foods and beverages. Transparency is also used to encourage people to reduce harm to others and act more ethically. The guiding principle of these transparency-centered policies is the assumption that access to information empowers individuals to make informed and morally sound decisions. The same logic now features in the AI Act to reduce the potential for AI systems to steer people toward harmful, unethical actions. The idea is that disclosing that a certain output was generated by an LLM will lead users to adjust their behavior accordingly.
People are interacting with LLMs ever more frequently. Individuals turn to ChatGPT for a wide array of requests, from seeking help with homework and recipe suggestions to grappling with ethical dilemmas. Such advice, at times, can go sour. Headlines have reported that AI advised a child to insert a penny into a power socket,10 provided instructions on shoplifting and creating explosives,11 and suggested that users end their relationships.12 From the user's standpoint, the impact of such guidance can be profound, sometimes resulting in life-altering decisions. For instance, reports have surfaced that a woman divorced her husband following ChatGPT’s advice to do so.13 As early as the 1960s, AI pioneers like Joseph Weizenbaum warned about the risks of people ascribing human qualities to computer-generated advice.14 As these risks are even more pertinent today, regulations demanding transparency have been proposed to counteract them. The pivotal question remains: Does transparency actually work?
Behavioral Experiments to Test the Effect of Transparency
One way to find out is to conduct empirical research on transparency’s effectiveness. Our recent study15 sheds the first light on whether transparency mitigates the ethical risks of people following unethical AI advice. In this study, participants confronted an ethical dilemma: to opt for honesty or to tell a lie to boost their payment. Before making their decision, they were presented with either AI-generated advice or advice written by a human. One group remained unaware of whether the advice originated from AI (specifically the open-source model GPT-J) or a fellow human. Meanwhile, the other group was informed about the advice's source. This setup allows us to empirically examine the effect of transparency.
Let us first examine the results when participants did not know the advice source. Here, people behaved similarly after receiving AI and human advice, which indicates that AI can produce advice that is as influential as advice written by humans. When AI encouraged participants to be dishonest, they followed suit. To test the effectiveness of transparency, one needs to compare participants who know the advice source with those who do not. If transparency works, participants should follow AI advice to a lesser extent when they know the advice source than when they do not. However, that is not what the study finds. Instead, the results indicate that people follow AI advice to the same extent whether or not they know it was generated by AI.
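The comparison logic described above can be illustrated with a minimal sketch. It assumes a simplified setup in which each participant is coded as either following or not following dishonesty-promoting AI advice, and it uses a standard two-proportion z-test; this is not the analysis used in the study, and the function name and all counts below are hypothetical placeholders, not the study's data.

```python
from math import erf, sqrt


def two_proportion_z_test(followed_a, n_a, followed_b, n_b):
    """Compare the share of advice followers in two groups.

    Returns the difference in proportions, the z statistic, and a
    two-sided p-value based on the normal approximation.
    """
    p_a = followed_a / n_a
    p_b = followed_b / n_b
    # Pooled proportion under the null hypothesis of no difference
    pooled = (followed_a + followed_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Standard normal CDF expressed through the error function
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return p_a - p_b, z, p_value


# Hypothetical placeholder counts (NOT results from the study):
# group A does not know the advice source, group B is told it is AI.
diff, z, p_value = two_proportion_z_test(80, 100, 78, 100)
print(f"difference={diff:.3f}, z={z:.3f}, p={p_value:.3f}")
```

If transparency worked as intended, the informed group would follow the advice noticeably less often, giving a clearly positive difference and a small p-value. A difference close to zero, as in these placeholder numbers, corresponds to the pattern the article describes.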
The study reveals a crucial finding: Transparency is not a sufficient intervention. This finding calls into question the assumption behind the transparency policy. Namely, people do not adjust their behavior once they know they are advised by AI. When people face a financial temptation to behave in a certain way (e.g., lie for financial profit), they are motivated to obtain information that encourages such behavior. Receiving such encouraging information from an AI system is enough to push people toward self-serving, unethical behavior.
At their core, transparency policies place the responsibility for alleviating (ethical) harm on the individual.16 Going back to the examples above, it is the individual who is expected to recalibrate their buying habits, change consumption choices, or even stop their habits in response to energy labels, sugar and calorie content information, and alerts about smoking risks. A similar assumption holds for AI transparency policies. Here, it is the user engaging with AI systems, such as LLMs, who is responsible for adapting their behavior once they know they are interacting with AI. Take popular AI language models like ChatGPT or Google’s Bard. Such models come with a warning about the AI producing potentially “inaccurate or offensive information,” on the assumption that users will factor it into their decision-making. Yet, our findings show that such disclaimers will likely not help when someone is motivated to leverage AI output to further their self-interest.
Looking into the future, our relationships with LLMs are rapidly advancing. People who once did not use such technology are now integrating it into their everyday lives, and those who have already embraced LLMs are increasingly forming meaningful “synthetic relationships” with AI models. These meaningful, ongoing relationships entail AI models that “remember” previous conversations and preferences and customize their output to the specific user. As the intricate interplay between AI and humans is ever-growing, it is crucial to understand how people incorporate AI output into their (ethical) decision-making.
1. Rahwan, I. et al. Machine behaviour. Nature 568, 477–486 (2019).
2. Soenksen, L.R. et al. Using deep learning for dermatologist-level detection of suspicious pigmented skin lesions from wide-field images. Sci Transl Med. 13, (2021).
3. Agarwal, N., Moehring, A., Rajpurkar, P. & Salz, T. Combining Human Expertise with Artificial Intelligence: Experimental Evidence from Radiology. (2023) doi:10.3386/w31422.
4. Noy, S. & Zhang, W. Experimental evidence on the productivity effects of generative artificial intelligence. Science 381, 187–192 (2023).
5. Bender, E. M., Gebru, T., McMillan-Major, A. & Shmitchell, S. On the dangers of stochastic parrots. in Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (ACM, 2021). doi:10.1145/3442188.3445922.
7. Habuka, H. The path to trustworthy AI: G7 outcomes and implications for global AI governance. Center for Strategic and International Studies https://www.csis.org/analysis/path-trustworthy-ai-g7-outcomes-and-implications-global-ai-governance (2023).
10. BBC News. Alexa tells 10-year-old girl to touch live plug with penny. BBC (2021).
12. Roose, K. A Conversation With Bing’s Chatbot Left Me Deeply Unsettled. The New York Times (2023).
13. Kessler, A. A woman used ChatGPT to decide whether to leave her husband. 80lv (2023).
14. Tarnoff, B. ‘A certain danger lurks there’: how the inventor of the first chatbot turned against AI. The Guardian (2023).
15. Leib, M., Köbis, N., Rilke, R. M., Hagens, M. & Irlenbusch, B. Corrupted by Algorithms? How AI-Generated and Human-Written Advice Shape (DIS)Honesty. The Economic Journal. uead056 (2023).
16. Chater, N. & Loewenstein, G. The i-frame and the s-frame: How focusing on individual-level solutions has led behavioral public policy astray. Behav Brain Sci. 1–60 (2022).