- Large language models (LLMs) like ChatGPT can produce effective summaries of text.
- ChatGPT was able to provide a good summary of the replication crisis in psychology and proposed solutions, an area the author is an expert in.
- Using large language models to produce basic summaries of text can speed up writing, but the output misses the elements of style that matter in technical writing.
There has been a flood of articles about new artificial intelligence (AI) models that can generate text—including entire essays—for you. I’ve worked with AI and written a bit about it in the past, but I really didn’t want to write another piece about ChatGPT. They inevitably feel a bit breathless and awe-struck, as if this new generation of large language models (LLMs) overwhelms the typical journalistic snark. I know, in broad strokes, how these models detect and reproduce patterns they’ve encoded from millions of training examples (as one commenter I saw put it: the entire internet, to a first approximation). I don’t believe that the ability to encode and reproduce even the very complex patterns seen in human language amounts to thought or sentience. (For an account of what it means to be alive, I’d recommend the work of philosopher Francisco Varela.)
And yet. Here I am, writing about ChatGPT. The thing is, it’s very impressive. Playing with it recently, I asked ChatGPT to generate the kind of content I regularly produce for this blog: an essay on the replication crisis in psychology and proposed solutions. Here’s (some of) what I got:
What is the replication crisis in psychology?
ChatGPT: The replication crisis in psychology refers to a growing concern that a significant portion of research in psychology may not be replicable or reliable. This means that when other researchers attempt to reproduce the results of a particular study, they may not get the same results, or they may find weaker effects than those reported in the original study.
This crisis was first identified in the early 2010s, when several high-profile studies in psychology were found to be difficult or impossible to replicate. Since then, numerous studies have attempted to replicate findings from a range of psychological subfields, and the results have been mixed.
The description goes on from there, but it’s pretty good. It’s a concise summary that hits most of the major points I’d make if asked to give someone a quick primer. It’s a bit colorless, reading like a Congressional Research Service report, but it’s certainly sufficient for getting the facts across. Here’s more:
What solutions have been proposed for the replication crisis?
ChatGPT: Various solutions have been proposed to address the replication crisis in psychology, and these solutions are aimed at improving the reliability and replicability of psychological research. Here are some of the proposed solutions:
- Increase sample sizes: Many studies suffer from small sample sizes, which can lead to unreliable and non-replicable findings. Increasing sample sizes can help to improve statistical power and reduce the risk of false positive results.
- Pre-register studies: Researchers can pre-register their studies, which involves specifying their research questions, hypotheses, and methods before conducting the study. This can help to reduce the risk of p-hacking and other questionable research practices, as the researchers cannot change their methods or analyses after the fact.
- Open data and materials: Sharing data and materials openly can help to increase transparency and facilitate replication. This allows other researchers to verify the results and conduct their own analyses.
- Replication studies: Conducting replication studies can help to determine the reliability of previous findings. Researchers can also conduct meta-analyses, which combine the results of multiple studies to determine the overall effect size.
- Improve statistical methods: There is a need to improve the statistical methods used in psychology research, including the use of appropriate effect size estimates, Bayesian methods, and multilevel modeling.
- Improve peer review: Peer review is an essential part of the scientific process, and there is a need to improve the quality and rigor of peer review in psychology. This includes ensuring that reviewers have sufficient expertise in the relevant areas and are able to provide constructive feedback.
- Foster a culture of transparency and openness: The replication crisis highlights the need for a culture of transparency and openness in psychology research. This includes encouraging researchers to share data, materials, and methods, as well as to acknowledge and report null results.
That’s not bad either. It sounds like someone who has visited the Center for Open Science website, or has attended the Metascience conference, and is providing the main talking points. I wouldn’t mind seeing this list posted in the labs of social psychologists across a few universities I know.
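As an aside, the first item on that list—statistical power—is easy to make concrete with a quick simulation. The numbers below (a true effect size of d = 0.4, samples of 20 versus 200 per group, and a simple two-sample z-test) are my own illustrative choices, not anything specific to the studies discussed here; the point is just how sharply the chance of detecting a real effect depends on sample size:

```python
import math
import random
import statistics

random.seed(1)

def power_estimate(n, d=0.4, alpha=0.05, trials=2000):
    """Estimate the power of a two-sample z-test by simulation:
    the fraction of experiments that detect a true effect of size d
    with n participants per group."""
    hits = 0
    for _ in range(trials):
        # Two groups drawn from normal distributions that truly differ by d
        a = [random.gauss(0.0, 1.0) for _ in range(n)]
        b = [random.gauss(d, 1.0) for _ in range(n)]
        se = math.sqrt(statistics.variance(a) / n + statistics.variance(b) / n)
        z = (statistics.mean(b) - statistics.mean(a)) / se
        # Two-sided p-value from the normal CDF
        p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
        if p < alpha:
            hits += 1
    return hits / trials

print(f"n = 20 per group:  power is about {power_estimate(20):.2f}")
print(f"n = 200 per group: power is about {power_estimate(200):.2f}")
```

With ten times the sample, the same real effect goes from usually missed to almost always detected—which is exactly the unreliability that the "increase sample sizes" recommendation is meant to address.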
Because I’m friends with educators, one of the first concerns I heard discussed about this technology was the potential for it to make cheating too easy for students. Having used it, I understand the concern. It took minutes to generate that text, whereas it would have taken me over an hour (plus a decade of becoming familiar with the area!) to produce it myself. If all I needed to do was get information about a topic across—like in a class essay—I could skip doing the work myself.
On the other hand, I wouldn’t be proud of the writing generated here. I like good writing, and I like trying to polish my own writing. This doesn’t meet my standard. What I could see doing, though, is using this as the basis for an article that I then polish up further. This has already been recommended as an approach for teachers: have students generate a first draft of the essay using ChatGPT, then work on the craft of writing by editing it. This seems like a sensible solution for some situations.
However, I think there’s something that’s still missing about the intellectual quality of the writing that might require more than just reworking language. Good writing is good thinking, and that implies that you can’t really take the deep thought out of the process. For example, while I might have produced the list of solutions above three years ago when I started blogging about this topic, I’d now be much more inclined to discuss the problems with theory in psychology and the lack of formal modeling. That’s something that often gets short shrift in the mainstream discourse about replication, but that I’ve learned to appreciate through interacting with other scientists online (sometimes in response to something I posted here!). Recognizing those good ideas, and the flaw in the consensus narrative, is something that ChatGPT might not be capable of. Instead, as a pattern recognizer and reproducer, it will spit back out the “common sense” of the field, without any deeper analysis.
That’s not to say that ChatGPT can’t elicit feelings. I’ll end this post by asking the AI to reflect on what role my blogging—with nearly 100 posts and over 500,000 views—has had on the topic I’ve devoted so much of my professional life to:
What role has Alexander Danvers played in the replication crisis?
ChatGPT: As an AI language model, I do not have access to information about individuals who are not public figures or who are not well-known in the scientific community. Therefore, I am not aware of anyone named Alexander Danvers who has played a role in the replication crisis in psychology. It is possible that this person is not a well-known figure in the field or has not made a significant contribution to addressing the replication crisis.