Here’s why turning to AI to train future AIs may be a bad idea
Training large language models on their own data risks model collapse
By Payal Dhar
ChatGPT, Gemini, Copilot and other AI tools whip up impressive sentences and paragraphs from as little as a one-line text prompt. To generate those words, the underlying large language models were trained on reams of text written by humans and scraped from the internet. But now, as generative AI tools flood the internet with synthetic content, that content is being used to train future generations of those AIs. If this continues unchecked, it could be disastrous, researchers say.
Training large language models on their own data could lead to model collapse, University of Oxford computer scientist Ilia Shumailov and colleagues argued recently in Nature.
Model collapse sounds startling, but it doesn’t mean generative AIs would simply quit working. Instead, the tools’ responses would drift further and further from their original training data. Though sometimes biased, that original data is a decent representation of reality. But as the tools train on their own generated data, the small errors they make add up, and their content ultimately loses the nuance of diverse perspectives and morphs into gibberish.
That’s what Shumailov and colleagues found. The team took a pretrained language model, called OPT-125m, and fed it a bunch of Wikipedia articles to fine-tune its responses. The team then gave this tool a text prompt and asked it to predict what comes next. Its response was fed back into the model for further fine-tuning. When each successive generation was trained with data generated by the previous one, the researchers found that by the ninth generation, the model was spewing nonsense. What had started out as a prompt about 14th century architecture ended up as a list of types of jackrabbits. In another set of experiments, when the team retained some of the original data, model degradation was minor.
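For a concrete picture of that setup, here is a minimal sketch of the recursive loop in Python, using the Hugging Face transformers library and the publicly available facebook/opt-125m checkpoint. The training details here — the prompt, sample sizes and hyperparameters — are simplified stand-ins, not the study’s actual protocol.

```python
# A minimal sketch of the recursive fine-tuning experiment (simplified;
# not the paper's exact training protocol or hyperparameters).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)


def fine_tune(model, texts, lr=5e-5):
    """Toy fine-tuning pass over a list of strings (illustrative only)."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for text in texts:
        batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    model.eval()


def generate(model, prompt, max_new_tokens=200):
    """Sample a continuation of the prompt from the current model."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=True)
    return tokenizer.decode(output[0], skip_special_tokens=True)


# Generation 0 trains on human-written text (a stand-in for Wikipedia articles);
# every later generation trains only on the previous generation's output.
training_data = ["Perpendicular Gothic dominated English church building in the 14th century ..."]
prompt = "In 14th century architecture,"

for gen in range(10):
    fine_tune(model, training_data)
    training_data = [generate(model, prompt) for _ in range(len(training_data))]
    print(f"generation {gen}: {training_data[0][:80]}")
```

The key line is the last assignment inside the loop: each new “generation” sees only machine-made text, which is what lets small errors compound.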
The study demonstrates that, left unchecked, training AI on its own responses would have serious ramifications, including exacerbating bias and morphing text into nonsense. Big AI companies do have ways of preventing this type of collapse, but as more people begin to use language models to train their own chatbots and other AIs, there could be consequences.
How could generative AI models collapse?
Language models and generative AI have been around for decades, mostly in computer science labs. But the dominance of chatbots is more recent, starting in November 2022 when ChatGPT was released for public use. A combination of better hardware that can process information in parallel, the advent of the transformer, a type of neural network, and the availability of trillions of high-quality, human-created datapoints has been key to this dominance.
“What model collapse is suggesting is that perhaps the quality of data [both going in and coming out] is going to be decreasing,” Shumailov says.
To understand why, imagine explaining to a computer program what a cat is, Shumailov says. “We don’t really know how [to do that] … so we give [the LLM] a number of examples [text descriptions] of what a cat is and then we ask the model to learn to define this creature.” The LLM does so without supervision or explicit instruction, by extrapolating from the given set of observations.
But such extrapolation comes with subtle errors. Shumailov likens it to a game of telephone, in which a phrase is whispered from one person to another until it reaches the last person, who then says it out loud. The original phrase often ends up badly mangled because of errors introduced along the way. This makes LLMs hallucinate, generating plausible content that isn’t quite right (SN: 2/1/24).
If such erroneous content is used to train a later version of the model or another model entirely, that content is going to start influencing those models’ learning processes, and eventually “break” them in some way.
What would AI model collapse look like in real life?
Model collapse essentially refers to a shift away from the original text used to train the models, says Leqi Liu, an AI researcher at the University of Texas at Austin. One of the reasons for this is the disappearance of the data distribution’s tails, the text that represents low-probability events. Returning to the cats, for example, the model might become very good at describing furry cats but fail to keep information about hairless ones.
Another example, Liu says, is that people from minority groups may express things differently, and that kind of text will show up less and less, further sidelining data pertaining to marginalized people. That’s the change we’re likely to see as end users. The downstream effect will be AI-generated content not only amplifying bias, as studies show, but also starting to sound the same. “Naturally, we probably want diverse expressions of ourselves, but if we’re using the same writing assistant, that could reduce that diversity,” Liu says.
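A toy numerical illustration of that tail loss (a simplified example of my own, not taken from the study or from Liu): treat the model as nothing more than an estimate of category frequencies, then repeatedly sample from it and refit it on those samples. The rare category, the hairless cats, tends to shrink and can vanish for good.

```python
# Toy illustration of distribution tails vanishing when a model is
# repeatedly refit on its own samples (illustrative numbers only).
import numpy as np

rng = np.random.default_rng(seed=1)
categories = ["furry cat", "tabby cat", "hairless cat"]  # "hairless cat" is the rare tail
probs = np.array([0.60, 0.39, 0.01])

for gen in range(20):
    sample = rng.choice(len(categories), size=200, p=probs)  # generate synthetic data
    counts = np.bincount(sample, minlength=len(categories))  # refit on that data
    probs = counts / counts.sum()
    print(f"gen {gen:2d}: hairless-cat share = {probs[2]:.3f}")
    # Once the share hits zero, no later generation can bring it back.
```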
To prevent AIs from increasing bias or breaking down and spouting gibberish, it is important to keep track of all data and to make sure that both prior knowledge (including human-generated text) and new knowledge (AI-generated text) are used for training, Liu says. Basically, the idea is not to train new models with only AI-generated data. “Another approach could be that we explicitly make sure to capture the tail of the distribution.” Those hairless cats, for example.
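As a sketch of that first mitigation (my own illustration, with made-up fractions, not a method described in the study), each generation’s training set could be assembled so that human-written source text is never dropped entirely:

```python
# Sketch of one mitigation: keep a fixed share of human-written text in
# every generation's training mix (the 30 percent figure is illustrative).
import random

def build_training_mix(human_texts, synthetic_texts, human_fraction=0.3, size=1000):
    """Assemble a training set that always retains human-written data."""
    n_human = int(size * human_fraction)
    mix = random.choices(human_texts, k=n_human) + \
          random.choices(synthetic_texts, k=size - n_human)
    random.shuffle(mix)
    return mix
```

In the recursive loop sketched earlier, the step that overwrites the training data with purely synthetic text would instead call something like build_training_mix, so the human-written originals stay in the mix.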
Because companies marketing AI tools check heavily for data drift, any problems would be noticed early and could be fixed, so model collapse is unlikely to affect downstream users, Shumailov says. But individuals trying to build models on a smaller scale would certainly be affected and need to be aware of the risk.