Modern large language models (LLMs) might write beautiful sonnets and elegant code, but they lack even a rudimentary ability to learn from experience.

Researchers at the Massachusetts Institute of Technology (MIT) have now devised a way for LLMs to keep improving by tweaking their own parameters in response to useful new information.

The work is a step toward building artificial intelligence models that learn continually, a long-standing goal of the field and something that will be crucial if machines are ever to mimic human intelligence more faithfully. In the meantime, it could give us chatbots and other AI tools that are better able to incorporate new information, including a user’s interests and preferences.

The MIT scheme, called Self-Adapting Language Models (SEAL), involves having an LLM learn to generate its own synthetic training data and update procedure based on the input it receives.

“The initial idea was to explore if tokens [units of text fed to LLMs and generated by them] could cause a powerful update to a model,” says Jyothish Pari, a PhD student at MIT involved with developing SEAL. Pari says the idea was to see if a model’s output could be used to train it.

Adam Zweiger, an MIT undergraduate researcher involved with building SEAL, adds that although newer models can “reason” their way to better solutions by performing more complex inference, the model itself does not benefit from this reasoning over the long term.

SEAL, by contrast, generates new insights and then folds them into its own weights, or parameters. Given a statement about the challenges faced by the Apollo space program, for instance, the model generated new passages that tried to describe the implications of the statement. The researchers compared this to the way a human student writes and reviews notes in order to aid their learning.

The system then updated the model using this data and tested how well the new model was able to answer a set of questions. Finally, the test results provide a reinforcement learning signal that helps guide the model toward updates that improve its overall abilities and help it carry on learning.
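The loop described above (write notes, update on them, test, reinforce what helped) can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the researchers' code: the "model" is just a set of memorized statements, and the two note-taking strategies, the `lr` step size, every function name, and the placeholder strings are all invented for the example. A real system would update neural network weights instead.

```python
# Toy sketch of a generate -> update -> test -> reinforce loop.
# Everything here is hypothetical: a real system fine-tunes an LLM's weights.

def generate_self_edit(passage, strategy):
    """Produce synthetic training data (the model's own 'notes') for a passage."""
    if strategy == "verbatim":
        return [passage["text"]]          # just copy the passage
    return list(passage["implications"])  # restate what the passage implies

def finetune(facts, self_edit):
    """Return an updated copy of the model; the original set is left untouched."""
    return facts | set(self_edit)

def evaluate(facts, questions):
    """Fraction of questions answered; a question counts if its fact is held."""
    return sum(q in facts for q in questions) / len(questions)

def seal_like_round(facts, weights, passage, lr=0.5):
    """Try each strategy, reward the ones that raise the test score,
    and keep the single best updated model."""
    baseline = evaluate(facts, passage["questions"])
    best_facts, best_reward = facts, 0.0
    for strategy in weights:
        candidate = finetune(facts, generate_self_edit(passage, strategy))
        reward = evaluate(candidate, passage["questions"]) - baseline
        if reward > 0:
            weights[strategy] += lr * reward  # reinforce helpful strategies
        if reward > best_reward:
            best_facts, best_reward = candidate, reward
    return best_facts

# Placeholder data, invented purely for illustration.
passage = {
    "text": "placeholder statement",
    "implications": ["implication one", "implication two"],
    "questions": ["implication one", "implication two"],
}

weights = {"verbatim": 1.0, "implications": 1.0}
facts = seal_like_round(set(), weights, passage)
print(weights)
print(evaluate(facts, passage["questions"]))
```

In this toy setup only the "implications" strategy raises the test score, so only its weight grows. That mirrors the idea in the text that the test results steer the model toward the kinds of self-generated notes that actually help it learn.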

The researchers tested their approach on small and medium-size versions of two open source models, Meta’s Llama and Alibaba’s Qwen. They say that the approach ought to work for much larger frontier models too.

The researchers tested the SEAL approach on text as well as a benchmark called ARC that gauges an AI model’s ability to solve abstract reasoning problems. In both cases they saw that SEAL allowed the models to continue learning well beyond their initial training.

Pulkit Agrawal, a professor at MIT who oversaw the work, says that the SEAL project touches on important themes in AI, including how to get AI to figure out for itself what it should try to learn. He says it could well be used to help make AI models more personalized. “LLMs are powerful but we don’t want their knowledge to stop,” he says.

SEAL is not yet a way for AI to improve indefinitely. For one thing, as Agrawal notes, the LLMs tested suffer from what’s known as “catastrophic forgetting,” a troubling effect seen when ingesting new information causes older knowledge to simply disappear. This may point to a fundamental difference between artificial neural networks and biological ones. Pari and Zweiger also note that SEAL is computationally intensive, and it isn’t yet clear how best to schedule new periods of learning. One fun idea, Zweiger mentions, is that, like humans, perhaps LLMs could experience periods of “sleep” where new information is consolidated.

Still, for all its limitations, SEAL is an exciting new path for further AI research, and it may well find its way into future frontier AI models.

What do you think about AI that is able to keep on learning? Send an email to [email protected] to let me know.


