LBNN

Over-training large language models may make them harder to fine-tune

by Simon Osuji
April 15, 2025
in Artificial Intelligence
Language models with extensive pre-training can exhibit catastrophic overtraining, where the performance of post-trained models degrades as the pre-training stage is extended. Credit: arXiv (2025). DOI: 10.48550/arxiv.2503.19206

A small team of AI researchers from Carnegie Mellon University, Stanford University, Harvard University and Princeton University, all in the U.S., has found that over-training large language models can make them harder to fine-tune. In their paper, posted on the arXiv preprint server, the group compared how different amounts of training affect a single LLM.

Over the past couple of years, as AI researchers have sought to make their products more “intelligent,” many have been driven by the mantra that the more training a model is given, the better it will be in the end. In this new study, the research team found evidence that language model training may reach a point of diminishing, or even negative, returns.

The researchers came to this conclusion while comparing two versions of the LLM OLMo-1B. One was trained on 2.3 trillion tokens, the other on 3 trillion tokens. They then compared the two on several benchmarks, such as ARC and AlpacaEval, and found that the model trained on more tokens actually did worse, by up to 3%.

Surprised by their findings, they ran more tests and found similar results, suggesting that there is some point at which more training starts to make models less “intelligent.” The research team calls this effect “catastrophic overtraining” and suggests it is due to what they describe as “progressive sensitivity.”

They further suggest that the more tokens a model is trained on, the more fragile it becomes, which means that fine-tuning, which can be viewed as adding noise, starts to reverse the gains seen prior to the stress point.
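The trade-off described here can be illustrated with a toy calculation. The sketch below is not the model from the paper: the power-law shapes, the constants and the function names are all assumptions chosen only to show how a steadily improving base model combined with steadily growing fragility can produce a best pre-training budget, beyond which results after fine-tuning get worse.

```python
# Toy illustration only: both functional forms and all constants are assumed,
# not taken from the paper.

def pretrain_loss(tokens, scale=1.0, exponent=0.5):
    """Assumed power-law decay of base-model loss as pre-training tokens grow."""
    return scale / tokens ** exponent

def finetune_penalty(tokens, sensitivity=0.05):
    """Assumed linear growth of the damage done by a fixed fine-tuning update."""
    return sensitivity * tokens

def post_finetune_loss(tokens):
    """Loss after fine-tuning = base loss + fragility penalty (toy model)."""
    return pretrain_loss(tokens) + finetune_penalty(tokens)

# Token budgets in arbitrary units, from 0.5 to 10.0 in steps of 0.5.
budgets = [0.5 * k for k in range(1, 21)]
losses = [post_finetune_loss(t) for t in budgets]

# The budget with the lowest post-fine-tuning loss; past it, more tokens hurt.
best = budgets[losses.index(min(losses))]
print("best budget (arbitrary units):", best)
```

In this toy setting the loss after fine-tuning falls at first and then rises, so both the smallest and the largest budgets end up worse than an intermediate one.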

Schematic to illustrate how the scaling of the optimal learning rate can affect model evaluations as a function of the pre-training tokens T. Credit: arXiv (2025). DOI: 10.48550/arxiv.2503.19206

To test their theory, they added Gaussian noise to some of the models and found that doing so led to the same type of performance degradation they had witnessed earlier. They have named the point of no return the “inflection point.” After that point, they suggest, any further training will reduce the stability of the model, making it more difficult to tune in ways that are useful for a desired set of applications.
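The general idea of a Gaussian perturbation test can be sketched in a few lines. This is a minimal illustration on a toy quadratic loss, not the researchers' experimental setup: the loss function, the `curvature` parameter and the helper names are hypothetical stand-ins, with a sharper bowl playing the role of a more sensitive model.

```python
import random

def quadratic_loss(params, curvature):
    """Toy loss: a bowl centred at zero; larger curvature means a sharper minimum."""
    return 0.5 * curvature * sum(p * p for p in params)

def perturbation_sensitivity(params, loss_fn, sigma, trials=200, seed=0):
    """Average increase in loss after adding N(0, sigma^2) noise to every parameter."""
    rng = random.Random(seed)  # fixed seed so both models see the same noise
    base = loss_fn(params)
    total = 0.0
    for _ in range(trials):
        noisy = [p + rng.gauss(0.0, sigma) for p in params]
        total += loss_fn(noisy) - base
    return total / trials

# Two hypothetical "models" sitting at their minimum, differing only in sharpness.
params = [0.0] * 10
flat = perturbation_sensitivity(params, lambda p: quadratic_loss(p, 1.0), sigma=0.1)
sharp = perturbation_sensitivity(params, lambda p: quadratic_loss(p, 5.0), sigma=0.1)
print("flat:", flat, "sharp:", sharp)
```

With identical noise applied to both, the sharper toy model loses more performance, which is the kind of signal such a test looks for.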

The researchers conclude by suggesting that, moving forward, developers of LLMs may have to estimate how much training is enough, or find other methods that allow for additional training with a more distant inflection point.

More information:
Jacob Mitchell Springer et al, Overtrained Language Models Are Harder to Fine-Tune, arXiv (2025). DOI: 10.48550/arxiv.2503.19206

Journal information:
arXiv

© 2025 Science X Network

Citation:
Over-training large language models may make them harder to fine-tune (2025, April 14)
retrieved 14 April 2025
from https://techxplore.com/news/2025-04-large-language-harder-fine-tune.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.





© 2023 LBNN - All rights reserved.
