LBNN

Algorithm based on LLMs doubles lossless data compression rates

by Simon Osuji
May 15, 2025
in Artificial Intelligence


A powerful lossless data compression algorithm based on LLMs
Image comparing the lossless compression rates of LMCompress with traditional state-of-the-art methods and with the large-model-based method proposed independently by a team from DeepMind, Meta and INRIA. The comparison covers four types of data: image, video, audio, and text. LMCompress consistently outperforms the others on all data types. Note that the DeepMind result on video is not available. Credit: Li et al.

People store large quantities of data in their electronic devices and transfer some of this data to others, whether for professional or personal reasons. Data compression methods are thus of the utmost importance, as they can boost the efficiency of devices and communications, making users less reliant on cloud data services and external storage devices.

Researchers at the Central China Institute of Artificial Intelligence, Peng Cheng Laboratory, Dalian University of Technology, the Chinese Academy of Sciences and University of Waterloo recently introduced LMCompress, a new data compression approach based on large language models (LLMs), such as the model underpinning the AI conversational platform ChatGPT.

Their proposed method, outlined in a paper published in Nature Machine Intelligence, was found to be significantly more powerful than classical data compression algorithms.

“In January 2023, when I taught a Kolmogorov complexity course at the University of Waterloo, I reflected on the idea that compression is understanding,” Ming Li, senior author of the paper, told Tech Xplore. “In other words, if you understand something, you can express it succinctly; and if you can express something in very short expression or in a few words, then you must understand it.

“In this paper, we proved that compression implies the best learning/understanding. The opposite was proved in one of our other papers, which was a precursor to this work, while another paper by Google DeepMind independently obtained our initial results.”

Image illustrating the key insight of the team’s paper. The insight that understanding is equivalent to compression bridges a cognitive concept (comprehension) and a technological concept (compression). It sheds light on developing understanding-based technologies, say, semantic communication. Credit: Li et al.

As part of their recent study, Li and his colleagues set out to demonstrate that the better models grasp data, the better they can summarize it and compress it. This idea dates back to 1948, specifically to Claude Shannon’s renowned mathematical theory of communication.

“Shannon essentially proposed that if you understand the data to be communicated, then you can compress it, or in other words, shorten communication time,” explained Li. “For 80 years, this research idea challenge remained open, until AI and large language models came along. Our paper essentially proposes that if a large language model can understand data well, it must be able to guess what we plan to write, which allows us to compress the data significantly better than the best classical lossless data compressors (e.g., bzip for text, JPEG-2000 for images).”

The basic idea behind the researchers’ data compression algorithm is that if an LLM can predict what a user is about to write, that content does not need to be transmitted; the model can simply regenerate it on the other end (i.e., on a receiver’s device). When Li and his colleagues tested their proposed approach, they found that it at least doubled compression rates for different types of data, including texts, images, videos and audio files.
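The link between prediction and compression can be illustrated with a small sketch. This is not LMCompress: it swaps the large model for a toy adaptive byte-level model (an assumption made purely for illustration) and only computes the number of bits an ideal entropy coder, such as an arithmetic coder, would need under that model. The article does not state which coder LMCompress uses, and the function name and parameters below are hypothetical.

```python
import math
from collections import defaultdict


def ideal_code_length_bits(data: bytes, order: int = 2) -> float:
    """Bits an ideal entropy coder would spend on `data` when each byte is
    predicted from the previous `order` bytes by an adaptive count model.

    Toy stand-in for a large model: the better the prediction p of the next
    symbol, the fewer bits (-log2 p) that symbol costs.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> byte -> count
    totals = defaultdict(int)                       # context -> bytes seen
    bits = 0.0
    for i, symbol in enumerate(data):
        context = data[max(0, i - order):i]
        # Laplace smoothing over the 256 possible byte values, so that
        # no symbol ever gets probability zero.
        p = (counts[context][symbol] + 1) / (totals[context] + 256)
        bits += -math.log2(p)
        # The model is updated only after the symbol is coded, so a decoder
        # holding the same model can stay in sync (lossless decoding).
        counts[context][symbol] += 1
        totals[context] += 1
    return bits


if __name__ == "__main__":
    sample = ("the quick brown fox jumps over the lazy dog. " * 50).encode()
    print("raw bits:  ", 8 * len(sample))
    print("model bits:", round(ideal_code_length_bits(sample)))
```

Replacing the count model with a stronger predictor lowers the probability-weighted cost per symbol, which is the sense in which a model that "understands" the data compresses it better.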

“This is amazing in the sense that after 80 years of research, if you just improve a lossless compression algorithm by even 1%, this is already remarkable, and we were able to double compression rates,” said Li. “LMCompress is a compression algorithm using large models (large language model for texts, large image model for images, etc.). It compresses texts more than two times better than classical algorithms, images and audios two times better, and video slightly less than two times better. Therefore, when you transmit data, you can go approximately two times faster.”
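The "two times faster" figure follows from simple arithmetic if the link speed is fixed and only the number of bytes sent changes. The sketch below is a back-of-the-envelope illustration with hypothetical names and numbers; it ignores the time spent running the model to compress and decompress, which the article does not quantify.

```python
def transfer_seconds(size_bytes: float, compression_ratio: float,
                     bandwidth_bytes_per_s: float) -> float:
    """Time to send a file after compressing it by `compression_ratio`
    (original size divided by compressed size) over a fixed-speed link.
    Compression and decompression time are deliberately left out."""
    if compression_ratio <= 0 or bandwidth_bytes_per_s <= 0:
        raise ValueError("ratio and bandwidth must be positive")
    return size_bytes / compression_ratio / bandwidth_bytes_per_s


if __name__ == "__main__":
    size = 100_000_000   # hypothetical 100 MB file
    link = 10_000_000    # hypothetical 10 MB/s link
    baseline = transfer_seconds(size, 2.0, link)  # assumed classical ratio
    doubled = transfer_seconds(size, 4.0, link)   # same ratio, doubled
    print(baseline, doubled)
```

Doubling the compression ratio halves the bytes on the wire, so transfer time halves as long as the link, not the compressor, is the bottleneck.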

This recent paper by Li and his colleagues could inform future efforts aimed at developing increasingly advanced data compression techniques, inspiring other researchers to leverage LLMs. Moreover, the team’s LMCompress algorithm could soon be improved further and deployed in real-world settings.

“We demonstrated that understanding equals compression, and we think this is of crucial importance,” added Li. “We also paved the way for a new era of compressing data using LLMs. We think in the future, when these large models are on our cell phones and everywhere, our method of compressing data will replace the classical ones (e.g., .zip files). In our next studies, we also plan to use our methodology to compare large models and detect plagiarism.”

More information:
Ziguang Li et al, Lossless data compression by large models, Nature Machine Intelligence (2025). DOI: 10.1038/s42256-025-01033-7

© 2025 Science X Network

Citation:
Algorithm based on LLMs doubles lossless data compression rates (2025, May 14)
retrieved 15 May 2025
from https://techxplore.com/news/2025-05-algorithm-based-llms-lossless-compression.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.




