Friday, May 16, 2025
LBNN

Reinforcement learning boosts reasoning skills in new diffusion-based language model d1

by Simon Osuji
May 1, 2025
in Artificial Intelligence


d1 uses reinforcement learning to enhance the reasoning capabilities of dLLMs
Log Probability Estimation in diffu-GRPO. Credit: arXiv (2025). DOI: 10.48550/arxiv.2504.12216

A team of AI researchers at the University of California, Los Angeles, working with a colleague from Meta AI, has introduced d1, a framework that uses reinforcement learning to improve the reasoning of diffusion-based large language models. The group posted a paper describing their work and the features of the new framework on the arXiv preprint server.


Over the past couple of years, the use of LLMs has skyrocketed, with millions of people around the world using AI apps for a wide variety of tasks. This has created a need for large amounts of electricity to power the data centers that run these compute-intensive applications. Researchers have therefore been looking for less demanding ways to provide AI services. One such approach involves using dLLMs as either a replacement for or a complement to conventional LLMs.

Diffusion-based LLMs (dLLMs) are AI models that arrive at answers differently than conventional LLMs. Instead of taking the autoregressive approach, in which text is generated one token at a time, they use diffusion to find answers. Diffusion models were originally used to generate images: noise was added to an image until it was overwhelmed, and the model was trained to reverse the process until nothing was left but the original image.

Applying this approach to text involved converting letters or words to tokens, which serve as an analog for pixels, and using masks as an analog for noise. Tokens are gradually replaced by masks until nothing is left but mask tokens, and the model is then trained to reverse the process until nothing is left but text tokens. The advantage of this approach is that it can require far less computing power than conventional LLMs.
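The masking half of that process can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the code used for d1 or LLaDA: the `[MASK]` symbol, the function name `forward_process` and the linear masking schedule are all hypothetical choices made for the example, and the learned reverse (unmasking) model is left out entirely.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for this illustration


def forward_process(tokens, steps, seed=0):
    """Gradually replace tokens with MASK over `steps` steps.

    Returns the list of intermediate sequences, from the clean text
    (index 0) to the fully masked sequence (last index).
    """
    rng = random.Random(seed)
    current = list(tokens)
    trajectory = [current]
    for step in range(1, steps + 1):
        # Chance that a still-visible token is masked at this step.
        # It rises to 1.0 on the last step, so the final sequence is
        # guaranteed to contain nothing but masks.
        p = 1.0 / (steps - step + 1)
        current = [
            MASK if tok == MASK or rng.random() < p else tok
            for tok in current
        ]
        trajectory.append(current)
    return trajectory


if __name__ == "__main__":
    sentence = "two plus two equals four".split()
    for t, seq in enumerate(forward_process(sentence, steps=4)):
        print(t, " ".join(seq))
```

A dLLM is trained to run this sequence in the opposite direction, predicting the original tokens behind the masks.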

d1 uses reinforcement learning to enhance the reasoning capabilities of dLLMs
Across four math and logical reasoning tasks, d1-LLaDA, which undergoes SFT followed by our proposed diffu-GRPO, consistently outperforms the base LLaDA-8B-Instruct model. Credit: arXiv (2025). DOI: 10.48550/arxiv.2504.12216

The wider use of dLLMs has been held back by their weaker reasoning abilities. That is where the team in California comes in. They have been working to apply reinforcement learning (where models learn through the use of rewards) to a dLLM as a way to improve its reasoning ability.

To build d1, the team used a two-step process. The first step involved supervised fine-tuning (SFT) of the model on high-quality data. The second applies reinforcement learning through an algorithm called diffu-GRPO, a policy-gradient method that relies on an efficient estimate of the model's log-probabilities, along with what the team calls “random prompt masking.”
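Two ideas behind the second step can be illustrated with a short Python sketch. It is not the team's implementation. It assumes, as the "GRPO" in the name suggests, that each sampled answer is scored relative to the other answers generated for the same prompt, and it treats random prompt masking as independent per-token masking. The function names, the `[MASK]` symbol, the masking probability and the normalization details are hypothetical choices for the example.

```python
import random
import statistics

MASK = "[MASK]"  # hypothetical mask symbol, chosen for this illustration


def mask_prompt(prompt_tokens, mask_prob, rng):
    """Randomly hide prompt tokens (assumed form of random prompt masking)."""
    return [MASK if rng.random() < mask_prob else tok for tok in prompt_tokens]


def group_relative_advantages(rewards, eps=1e-8):
    """Score each answer relative to the other answers for the same prompt.

    Answers that earn more reward than the group average get a positive
    advantage; answers that earn less get a negative one.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    rng = random.Random(0)
    prompt = "What is 12 times 7 ?".split()
    print(mask_prompt(prompt, mask_prob=0.3, rng=rng))

    # One reward per sampled answer, e.g. 1.0 if the final answer is correct.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

In a full training loop, these advantages would weight the log-probability estimates of each sampled answer when the model is updated. That step is omitted here because it depends on the model itself.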

Testing of d1 has thus far shown that the approach works: across four math and logical reasoning benchmarks, models trained with the framework outscored the base model they were built on. The research team suggests their framework is ready for testing by other entities who may choose to adapt their AI models to incorporate the changes they are suggesting.

More information:
Siyan Zhao et al, d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning, arXiv (2025). DOI: 10.48550/arxiv.2504.12216

Journal information:
arXiv

© 2025 Science X Network

Citation:
Reinforcement learning boosts reasoning skills in new diffusion-based language model d1 (2025, April 30)
retrieved 30 April 2025
from https://techxplore.com/news/2025-04-boosts-skills-diffusion-based-language.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.




