
Teaching AI to explore its surroundings is a bit like teaching a robot to find treasure in a vast maze—it needs to try different paths, but some lead nowhere. In many real-world challenges, like training robots or playing complex games, rewards are few and far between, making it easy for AI to waste time on dead ends.
To address this challenge, researchers at Nanjing University and UC Berkeley devised an interesting way to teach AI: clustered reinforcement learning (CRL). Instead of wandering around aimlessly or only chasing big scores, this method sorts similar situations into “clusters.” It rewards the AI for trying new things and for building on past successes.
The research is published in the journal Frontiers of Computer Science.
“By grouping experiences and balancing curiosity with proven success, we’ve given AI a more human-like way to learn,” says Prof. Wu-Jun Li, the project’s lead researcher.
The two-step magic: Clustering experiences and rewarding wins
So, how does CRL work? Instead of treating every state as unique and unconnected, CRL groups similar states into clusters using a technique called K-means. Each cluster is then analyzed to measure two things: how rarely it has been visited (novelty) and how good the average outcome is (quality).
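To make the clustering step concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`assign`, `kmeans`, `cluster_stats`), the toy two-dimensional states, and the choice of two clusters are all hypothetical. It runs a basic K-means over visited states, then records each cluster's visit count and average reward.

```python
import random

def assign(points, centers):
    """Return, for each point, the index of its nearest center (squared Euclidean distance)."""
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        labels.append(dists.index(min(dists)))
    return labels

def kmeans(points, k, iters=20, seed=0):
    """Basic Lloyd-style K-means; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct visited states
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centers, assign(points, centers)

def cluster_stats(labels, rewards, k):
    """Per-cluster visit count and mean reward."""
    counts = [0] * k
    totals = [0.0] * k
    for label, r in zip(labels, rewards):
        counts[label] += 1
        totals[label] += r
    means = [t / c if c else 0.0 for t, c in zip(totals, counts)]
    return counts, means

# Toy data (hypothetical): four unrewarded states near the origin,
# two rewarded states far away.
states = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0), (5.1, 5.0)]
rewards = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0]

centers, labels = kmeans(states, k=2)
counts, means = cluster_stats(labels, rewards, k=2)
print(counts, means)
```

In this toy run, one cluster is visited more often and has a low average reward, while the other is visited less often and has a high average reward. A real agent would cluster high-dimensional states and pick the number of clusters to suit the task.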
CRL assigns bonus rewards based on these two factors, encouraging the agent to explore areas that are not only new but also likely to yield good results. This contrasts with exploration methods that reward only novelty, which can lead the agent into unproductive areas.
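One way to picture such a bonus is sketched below. The formula is an assumption chosen for illustration and is not taken from the paper: it mixes a count-based novelty term with the cluster's mean reward, and the names `clustered_bonus`, `shaped_reward`, `beta`, and `eta` are hypothetical. Rewards are assumed to be scaled to the range 0 to 1.

```python
import math

def clustered_bonus(count, mean_reward, beta=0.1, eta=0.5):
    """Hypothetical bonus for a state in a cluster (not the paper's formula).

    count:       how many visited states fall in the cluster
    mean_reward: average reward seen in the cluster, assumed in [0, 1]
    beta:        overall bonus scale
    eta:         weight on quality versus novelty, between 0 and 1
    """
    novelty = 1.0 / math.sqrt(count + 1)  # shrinks as the cluster is visited more
    quality = mean_reward                 # grows with the cluster's average outcome
    return beta * ((1.0 - eta) * novelty + eta * quality)

def shaped_reward(env_reward, count, mean_reward):
    """Environment reward plus the cluster-based bonus."""
    return env_reward + clustered_bonus(count, mean_reward)

# Three toy clusters, given as (visit count, mean reward).
clusters = {
    "rare_good": (3, 0.8),
    "rare_bad": (3, 0.0),
    "common_bad": (99, 0.0),
}
bonuses = {name: clustered_bonus(c, q) for name, (c, q) in clusters.items()}
for name, b in bonuses.items():
    print(name, round(b, 4))
```

With these assumed weights, a rarely visited cluster with good outcomes earns the largest bonus, a rarely visited cluster with poor outcomes earns less, and a heavily visited cluster with poor outcomes earns the least. Setting `eta` to 0 would recover a purely novelty-driven bonus of the kind the article contrasts CRL with.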
Results and impact: Fast learning, real-world utility
By blending curiosity with outcome-based guidance, CRL allows AI to learn faster and with fewer mistakes. It achieved top performance across multiple standard benchmarks, including robotic control tasks and difficult Atari games, outperforming several state-of-the-art methods. What’s more, CRL can be easily added to existing AI systems as a modular enhancement.
This makes it especially promising for high-stakes domains like autonomous driving, energy optimization, and intelligent scheduling—where safe, sample-efficient learning is essential.
By combining simple clustering with light reward tweaks, CRL opens the door to safer, faster, and more reliable AI training. As intelligent machines move into our everyday lives—from warehouse robots to city-street navigation—methods like this will help them learn quickly, avoid costly mistakes, and need less human babysitting.
More information:
Xiao Ma et al, Clustered Reinforcement Learning, Frontiers of Computer Science (2024). DOI: 10.1007/s11704-024-3194-1
Provided by
Higher Education Press
Citation:
Clustering-based approach accelerates AI learning in robotics and gaming (2025, May 30)
retrieved 30 May 2025
from https://techxplore.com/news/2025-05-clustering-based-approach-ai-robotics.html