
Using generative AI to diversify virtual training grounds for robots

By Simon Osuji
September 29, 2025 | Artificial Intelligence


Credit: arXiv (2025). DOI: 10.48550/arxiv.2505.04831

Chatbots like ChatGPT and Claude have experienced a meteoric rise in usage over the past three years because they can help you with a wide range of tasks. Whether you’re writing Shakespearean sonnets, debugging code, or hunting for the answer to an obscure trivia question, artificial intelligence (AI) systems seem to have you covered. The source of this versatility? Billions or even trillions of textual data points across the Internet.


That data isn’t enough to teach a robot to be a helpful household or factory assistant, though. To understand how to handle, stack, and place various arrangements of objects across diverse environments, robots need demonstrations. You can think of robot training data as a collection of how-to videos that walk the systems through each motion of a task.

Collecting these demonstrations on real robots is time-consuming and not perfectly repeatable, so engineers have created training data by generating simulations with AI (which often fail to reflect real-world physics) or by tediously handcrafting each digital environment from scratch.

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Toyota Research Institute may have found a way to create the diverse, realistic training grounds robots need. Their “Steerable Scene Generation” approach creates digital scenes of things like kitchens, living rooms, and restaurants that engineers can use to simulate lots of real-world interactions and scenarios.

Trained on more than 44 million 3D rooms filled with models of objects such as tables and plates, the tool places existing assets in new scenes, then refines each one into a physically accurate, lifelike environment. The paper is posted on the arXiv preprint server.

Steerable Scene Generation creates these 3D worlds by “steering” a diffusion model—an AI system that generates a visual from random noise—toward a scene you’d find in everyday life. The researchers used this generative system to “inpaint” an environment, filling in particular elements throughout the scene.
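
To make “inpainting” concrete, here is a toy sketch of masked diffusion sampling: at each reverse-diffusion step, the entries you want to keep are overwritten with their known values, so only the masked region gets generated. The denoise_step below is a placeholder for a trained diffusion model, and the whole thing illustrates the general technique, not the paper’s code; practical samplers (e.g., RePaint-style) also re-noise the kept region to the current noise level, which this sketch skips.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise_step(x, t, T):
    # Placeholder for one reverse-diffusion step of a trained model:
    # here it simply shrinks the sample toward zero as t -> 0.
    return x * (t / T)

def inpaint(known, mask, T=50):
    # known: flat array of scene parameters (e.g., object poses);
    # mask: 1.0 where values are fixed, 0.0 where the model fills in.
    x = rng.normal(size=known.shape)          # start from pure noise
    for t in range(T, 0, -1):
        x = denoise_step(x, t, T)             # one denoising step
        x = mask * known + (1.0 - mask) * x   # re-impose the fixed part
    return x

# Keep the first two parameters fixed; let the "model" fill in the rest.
known = np.array([0.5, -0.2, 0.0, 0.0])
mask = np.array([1.0, 1.0, 0.0, 0.0])
print(inpaint(known, mask))   # first two entries stay 0.5 and -0.2
```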

You can imagine a blank canvas suddenly turning into a kitchen scattered with 3D objects, which are gradually rearranged into a scene that imitates real-world physics. For example, the system ensures that a fork doesn’t pass through a bowl on a table—a common glitch in 3D graphics known as “clipping,” where models overlap or intersect.
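
To see what such a check involves, here is a minimal clip test that approximates each model with an axis-aligned bounding box, an assumption made for brevity; production pipelines typically use exact mesh collision.

```python
import numpy as np

def aabbs_clip(min_a, max_a, min_b, max_b):
    # Two boxes interpenetrate iff their extents overlap on every axis.
    # Strict inequalities mean touching faces do not count as clipping.
    return bool(np.all(min_a < max_b) and np.all(min_b < max_a))

def scene_is_clip_free(boxes):
    # boxes: list of (min_corner, max_corner) pairs, each a 3-vector.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if aabbs_clip(boxes[i][0], boxes[i][1], boxes[j][0], boxes[j][1]):
                return False
    return True

# A fork resting *on* a bowl shares a face with it but does not overlap;
# a fork passing *through* the bowl would fail this check.
fork = (np.array([0.0, 0.0, 0.10]), np.array([0.02, 0.15, 0.12]))
bowl = (np.array([-0.1, -0.1, 0.0]), np.array([0.1, 0.1, 0.10]))
print(scene_is_clip_free([fork, bowl]))   # True: they only touch
```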

How exactly Steerable Scene Generation guides its creation toward realism, however, depends on the strategy you choose. Its main strategy is “Monte Carlo Tree Search” (MCTS), where the model creates a series of alternative scenes, filling them out in different ways toward a particular objective (like making a scene more physically realistic, or including as many edible items as possible). MCTS is the same search the AI program AlphaGo used to beat human opponents in Go (a game similar to chess): the system considers potential sequences of moves before choosing the most advantageous one.
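
Since the article describes MCTS only at a high level, here is a self-contained toy sketch of the idea in Python: scene building framed as sequential decision-making, with a made-up Scene class (a table with a fixed capacity and a small object catalog) standing in for the real scene representation. Everything here is illustrative, not the paper’s code.

```python
import math, random

class Scene:
    # Toy partial scene: a table with fixed capacity; actions add
    # objects of a given footprint. Purely illustrative.
    CATALOG = (0.1, 0.2, 0.3, 0.4)

    def __init__(self, placed=(), capacity=1.0):
        self.placed, self.capacity = tuple(placed), capacity

    def actions(self):
        used = sum(self.placed)                   # objects that still fit
        return [o for o in self.CATALOG if used + o <= self.capacity]

    def apply(self, obj):
        return Scene(self.placed + (obj,), self.capacity)

    def score(self):
        return len(self.placed)   # objective: as many objects as possible

class Node:
    def __init__(self, scene, parent=None):
        self.scene, self.parent = scene, parent
        self.children = {}                        # action -> child Node
        self.visits, self.value = 0, 0.0

def ucb(parent, child, c=1.4):
    # Upper-confidence bound: exploit high-value children while still
    # exploring rarely visited ones.
    return child.value + c * math.sqrt(math.log(parent.visits) / child.visits)

def mcts(root_scene, iters=2000, horizon=20):
    root = Node(root_scene)
    for _ in range(iters):
        node = root
        # 1. Selection: descend while every action has been tried.
        while node.children and len(node.children) == len(node.scene.actions()):
            node = max(node.children.values(), key=lambda ch: ucb(node, ch))
        # 2. Expansion: try one untried action, if any remain.
        untried = [a for a in node.scene.actions() if a not in node.children]
        if untried:
            a = random.choice(untried)
            node.children[a] = Node(node.scene.apply(a), parent=node)
            node = node.children[a]
        # 3. Rollout: finish the scene with random actions.
        scene = node.scene
        for _ in range(horizon):
            acts = scene.actions()
            if not acts:
                break
            scene = scene.apply(random.choice(acts))
        reward = scene.score()
        # 4. Backpropagation: push the rollout score back to the root.
        while node:
            node.visits += 1
            node.value += (reward - node.value) / node.visits
            node = node.parent
    # Return the scene after the most-visited first action.
    return max(root.children.values(), key=lambda ch: ch.visits).scene

print(mcts(Scene()).placed)   # tends to start with a small object, e.g. (0.1,)
```

With the “as many objects as possible” objective, the search learns to favor small objects early, loosely mirroring how the paper’s MCTS packed scenes denser than anything in the training data.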

“We are the first to apply MCTS to scene generation by framing the scene generation task as a sequential decision-making process,” says MIT Department of Electrical Engineering and Computer Science (EECS) Ph.D. student Nicholas Pfaff, who is a CSAIL researcher and a lead author on the paper presenting the work. “We keep building on top of partial scenes to produce better or more desired scenes over time. As a result, MCTS creates scenes that are more complex than what the diffusion model was trained on.”

In one particularly telling experiment, MCTS added the maximum number of objects to a simple restaurant scene. It featured as many as 34 items on a table, including massive stacks of dim sum dishes, after training on scenes with only 17 objects on average.

Steerable Scene Generation also allows you to generate diverse training scenarios via reinforcement learning—essentially, teaching a diffusion model to fulfill an objective by trial and error. After you train on the initial data, your system undergoes a second training stage, where you outline a reward (a desired outcome, with a score indicating how close the scene comes to that goal). The model automatically learns to create scenes with higher scores, often producing scenarios that are quite different from those it was trained on.
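
As a loose illustration of that second stage, the sketch below runs a REINFORCE-style update on a one-parameter stand-in for the generator: the reward favors crowded scenes up to a capacity limit (a stand-in for physical feasibility), and the parameter drifts toward scenes that score well. This is an assumption-laden toy, not the paper’s training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# One parameter, mu, stands in for the whole generator: it sets how
# many objects a sampled scene tries to place.
mu, sigma, lr = 3.0, 1.0, 0.01
CAPACITY = 12

def reward(n_objects):
    # Favor crowded scenes, but zero out anything past capacity
    # (a toy proxy for a physical-feasibility check).
    return float(n_objects) if n_objects <= CAPACITY else 0.0

baseline = 0.0
for _ in range(5000):
    x = rng.normal(mu, sigma)             # sample a scene "size"
    r = reward(max(0, round(x)))
    baseline += 0.01 * (r - baseline)     # running reward baseline
    # REINFORCE: d/d_mu of log N(x; mu, sigma) is (x - mu) / sigma**2.
    mu += lr * (r - baseline) * (x - mu) / sigma**2

print(round(mu, 1))   # mu climbs from 3 toward the capacity limit
```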

Users can also prompt the system directly by typing in specific visual descriptions (like “a kitchen with four apples and a bowl on the table”), and Steerable Scene Generation can bring those requests to life with precision. For example, the tool accurately followed users’ prompts 98% of the time when building scenes of pantry shelves and 86% of the time for messy breakfast tables, at least a 10% improvement over the comparable methods MiDiffusion and DiffuScene, respectively.

The system can also complete specific scenes via prompting or light directions (like “come up with a different scene arrangement using the same objects”). You could ask it to place apples on several plates on a kitchen table, for instance, or put board games and books on a shelf. It’s essentially “filling in the blank” by slotting items in empty spaces, but preserving the rest of a scene.

Credit: Massachusetts Institute of Technology

According to the researchers, the strength of their project lies in its ability to create many scenes that roboticists can actually use. “A key insight from our findings is that it’s okay for the scenes we pretrained on to not exactly resemble the scenes that we actually want,” says Pfaff. “Using our steering methods, we can move beyond that broad distribution and sample from a ‘better’ one. In other words, generating the diverse, realistic, and task-aligned scenes that we actually want to train our robots in.”

Such vast scenes became the testing grounds where they could record a virtual robot interacting with different items. The machine carefully placed forks and knives into a cutlery holder, for instance, and rearranged bread onto plates in various 3D settings. Each simulation appeared fluid and realistic, a preview of the adaptable, real-world robots that Steerable Scene Generation could one day help train.

While the system could be an encouraging path forward in generating lots of diverse training data for robots, the researchers say their work is more of a proof-of-concept. In the future, they’d like to use generative AI to create entirely new objects and scenes, instead of using a fixed library of assets. They also plan to incorporate articulated objects that the robot could open or twist (like cabinets or jars filled with food) to make the scenes even more interactive.

To make their virtual environments even more realistic, Pfaff and his colleagues may incorporate real-world objects by drawing on a library of objects and scenes pulled from images on the Internet, building on their previous work on Scalable Real2Sim. By expanding how diverse and lifelike AI-constructed robot testing grounds can be, the team hopes to build a community of users that’ll create lots of data, which could then be used as a massive dataset to teach dexterous robots different skills.

“Today, creating realistic scenes for simulation can be quite a challenging endeavor; procedural generation can readily produce a large number of scenes, but they likely won’t be representative of the environments the robot would encounter in the real world. Manually creating bespoke scenes is both time-consuming and expensive,” says Jeremy Binagia, an Applied Scientist at Amazon Robotics who wasn’t involved in the paper.

“Steerable Scene Generation offers a better approach: Train a generative model on a large collection of pre-existing scenes and adapt it (using a strategy such as reinforcement learning) to specific downstream applications. Compared to previous works that leverage an off-the-shelf vision-language model or focus just on arranging objects in a 2D grid, this approach guarantees physical feasibility and considers full 3D translation and rotation, enabling the generation of much more interesting scenes.”

“Steerable Scene Generation with Post Training and Inference-Time Search provides a novel and efficient framework for automating scene generation at scale,” says Toyota Research Institute roboticist Rick Cory SM ’08, Ph.D. ’10, who also wasn’t involved in the paper. “Moreover, it can generate ‘never-before-seen’ scenes that are deemed important for downstream tasks. In the future, combining this framework with vast internet data could unlock an important milestone towards efficient training of robots for deployment in the real world.”

More information: Nicholas Pfaff et al, Steerable Scene Generation with Post Training and Inference-Time Search, arXiv (2025). DOI: 10.48550/arxiv.2505.04831

Journal information: arXiv

Provided by: Massachusetts Institute of Technology

Citation: Using generative AI to diversify virtual training grounds for robots (2025, September 29), retrieved 29 September 2025 from https://techxplore.com/news/2025-09-generative-ai-diversify-virtual-grounds.html





