Thursday, June 5, 2025
LBNN
  • Business
  • Markets
  • Politics
  • Crypto
  • Finance
  • Energy
  • Technology
  • Taxes
  • Creator Economy
  • Wealth Management
  • Documentaries
No Result
View All Result
LBNN

AI system can envision an entire world from a single picture

Simon Osuji by Simon Osuji
December 20, 2024
in Artificial Intelligence
0
AI system can envision an entire world from a single picture
0
SHARES
2
VIEWS
Share on FacebookShare on Twitter


AI system can envision an entire world from a single picture
Three panorama representations that can be transformed into one another. Credit: arXiv (2024). DOI: 10.48550/arxiv.2412.09624

Johns Hopkins computer scientists have created an artificial intelligence system capable of “imagining” its surroundings without having to physically explore them, bringing AI closer to humanlike reasoning.

Related posts

ICE Quietly Scales Back Rules for Courthouse Raids

ICE Quietly Scales Back Rules for Courthouse Raids

June 4, 2025
AI churns out funnier memes, but people still deliver the biggest laughs

AI churns out funnier memes, but people still deliver the biggest laughs

June 4, 2025

The new system—called Generative World Explorer, or GenEx—needs only a single still image to conjure an entire world, giving it a significant advantage over previous systems that required a robot or agent to physically move through a scene to map the surrounding environment, which can be costly, unsafe, and time-consuming. The team’s results are posted to the arXiv preprint server.

“Say you’re in an area you’ve never been before—as a human, you use environmental cues, past experiences, and your knowledge of the world to imagine what might be around the corner,” says senior author Alan Yuille, the Bloomberg Distinguished Professor of Computational Cognitive Science at Johns Hopkins.

“GenEx ‘imagines’ and reasons about its environment the way humans do, making educated decisions about what steps it should take next without having to physically check its environment first.”

GenEx uses sophisticated world knowledge to generate multiple possibilities of what might exist beyond the visible image, assigning different probabilities to each scenario rather than making a single definitive guess. This ability to mentally map surroundings from limited visual data is crucial for many real-world applications, including in scenarios such as disaster response. For instance, rescue teams could use a single surveillance image to help explore hazardous sites from afar without risk to humans or valuable equipment.

“This technology can also improve navigation apps, assist in training autonomous robots, and power immersive gaming and VR experiences,” says lead author Jieneng Chen, a Ph.D. student in computer science.






Credit: JHU Center for Language and Speech Processing

From a single image, GenEx generates a realistic, synthetic virtual world where AI agents can navigate and make decisions through reasoning and planning. The agent needs only a view of its current scene, a direction of movement, and the distance to traverse. As demonstrated in the animation below, the agent can move forward, change direction, and explore its environment with unlimited flexibility.

And unlike the dreamlike AI world exploration apps now gaining popularity—such as Oasis, an AI-generated Minecraft simulator—GenEx’s environments are consistent. This is because the model was trained on large-scale data with a technique called “spherical consistency learning,” which ensures that its predictions of new environments fit within a panoramic sphere.

“We measure this by having GenEx navigate a randomly sampled closed path, returning to the origin in a fixed loop,” Chen says. “Our goal was to make the start and end views identical, thus ensuring consistency in GenEx’s world modeling.”

While this consistency isn’t unique to GenEx, the research team says it is the first and only generative world explorer to empower AI agents to make logical decisions based on new observations about the world they’re exploring in a process the computer scientists call “imagination-augmented policy.”

For example, say you are driving and the light ahead is green, but you notice that the taxi in front of you has come to an abrupt, unexpected stop. Getting out of your car to investigate would be unsafe, but by imagining the scene from the taxi driver’s perspective, you can come up with a possible reason for their sudden stop: maybe an emergency vehicle is approaching—and you should make way, too.

“While humans can use other cues like sirens to identify this kind of situation, current AI models developed for autonomous driving and other similar tasks only have access to image and language inputs, making imaginative exploration necessary in the absence of other multimodal information,” Chen says.







Rendering of an AI model making an observation-based decision. Credit: Whiting School of Engineering

The Hopkins team evaluated the consistency and quality of GenEx’s output against standard video generation benchmarks. The researchers also conducted experiments with human users to determine if and how GenEx could augment their logic and planning abilities and found that users made more accurate and informed decisions when they had access to the model’s exploration capabilities.

“Our experimental results demonstrate that GenEx can generate high-quality, consistent observations during an extended exploration of a large virtual physical world,” Chen says. “Additionally, beliefs updated with the generated observations can inform an existing decision-making model, such as a large language model agent, and even human users to make better plans.”

Joined by Tianmin Shu and Daniel Khashabi—both assistant professors of computer science—and undergraduate student TaiMing Lu, Yuille and Chen will incorporate real-world sensor data and dynamic scenes for more realistic, immersive planning scenarios.

Bloomberg Distinguished Professor of Computer Vision and Artificial Intelligence Rama Chellappa and Cheng Peng, an assistant research professor in the Mathematical Institute for Data Science, will help curate the real-world sensor data.

The cross-disciplinary project, which involves computer vision, natural language processing, and cognitive science, marks a significant achievement toward achieving humanlike intelligence in embodied AI, Yuille says.

More information:
Taiming Lu et al, GenEx: Generating an Explorable World, arXiv (2024). DOI: 10.48550/arxiv.2412.09624

Journal information:
arXiv

Provided by
Johns Hopkins University

Citation:
AI system can envision an entire world from a single picture (2024, December 19)
retrieved 19 December 2024
from https://techxplore.com/news/2024-12-ai-envision-entire-world-picture.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.





Source link

Previous Post

Jockey Club Multiple Pathways Initiative

Next Post

Tokyo’s Best Video Game Arcades in Akihabara: Where to Go, What to Do

Next Post
Tokyo’s Best Video Game Arcades in Akihabara: Where to Go, What to Do

Tokyo’s Best Video Game Arcades in Akihabara: Where to Go, What to Do

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

RECOMMENDED NEWS

Banks must take the lead in supporting women’s success in tech

Banks must take the lead in supporting women’s success in tech

2 months ago
US National Security Experts Warn AI Giants Aren’t Doing Enough to Protect Their Secrets

US National Security Experts Warn AI Giants Aren’t Doing Enough to Protect Their Secrets

12 months ago
Czechs’ ‘Gift for Putin’ Turns Into $2.9M Black Hawk Donation for Ukraine

Czechs’ ‘Gift for Putin’ Turns Into $2.9M Black Hawk Donation for Ukraine

3 months ago
Telco leaders join forces to discuss next steps towards highly autonomous networks

Telco leaders join forces to discuss next steps towards highly autonomous networks

2 years ago

POPULAR NEWS

  • Ghana to build three oil refineries, five petrochemical plants in energy sector overhaul

    Ghana to build three oil refineries, five petrochemical plants in energy sector overhaul

    0 shares
    Share 0 Tweet 0
  • When Will SHIB Reach $1? Here’s What ChatGPT Says

    0 shares
    Share 0 Tweet 0
  • Matthew Slater, son of Jackson State great, happy to see HBCUs back at the forefront

    0 shares
    Share 0 Tweet 0
  • Dolly Varden Focuses on Adding Ounces the Remainder of 2023

    0 shares
    Share 0 Tweet 0
  • US Dollar Might Fall To 96-97 Range in March 2024

    0 shares
    Share 0 Tweet 0
  • Privacy Policy
  • Contact

© 2023 LBNN - All rights reserved.

No Result
View All Result
  • Home
  • Business
  • Politics
  • Markets
  • Crypto
  • Economics
    • Manufacturing
    • Real Estate
    • Infrastructure
  • Finance
  • Energy
  • Creator Economy
  • Wealth Management
  • Taxes
  • Telecoms
  • Military & Defense
  • Careers
  • Technology
  • Artificial Intelligence
  • Investigative journalism
  • Art & Culture
  • Documentaries
  • Quizzes
    • Enneagram quiz
  • Newsletters
    • LBNN Newsletter
    • Divergent Capitalist

© 2023 LBNN - All rights reserved.