
A visual-linguistic framework that enables open-vocabulary object grasping in robots

by Simon Osuji
August 2, 2024
in Artificial Intelligence


Diagram explaining what open-vocabulary grasping entails. Credit: Meng et al, arXiv (2024). DOI: 10.48550/arxiv.2407.13175

To be deployed across a broad range of dynamic real-world settings, robots should be able to complete a variety of manual tasks, ranging from household chores to complex manufacturing or agricultural processes. These tasks entail grasping, manipulating and placing objects of many different types, which can vary in shape, weight, texture and other properties.


Most existing approaches for robotic object grasping and manipulation, however, only allow robots to interact successfully with objects that match, or closely resemble, those they encountered during training. When a robot encounters a previously unseen type of object, it is therefore often unable to grasp it.

A team of researchers at Beihang University and the University of Liverpool recently set out to develop a new approach to overcome this key limitation of robotic grasping systems. Their paper, posted to the arXiv preprint server, introduces OVGNet, a unified visual-linguistic framework for open-vocabulary learning, which allows robots to grasp objects from both known and novel categories.

“Recognizing and grasping novel-category objects remains a crucial yet challenging problem in real-world robotic applications,” Meng Li, Qi Zhao and their colleagues wrote in their paper. “Despite its significance, limited research has been conducted in this specific domain.

“To address this, we seamlessly propose a novel framework that integrates open-vocabulary learning into the domain of robotic grasping, empowering robots with the capability to adeptly handle novel objects.”

The researchers’ framework relies on a new benchmark dataset they compiled, called OVGrasping. This dataset contains 63,385 examples of grasping scenarios with objects belonging to 117 different categories, which are divided into base (i.e., known) and novel (i.e., unseen) categories.
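
The base/novel split is the core of this open-vocabulary evaluation protocol. The sketch below illustrates, in plain Python, how such a split might be organized; the category names, the 80/37 split ratio and the per-split scenario counts are invented for illustration — only the 117-category total and the base-versus-novel partitioning idea come from the article.

```python
import random

# Hypothetical category names; only the 117-category total is from the article.
categories = [f"category_{i}" for i in range(117)]

random.seed(0)
random.shuffle(categories)
base = set(categories[:80])    # seen during training (split ratio assumed)
novel = set(categories[80:])   # held out entirely for evaluation

# Invented grasping scenarios, each tagged with an object category.
scenarios = [{"object_category": random.choice(categories)} for _ in range(1000)]

# Training data may only contain base categories; novel ones are evaluation-only.
train = [s for s in scenarios if s["object_category"] in base]
eval_novel = [s for s in scenarios if s["object_category"] in novel]
```

The key invariant is that novel categories never appear during training — otherwise the evaluation would not measure generalization to unseen objects.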

“First, we present a largescale benchmark dataset specifically tailored for evaluating the performance of open-vocabulary grasping tasks,” Li, Zhao and their colleagues wrote. “Second, we propose a unified visual-linguistic framework that serves as a guide for robots in successfully grasping both base and novel objects. Third, we introduce two alignment modules designed to enhance visual-linguistic perception in the robotic grasping process.”

OVGNet, the new framework introduced by the team, is based on a visual-linguistic perception system trained to recognize objects and devise effective grasping strategies using both visual and linguistic cues. The framework includes an image-guided language attention module (IGLA) and a language-guided image attention module (LGIA).

These two modules collectively analyze the overall features of detected objects, enhancing a robot’s ability to generalize its grasping strategies across both known and novel object categories.
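
The article does not detail the internals of IGLA and LGIA, but both are attention modules that fuse visual and linguistic features. The pure-Python sketch below shows the generic cross-attention operation such modules are typically built on; the feature vectors and dimensions are invented, and this is not the authors' implementation.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attention(queries, keys, values):
    """Each query vector attends over all key/value vectors.

    Returns one attended vector per query: a weighted sum of `values`,
    with weights given by a softmax over scaled dot-product scores.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        attended = [sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))]
        out.append(attended)
    return out

# Hypothetical features: language tokens (queries) attend over image regions
# (keys/values); swapping the roles gives the other attention direction.
image_feats = [[1.0, 0.0], [0.0, 1.0]]   # two invented region features
lang_feats = [[0.9, 0.1]]                # one invented token feature
fused = cross_attention(lang_feats, image_feats, image_feats)
```

Because the query is closer to the first image feature, the fused vector is pulled toward it — which is exactly how attention lets a language description pick out the matching image region.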

The researchers evaluated their framework in a series of tests run in a simulated grasping environment built on the PyBullet physics engine, using a simulated UR5 robotic arm fitted with a ROBOTIQ-85 gripper. The framework achieved promising results, outperforming baseline approaches for robotic grasping in tasks involving novel object categories.

“Notably, our framework achieves an average accuracy of 71.2% and 64.4% on base and novel categories in our new dataset, respectively,” Li, Zhao and their colleagues wrote.
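
These per-split figures are averages of grasp successes over each category group. A minimal sketch of that reporting scheme follows; the trial records are invented, and only the base-versus-novel aggregation idea comes from the article.

```python
# Invented grasp-trial records: each logs which split the target object's
# category belongs to and whether the grasp succeeded.
trials = [
    {"split": "base", "success": True},
    {"split": "base", "success": True},
    {"split": "base", "success": False},
    {"split": "novel", "success": True},
    {"split": "novel", "success": False},
]

def accuracy(split):
    """Fraction of successful grasps among trials in the given split."""
    hits = [t["success"] for t in trials if t["split"] == split]
    return sum(hits) / len(hits)

base_acc = accuracy("base")    # 2 of 3 succeed
novel_acc = accuracy("novel")  # 1 of 2 succeed
```

The gap between the two numbers is what quantifies how much performance degrades on unseen categories.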

The OVGrasping dataset compiled by the researchers and the code for their OVGNet framework are open-source and can be accessed by other developers on GitHub. In the future, their dataset could be used to train other algorithms, while their framework could be tested in additional experiments and deployed on other robotic systems.

More information:
Meng Li et al, OVGNet: A Unified Visual-Linguistic Framework for Open-Vocabulary Robotic Grasping, arXiv (2024). DOI: 10.48550/arxiv.2407.13175

Journal information:
arXiv

© 2024 Science X Network

Citation:
A visual-linguistic framework that enables open-vocabulary object grasping in robots (2024, August 1)
retrieved 2 August 2024
from https://techxplore.com/news/2024-07-visual-linguistic-framework-enables-vocabulary.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.





