NASA-IBM collaboration develops INDUS large language models for advanced science research

by Simon Osuji
June 25, 2024
in Artificial Intelligence


Named for the southern sky constellation, INDUS (stylized in all caps) is a comprehensive suite of large language models supporting five science domains. Credit: NASA

Collaborations with private, non-federal partners through Space Act Agreements are a key component in the work done by NASA’s Interagency Implementation and Advanced Concepts Team (IMPACT). A collaboration with International Business Machines (IBM) has produced INDUS, a comprehensive suite of large language models (LLMs) tailored for the domains of Earth science, biological and physical sciences, heliophysics, planetary sciences, and astrophysics and trained using curated scientific corpora drawn from diverse data sources.

INDUS contains two types of models: encoders and sentence transformers. Encoders convert natural language text into numeric coding that can be processed by the LLM. The INDUS encoders were trained on a corpus of 60 billion tokens encompassing astrophysics, planetary science, Earth science, heliophysics, and biological and physical sciences data. A custom tokenizer developed by the IMPACT-IBM collaborative team improves on generic tokenizers by recognizing scientific terms such as “biomarkers” and “phosphorylated.”
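To illustrate why a domain-specific vocabulary matters, here is a minimal sketch of a greedy longest-match subword tokenizer. It is a toy, not the INDUS tokenizer: the splitting rule, the `##` continuation marker, and both vocabularies are assumptions made for this example.

```python
def tokenize(word, vocab):
    """Greedy longest-match subword split. Toy illustration only."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # mark pieces that continue a word
            if piece in vocab:
                pieces.append(piece)
                break
            end -= 1
        else:
            # No vocabulary entry matches at this position.
            return ["[UNK]"]
        start = end
    return pieces


# Hypothetical vocabularies: the generic one lacks the scientific terms.
GENERIC_VOCAB = {"phos", "##ph", "##ory", "##lated", "bio", "##mark", "##ers"}
DOMAIN_VOCAB = GENERIC_VOCAB | {"phosphorylated", "biomarkers"}

print(tokenize("phosphorylated", GENERIC_VOCAB))
print(tokenize("phosphorylated", DOMAIN_VOCAB))
```

With the generic vocabulary the term is broken into four fragments, while the domain vocabulary keeps it as a single token, so the encoder can learn one representation for the whole scientific term.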

Over half of the 50,000-word vocabulary contained in INDUS is unique to the specific scientific domains used for its training. The INDUS encoder models were used to fine-tune the sentence transformer models on approximately 268 million text pairs, including titles/abstracts and questions/answers.
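The article does not state which training objective was used on these text pairs. A common choice for paired data is a contrastive loss with in-batch negatives, sketched below under that assumption. The function name and the two-dimensional toy embeddings are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def in_batch_contrastive_loss(anchors, positives, temperature=0.05):
    """Mean cross-entropy where anchor i should score highest against positive i.

    Every other positive in the batch acts as a negative for anchor i.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        scores = [dot(anchor, p) / temperature for p in positives]
        top = max(scores)  # subtract the max for numerical stability
        log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_denominator - scores[i]
    return total / len(anchors)


# Toy embeddings standing in for (title, abstract) pairs.
titles = [[1.0, 0.0], [0.0, 1.0]]
matching_abstracts = [[1.0, 0.0], [0.0, 1.0]]
swapped_abstracts = [[0.0, 1.0], [1.0, 0.0]]

print(in_batch_contrastive_loss(titles, matching_abstracts))
print(in_batch_contrastive_loss(titles, swapped_abstracts))
```

The loss is near zero when each title embedding sits closest to its own abstract and large when the pairs are mismatched, which is the signal that pulls related texts together during fine-tuning.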

By providing INDUS with domain-specific vocabulary, the IMPACT-IBM team achieved superior performance over open, non-domain-specific LLMs on a benchmark for biomedical tasks, a scientific question-answering benchmark, and Earth science entity recognition tests. By designing for diverse linguistic tasks and retrieval-augmented generation, INDUS is able to process researcher questions, retrieve relevant documents, and generate answers to the questions. For latency-sensitive applications, the team developed smaller, faster versions of both the encoder and sentence transformer models.
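The retrieval step of retrieval-augmented generation can be sketched as follows. This is a minimal illustration, not the INDUS pipeline: `embed` is a bag-of-words stand-in for a sentence transformer, and the passages and question are invented for the example.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in for a sentence transformer: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    shared = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return shared / norm if norm else 0.0


def retrieve(question, passages, k=1):
    """Return the k passages most similar to the question."""
    query = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(query, embed(p)), reverse=True)
    return ranked[:k]


# Hypothetical corpus and question.
PASSAGES = [
    "solar flares release energy stored in magnetic fields",
    "phosphorylated proteins act as biomarkers in spaceflight studies",
    "sea surface temperature is measured by satellite radiometers",
]
top = retrieve("which proteins act as biomarkers", PASSAGES)
print(top[0])
```

In a full system, the retrieved passages would then be handed to a generative model together with the question so that the answer is grounded in the retrieved text.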

Validation tests demonstrate that INDUS excels in retrieving relevant passages from the science corpora in response to a NASA-curated test set of about 400 questions. IBM researcher Bishwaranjan Bhattacharjee commented on the overall approach: “We achieved superior performance by not only having a custom vocabulary but also a large specialized corpus for training the encoder model and a good training strategy. For the smaller, faster versions, we used neural architecture search to obtain a model architecture and knowledge distillation to train it with supervision of the larger model.”
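Knowledge distillation, as mentioned in the quote, trains a small student model to match the outputs of a larger teacher. The sketch below shows the widely used soft-target formulation (KL divergence between temperature-softened distributions). The article does not specify the exact loss the team used, so the formulation, function names, and logits here are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(
        t * math.log(t / s) for t, s in zip(teacher, student) if t > 0
    )
    return kl * temperature ** 2


# Hypothetical logits over three classes.
teacher_logits = [4.0, 1.0, 0.5]
print(distillation_loss([4.0, 1.0, 0.5], teacher_logits))  # student matches teacher
print(distillation_loss([0.5, 1.0, 4.0], teacher_logits))  # student disagrees
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which gives the smaller model a richer training signal than hard labels alone.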

INDUS was also evaluated using data from NASA’s Biological and Physical Sciences (BPS) Division. Dr. Sylvain Costes, the NASA BPS project manager for Open Science, discussed the benefits of incorporating INDUS: “Integrating INDUS with the Open Science Data Repository (OSDR) Application Programming Interface (API) enabled us to develop and trial a chatbot that offers more intuitive search capabilities for navigating individual datasets. We are currently exploring ways to improve OSDR’s internal curation data system by leveraging INDUS to enhance our curation team’s productivity and reduce the manual effort required daily.”

At the NASA Goddard Earth Sciences Data and Information Services Center (GES-DISC), the INDUS model was fine-tuned using labeled data from domain experts to categorize publications specifically citing GES-DISC data into applied research areas.

More information:
Bishwaranjan Bhattacharjee et al, INDUS: Effective and Efficient Language Models for Scientific Applications, arXiv (2024). DOI: 10.48550/arxiv.2405.10725

Journal information:
arXiv

Citation:
NASA-IBM collaboration develops INDUS large language models for advanced science research (2024, June 25), retrieved 25 June 2024 from https://techxplore.com/news/2024-06-nasa-ibm-collaboration-indus-large.html





