Monday, May 19, 2025
LBNN

Microsoft unveils Phi-3 family of compact language models

By Simon Osuji
April 27, 2024
in Artificial Intelligence


Microsoft has announced the Phi-3 family of open small language models (SLMs), touting them as the most capable and cost-effective of their size available. The innovative training approach developed by Microsoft researchers has allowed the Phi-3 models to outperform larger models on language, coding, and math benchmarks.

“What we’re going to start to see is not a shift from large to small, but a shift from a singular category of models to a portfolio of models where customers get the ability to make a decision on what is the best model for their scenario,” said Sonali Yadav, Principal Product Manager for Generative AI at Microsoft.

The first Phi-3 model, Phi-3-mini at 3.8 billion parameters, is now publicly available in the Azure AI Model Catalog, on Hugging Face and Ollama, and as an NVIDIA NIM microservice. Despite its compact size, Phi-3-mini outperforms models twice its size. Additional Phi-3 models, such as Phi-3-small (7B parameters) and Phi-3-medium (14B parameters), will follow soon.

“Some customers may only need small models, some will need big models and many are going to want to combine both in a variety of ways,” said Luis Vargas, Microsoft VP of AI.

The key advantage of SLMs is that their smaller size enables on-device deployment, delivering low-latency AI experiences without network connectivity. Potential use cases include smart sensors, cameras, farming equipment, and more. Keeping data on the device also benefits privacy.
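
For a rough sense of why parameter count matters for on-device deployment, the sketch below estimates the memory needed just to hold a model's weights at different numeric precisions. The parameter counts come from the article; the bytes-per-parameter values are the standard widths of each format. The estimate ignores activations, caches, and runtime overhead, so treat it as a lower bound under those assumptions rather than a measured figure.

```python
# Back-of-the-envelope weight memory estimate.
# Assumption: weights only; activations, caches, and runtime overhead are ignored.

BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}

# Parameter counts as reported in the article.
PHI3_MODELS = {
    "Phi-3-mini": 3.8e9,
    "Phi-3-small": 7e9,
    "Phi-3-medium": 14e9,
}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for name, params in PHI3_MODELS.items():
        row = ", ".join(
            f"{precision}: {weight_memory_gb(params, precision):.1f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{name}: {row}")
```

Under these assumptions, Phi-3-mini's weights occupy about 7.6 GB at 16-bit precision and about 1.9 GB at 4-bit precision, which is the kind of footprint that makes phone or edge-device deployment plausible.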


Large language models (LLMs) excel at complex reasoning over vast datasets, a strength suited to applications such as drug discovery, where a model must understand interactions across the scientific literature. However, SLMs offer a compelling alternative for simpler tasks such as query answering, summarisation, and content generation.

“Rather than chasing ever-larger models, Microsoft is developing tools with more carefully curated data and specialised training,” commented Victor Botev, CTO and Co-Founder of Iris.ai.

“This allows for improved performance and reasoning abilities without the massive computational costs of models with trillions of parameters. Fulfilling this promise would mean tearing down a huge adoption barrier for businesses looking for AI solutions.”

Breakthrough training technique

What enabled Microsoft’s SLM quality leap was an innovative data filtering and generation approach inspired by bedtime story books.

“Instead of training on just raw web data, why don’t you look for data which is of extremely high quality?” asked Sebastien Bubeck, Microsoft VP leading SLM research.  

Ronen Eldan’s nightly reading routine with his daughter sparked the idea to generate a ‘TinyStories’ dataset of millions of simple narratives created by prompting a large model with combinations of words a 4-year-old would know. Remarkably, a 10M parameter model trained on TinyStories could generate fluent stories with perfect grammar.
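
The prompting idea described above, asking a large model for stories built from random combinations of simple words, can be sketched roughly as follows. This is only an illustration: the vocabulary, the prompt wording, and the function names `make_prompt` and `make_prompts` are hypothetical, since the article does not give the actual word list or prompts used for TinyStories. The sketch only builds the prompts; sending them to a model is left out.

```python
import random

# Hypothetical toy vocabulary; the real TinyStories word list is not given in the article.
VOCAB = {
    "noun": ["dog", "ball", "tree", "cake", "boat"],
    "verb": ["jump", "find", "share", "hide", "sing"],
    "adjective": ["happy", "tiny", "red", "sleepy", "brave"],
}


def make_prompt(rng: random.Random) -> str:
    """Pick one word per category and ask for a short story that uses all of them."""
    words = [rng.choice(VOCAB[kind]) for kind in ("noun", "verb", "adjective")]
    return (
        "Write a short story using only words a 4-year-old would know. "
        f"The story must include the words: {', '.join(words)}."
    )


def make_prompts(n: int, seed: int = 0) -> list[str]:
    """Build n prompts from a seeded generator so runs are reproducible."""
    rng = random.Random(seed)
    return [make_prompt(rng) for _ in range(n)]


if __name__ == "__main__":
    for prompt in make_prompts(3):
        print(prompt)
```

Varying the required words is what gives the generated stories their diversity; with a realistic vocabulary the number of possible combinations grows quickly.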

Building on that early success, the team procured high-quality web data vetted for educational value to create the ‘CodeTextbook’ dataset. The dataset was synthesised through repeated rounds of prompting, generation, and filtering by both humans and large AI models.

“A lot of care goes into producing these synthetic data,” Bubeck said. “We don’t take everything that we produce.”
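
The filtering step Bubeck describes, keeping only some of what is generated, can be sketched as a simple score-and-threshold pass. Everything here is an assumption for illustration: `filter_samples` and `toy_quality_score` are hypothetical names, and the keyword heuristic is only a stand-in for the human and model-based review the article mentions, not Microsoft's actual criteria.

```python
from typing import Callable, Iterable


def filter_samples(
    candidates: Iterable[str],
    score: Callable[[str], float],
    threshold: float,
) -> list[str]:
    """Keep only candidates whose quality score meets the threshold."""
    return [text for text in candidates if score(text) >= threshold]


def toy_quality_score(text: str) -> float:
    """Placeholder heuristic: fraction of explanatory marker phrases present.

    A real pipeline would rely on human review or a model-based grader instead.
    """
    markers = ("because", "for example", "therefore")
    lowered = text.lower()
    return sum(marker in lowered for marker in markers) / len(markers)


if __name__ == "__main__":
    generated = [
        "A loop repeats work because the condition stays true; for example, counting to ten.",
        "click here to win",
        "Sorting helps search; therefore binary search needs sorted input.",
    ]
    kept = filter_samples(generated, toy_quality_score, threshold=0.3)
    print(f"kept {len(kept)} of {len(generated)} samples")
```

Passing the scorer in as a function keeps the loop unchanged if the heuristic is swapped for a stronger grader in a later round.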

The high-quality training data proved transformative. “Because it’s reading from textbook-like material…you make the task of the language model to read and understand this material much easier,” Bubeck explained.

Mitigating AI safety risks

Despite the thoughtful data curation, Microsoft emphasises that it applied additional safety practices to the Phi-3 release, mirroring its standard processes for all generative AI models.

“As with all generative AI model releases, Microsoft’s product and responsible AI teams used a multi-layered approach to manage and mitigate risks in developing Phi-3 models,” a blog post stated.  

This included further training examples to reinforce expected behaviours, red-teaming assessments to identify vulnerabilities, and Azure AI tools that customers can use to build trustworthy applications atop Phi-3.




