
Putting DeepSeek to the test: How its performance compares against other AI tools

By Simon Osuji
February 6, 2025
in Artificial Intelligence


Image: DeepSeek. Credit: Unsplash/CC0 Public Domain

China’s new DeepSeek large language model (LLM) has disrupted the US-dominated market, offering a relatively high-performance chatbot model at significantly lower cost.


The reduced cost of development and lower subscription prices compared with US AI tools contributed to American chip maker Nvidia losing US$600 billion (£480 billion) in market value in a single day. Nvidia makes the computer chips used to train the majority of LLMs, the underlying technology used in ChatGPT and other AI chatbots. DeepSeek uses cheaper Nvidia H800 chips rather than the more expensive state-of-the-art versions.

ChatGPT developer OpenAI reportedly spent somewhere between US$100 million and US$1 billion developing o1, a very recent version of its product. In contrast, DeepSeek accomplished its training in just two months at a cost of US$5.6 million using a series of clever innovations.
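To put those figures side by side, the short Python sketch below works out how many times larger the reported o1 spend is than DeepSeek's stated training cost. It uses only the dollar amounts quoted above; the variable and function names are illustrative, and the result is a rough ratio of reported figures, not an audited cost comparison.

```python
# Figures as quoted in the article, in US dollars.
O1_COST_LOW = 100_000_000       # lower bound of the reported o1 development cost
O1_COST_HIGH = 1_000_000_000    # upper bound of the reported o1 development cost
DEEPSEEK_COST = 5_600_000       # DeepSeek's stated training cost


def cost_ratio(other_cost: float, baseline_cost: float) -> float:
    """Return how many times larger other_cost is than baseline_cost."""
    return other_cost / baseline_cost


low_ratio = cost_ratio(O1_COST_LOW, DEEPSEEK_COST)
high_ratio = cost_ratio(O1_COST_HIGH, DEEPSEEK_COST)

print(f"Reported o1 cost is roughly {low_ratio:.0f}x to {high_ratio:.0f}x DeepSeek's")
```

On these numbers, the reported o1 budget is somewhere between about 18 and 179 times DeepSeek's stated training cost.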

But just how well does DeepSeek’s AI chatbot, R1, compare with other, similar AI tools on performance?

DeepSeek claims its models perform comparably to OpenAI’s offerings, even exceeding the o1 model in certain benchmark tests. However, benchmarks such as Massive Multitask Language Understanding (MMLU) evaluate knowledge across multiple subjects using multiple-choice questions. Many LLMs are trained and optimized for such tests, which makes the results unreliable as indicators of real-world performance.

An alternative methodology for the objective evaluation of LLMs uses a set of tests developed by researchers at Cardiff Metropolitan, Bristol and Cardiff universities—known collectively as the Knowledge Observation Group (KOG). These tests probe LLMs’ ability to mimic human language and knowledge through questions that require implicit human understanding to answer. The core tests are kept secret, to avoid LLM companies training their models for these tests.

KOG deployed public tests inspired by work by Colin Fraser, a data scientist at Meta, to evaluate DeepSeek against other LLMs. The following results were observed:

[Table: LLM performance test. Credit: The Conversation]

The tests used to produce this table are “adversarial” in nature. In other words, they are designed to be “hard” and to test LLMs in ways that are not sympathetic to how they are designed. This means the performance of these models in this test is likely to differ from their performance in mainstream benchmarking tests.

DeepSeek scored 5.5 out of 6, outperforming OpenAI’s o1, its advanced reasoning (or “chain-of-thought”) model, as well as ChatGPT-4o, the free version of ChatGPT. But DeepSeek was marginally outperformed by Anthropic’s ClaudeAI and OpenAI’s o1 mini, both of which scored a perfect 6/6. It’s interesting that o1 underperformed against its “smaller” counterpart, o1 mini.

DeepThink R1—a chain-of-thought AI tool made by DeepSeek—underperformed in comparison to DeepSeek with a score of 3.5.
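As a rough illustration, the Python sketch below converts the scores stated in the text into percentages of the six-point maximum and ranks them. It includes only the four scores the article gives explicitly. The scores for o1 and ChatGPT-4o are not quoted in the text, so they are left out instead of guessed. The names used are illustrative.

```python
MAX_SCORE = 6  # the test is scored out of 6

# Only the scores stated explicitly in the article.
scores = {
    "ClaudeAI": 6.0,
    "o1 mini": 6.0,
    "DeepSeek": 5.5,
    "DeepThink R1": 3.5,
}


def as_percent(score: float, max_score: float = MAX_SCORE) -> float:
    """Express a raw score as a percentage of the maximum, to one decimal place."""
    return round(100 * score / max_score, 1)


# Sort from highest to lowest score; ties keep their original order.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for model, score in ranked:
    print(f"{model}: {score}/{MAX_SCORE} ({as_percent(score)}%)")
```

With so few questions, each half point is worth about eight percentage points, so small differences between models should be read with caution.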

This result shows how competitive DeepSeek’s chatbot already is, beating OpenAI’s flagship models. It is likely to spur further development for DeepSeek, which now has a strong foundation to build upon. However, the Chinese tech company does have one serious problem the other LLMs do not: censorship.

Censorship challenges

Despite its strong performance and popularity, DeepSeek has faced criticism over its responses to politically sensitive topics in China. For instance, prompts related to Tiananmen Square, Taiwan, Uyghur Muslims and democratic movements are met with the response: “Sorry, that is beyond my current scope.”

But this issue is not necessarily unique to DeepSeek, and the potential for political influence and censorship in LLMs more generally is a growing concern. The announcement of Donald Trump’s US$500 billion Stargate LLM project, involving OpenAI, Nvidia, Oracle, Microsoft, and Arm, also raises fears of political influence.

Additionally, Meta’s recent decision to abandon fact-checking on Facebook and Instagram suggests an increasing trend toward populism over truthfulness.

DeepSeek’s arrival has caused serious disruption to the LLM market. US companies such as OpenAI and Anthropic will be forced to innovate their products to maintain relevance and match its performance and cost.

DeepSeek’s success is already challenging the status quo, demonstrating that high-performance LLM models can be developed without billion-dollar budgets. It also highlights the risks of LLM censorship, the spread of misinformation, and why independent evaluations matter.

As LLMs become more deeply embedded in global politics and business, transparency and accountability will be essential to ensure that the future of LLMs is safe, useful and trustworthy.

Provided by
The Conversation

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Citation:
Putting DeepSeek to the test: How its performance compares against other AI tools (2025, February 5)
retrieved 5 February 2025
from https://techxplore.com/news/2025-02-deepseek-ai-tools.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.




