Chinese AI lab DeepSeek has released an open version of its so-called reasoning model, DeepSeek-R1, which it claims performs as well as OpenAI’s o1 on certain AI benchmarks.
R1 is available on the AI development platform Hugging Face under an MIT license, meaning it can be used commercially without restriction. According to DeepSeek, R1 beats o1 on the benchmarks AIME, MATH-500, and SWE-bench. AIME uses other models to evaluate a model’s performance, while MATH-500 is a collection of word problems. SWE-bench, meanwhile, focuses on programming tasks.
Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer to arrive at solutions than typical non-reasoning models, usually seconds to minutes longer. The upside is that they tend to be more reliable in domains such as physics, science, and mathematics.
R1 contains 671 billion parameters, DeepSeek revealed in a technical report. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer.
While 671 billion parameters is indeed massive, DeepSeek has also released “distilled” versions of R1 ranging in size from 1.5 billion to 70 billion parameters. The smallest of these can run on a laptop. The full R1 requires beefier hardware, but it is available through DeepSeek’s API at prices 90% to 95% cheaper than OpenAI’s o1.
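For a sense of scale, the claimed discount can be turned into a rough back-of-the-envelope range. The o1 price used below is a hypothetical placeholder, not a figure quoted anywhere in this story; only the 90% to 95% discount comes from DeepSeek’s claim:

```python
# Hypothetical o1 price per million output tokens (placeholder, not a quoted figure).
O1_PRICE_PER_M_TOKENS = 60.0

# DeepSeek claims its API pricing is 90% to 95% cheaper than o1.
DISCOUNT_LOW, DISCOUNT_HIGH = 0.90, 0.95

# Implied R1 price range at that discount, per million tokens.
cheapest = round(O1_PRICE_PER_M_TOKENS * (1 - DISCOUNT_HIGH), 2)
priciest = round(O1_PRICE_PER_M_TOKENS * (1 - DISCOUNT_LOW), 2)
print(f"Implied R1 price range: ${cheapest:.2f} to ${priciest:.2f} per million tokens")
```

In other words, at a 90% to 95% discount, every dollar of o1 usage would cost roughly five to ten cents through DeepSeek’s API, whatever the underlying list price.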
Clem Delangue, the CEO of Hugging Face, said in a post on X on Monday that developers on the platform have created more than 500 “derivative” models of R1, which have racked up a combined 2.5 million downloads, five times the number of downloads of the official R1.
R1 has drawbacks, however. Because it is a Chinese model, it is subject to benchmarking by China’s internet regulator to ensure its responses “embody core socialist values.” R1 will not answer questions about Tiananmen Square, for example, or Taiwan’s autonomy.

Many Chinese AI systems, including other reasoning models, decline to respond to topics that might raise the ire of domestic regulators, such as speculation about the Xi Jinping regime.
R1 arrived days after the outgoing Biden administration proposed harsher export rules and restrictions on AI technologies for Chinese ventures. Companies in China were already barred from buying advanced AI chips, but if the new rules go into effect as written, companies would face stricter caps on both the semiconductor tech and the models needed to bootstrap sophisticated AI systems.
In a policy document last week, OpenAI urged the US government to support US AI development. In an interview with The Information, OpenAI’s VP of policy, Chris Lehane, singled out High Flyer Capital Management, DeepSeek’s corporate parent, as an organization of particular concern.
So far, at least three Chinese labs, DeepSeek, Alibaba, and Kimi (which is owned by Chinese unicorn Moonshot AI), have created models that they claim rival o1. (DeepSeek was the first; it announced a preview of R1 in late November.) Dean Ball, an AI researcher at George Mason University, said the trend suggests that Chinese AI labs will continue to be “fast followers.”
“The impressive performance of DeepSeek’s distilled models […] means that very capable reasoners will continue to proliferate widely and be runnable on local hardware,” Ball said.
This story was originally published on January 20 and updated on January 27 with additional details.
TechCrunch has an AI-focused newsletter! Sign up here to get it in your inbox every Wednesday.