Move over, DeepSeek. There is a new AI champion in town, and they are American.
On Thursday, AI2, a nonprofit AI research institute based in Seattle, released a model that it claims outperforms DeepSeek V3, one of the leading systems from Chinese AI company DeepSeek.
According to AI2’s internal testing, the model, called Tulu3-405B, also beats OpenAI’s GPT-4o on certain AI benchmarks. Moreover, unlike GPT-4o (and even DeepSeek V3), Tulu3-405B is open source, meaning all of the components needed to replicate it from scratch are freely available and permissively licensed.
An AI2 spokesperson told TechCrunch that the lab believes Tulu3-405B underscores the potential of the United States to lead the global development of best-in-class generative AI models.
“This milestone is a key moment for the future of open AI, reinforcing the U.S. position as a leader in competitive, open source models,” the spokesperson said. “With this launch, AI2 is introducing a powerful, U.S.-developed alternative to DeepSeek’s models, marking a pivotal moment not just in AI development, but in showing that the United States can lead with competitive, open source AI independent of the tech giants.”
Tulu3-405B is a fairly large model. According to AI2, it contains 405 billion parameters and required 256 GPUs running in parallel to train. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.

According to AI2, one of the keys to achieving competitive performance with Tulu3-405B was a technique called reinforcement learning with verifiable rewards. Reinforcement learning with verifiable rewards, or RLVR, trains models on tasks with “verifiable” outcomes, such as math problem solving and following instructions.
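To make the idea of a “verifiable” outcome concrete, here is a minimal Python sketch of the kind of reward check such training could rely on. It is an illustration only, not AI2’s code: the function names (`extract_final_number`, `math_reward`, `max_words_reward`) and the all-or-nothing scoring of 1.0 or 0.0 are assumptions made for this example.

```python
import re
from typing import Optional


def extract_final_number(completion: str) -> Optional[str]:
    """Return the last number that appears in the model's output, if any.

    Hypothetical helper: assumes the final number in the text is the answer.
    """
    # Drop thousands separators so "1,200" is read as "1200".
    cleaned = completion.replace(",", "")
    matches = re.findall(r"-?\d+(?:\.\d+)?", cleaned)
    return matches[-1] if matches else None


def math_reward(completion: str, reference: str) -> float:
    """Reward 1.0 if the extracted answer equals the reference, else 0.0."""
    answer = extract_final_number(completion)
    if answer is None:
        return 0.0
    return 1.0 if float(answer) == float(reference) else 0.0


def max_words_reward(completion: str, max_words: int) -> float:
    """Reward 1.0 if an instruction like 'answer in N words or fewer' is met."""
    return 1.0 if len(completion.split()) <= max_words else 0.0


if __name__ == "__main__":
    print(math_reward("She has 3 + 4 = 7 apples. The answer is 7.", "7"))  # 1.0
    print(math_reward("The answer is 8.", "7"))                            # 0.0
    print(max_words_reward("short reply", 5))                              # 1.0
```

The point of the sketch is that the reward comes from a check a program can run, rather than from a learned model that judges the output. In a reinforcement learning loop, scores like these would be used to update the model.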
AI2 claims that on the benchmark PopQA, a set of 14,000 specialized knowledge questions sourced from Wikipedia, Tulu3-405B beat not only DeepSeek V3 and GPT-4o but also Meta’s Llama 3.1 405B model. Tulu3-405B also had the highest performance of any model in its class on GSM8K, a test containing grade school-level math problems.
Tulu3-405B is available to test via AI2’s chatbot web app, and the code to train the model is on GitHub and the AI dev platform Hugging Face. Get it while it’s hot, and before the next benchmark-beating flagship AI model comes along.