Anthropic used Pokémon to benchmark its latest AI model

By support, February 24, 2025

Anthropic used Pokémon to benchmark its latest AI model. Yes, really.

In a blog post published Monday, Anthropic said it had tested its latest model, Claude 3.7 Sonnet, on the Game Boy classic Pokémon Red. The company equipped the model with basic memory, screen pixel input, and function calls to press buttons and navigate around the screen, allowing it to play Pokémon continuously.
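The setup described above (screen input, a small memory, and button presses issued as function calls) can be pictured as a simple loop. The sketch below is only an illustration under stated assumptions: `ToyEmulator`, `scripted_policy`, and `run_agent` are hypothetical names, the "game" is a one-dimensional corridor rather than Pokémon Red, and the policy is a hard-coded stand-in for a model call. It is not Anthropic's harness.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Button names are an assumption for illustration, not a real emulator API.
BUTTONS = ("up", "down", "left", "right", "a", "b", "start", "select")


@dataclass
class ToyEmulator:
    """Hypothetical stand-in for a game emulator: a player in a 1-D corridor."""
    position: int = 0
    goal: int = 3

    def screen(self) -> List[int]:
        # Fake "pixels": 1 marks the player's cell, 0 marks every other cell.
        return [1 if i == self.position else 0 for i in range(self.goal + 1)]

    def press(self, button: str) -> None:
        if button not in BUTTONS:
            raise ValueError(f"unknown button: {button}")
        if button == "right":
            self.position = min(self.position + 1, self.goal)
        elif button == "left":
            self.position = max(self.position - 1, 0)
        # Other buttons are accepted but do nothing in this toy world.

    def done(self) -> bool:
        return self.position == self.goal


def scripted_policy(pixels: List[int], memory: List[str]) -> str:
    """Placeholder for the model call: ignores its inputs and walks right."""
    return "right"


def run_agent(
    emulator: ToyEmulator,
    policy: Callable[[List[int], List[str]], str],
    max_actions: int = 100,
    memory_size: int = 5,
) -> Tuple[int, List[str]]:
    """Observe the screen, pick a button, press it, remember it, repeat."""
    memory: List[str] = []
    actions = 0
    while not emulator.done() and actions < max_actions:
        pixels = emulator.screen()        # screen pixel input
        button = policy(pixels, memory)   # the "function call" chosen by the policy
        emulator.press(button)            # apply the button press
        memory.append(button)
        memory = memory[-memory_size:]    # basic memory: keep only recent actions
        actions += 1
    return actions, memory


actions, memory = run_agent(ToyEmulator(), scripted_policy)
print(actions, memory)
```

Counting `actions` in the loop mirrors the only cost figure reported in the article, which is a number of actions and not compute or wall-clock time.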

A unique feature of Claude 3.7 Sonnet is its ability to engage in “extended thinking.” Like OpenAI’s o3-mini and DeepSeek’s R1, Claude 3.7 Sonnet can “reason” through challenging problems by applying more computing and taking more time.

That came in handy in Pokémon Red, apparently.

Compared to a previous version of Claude, Claude 3.0 Sonnet, which failed to leave the house in Pallet Town where the story begins, Claude 3.7 Sonnet successfully battled three Pokémon gym leaders and won their badges.

Anthropic Pokémon Red
Image credits: Anthropic

Now, it is not clear how much computing Claude 3.7 Sonnet required to reach those milestones, or how long each took. Anthropic only said that the model performed 35,000 actions to reach the last gym leader, Lt. Surge.

Surely it won’t be long before some enterprising developer finds out.

Pokémon Red is more of a toy benchmark than anything else. However, there is a long history of games being used for AI benchmarking purposes. In the past few months alone, a number of new apps and platforms have emerged to test models’ game-playing abilities on titles ranging from Street Fighter to Pictionary.


