Meta released Llama 4, a new collection of models in its Llama family, on Saturday.
There are three new models in total: Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth. Meta said all of them were trained on "large amounts of unlabeled text, image, and video data" to give them "broad visual understanding."
The success of open models from Chinese AI lab DeepSeek, which reportedly performed on par with or better than Meta's previous flagship Llama models, kicked Llama development into overdrive. Meta is said to have scrambled war rooms to decipher how DeepSeek reduced the cost of running and deploying models such as R1 and V3.
Scout and Maverick are openly available from Llama.com and Meta's partners, including the AI dev platform Hugging Face, while Behemoth is still in training. Meta says that Meta AI, its AI-powered assistant across apps including WhatsApp, Messenger, and Instagram, has been updated to use Llama 4 in 40 countries. Multimodal features are currently limited to the U.S. in English.
Some developers may take issue with the Llama 4 license.
Users and companies "domiciled" or with a "principal place of business" in the EU are prohibited from using or distributing the models, likely a result of governance requirements imposed by the region's AI and data privacy laws. (In the past, Meta has decried these laws as overly burdensome.) In addition, as with previous Llama releases, companies with more than 700 million monthly active users must request a special license from Meta, which Meta can grant or deny at its sole discretion.
"These Llama 4 models mark the beginning of a new era for the Llama ecosystem," Meta wrote in a blog post. "This is just the beginning for the Llama 4 collection."

According to Meta, Llama 4 is its first cohort of models to use a mixture-of-experts (MoE) architecture, which is more computationally efficient for training and for answering queries. MoE architectures essentially break data processing tasks into subtasks and delegate them to smaller, specialized "expert" models.
Maverick, for example, has 400 billion total parameters, but only 17 billion active parameters spread across 128 "experts." (Parameters roughly correspond to a model's problem-solving skills.) Scout has 17 billion active parameters, 16 experts, and 109 billion total parameters.
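The active-versus-total distinction comes from routing: a small gating network picks which expert(s) process each token, so only that expert's weights are used for that token. Meta has not published Llama 4's routing code; the sketch below is a generic, toy-sized illustration of top-k MoE routing (matrix sizes and the routing scheme are assumptions, not Llama 4's actual design).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes for illustration only (real models use far larger matrices).
d_model, n_experts, top_k = 8, 128, 1  # 128 experts echoes Maverick's count

# Each "expert" is a small feed-forward weight matrix; the router scores them.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a token vector to its top-k experts; only their weights are used."""
    scores = x @ router                    # one routing logit per expert
    chosen = np.argsort(scores)[-top_k:]   # indices of the selected experts
    gates = np.exp(scores[chosen])
    gates /= gates.sum()                   # softmax over the selected experts
    # Only `top_k` of the 128 expert matrices participate in this forward pass,
    # which is why "active" parameters are a small fraction of the total.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, chosen))

token = rng.standard_normal(d_model)
output = moe_layer(token)
```

With `top_k = 1`, each token touches 1/128th of the expert weights, mirroring how Maverick's 17 billion active parameters can sit inside a 400-billion-parameter model.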
Per Meta's internal testing, Maverick, which the company says is best for "general assistant and chat" use cases like creative writing, exceeds models such as OpenAI's GPT-4o and Google's Gemini 2.0 on certain coding, reasoning, multilingual, long-context, and image benchmarks. However, Maverick doesn't quite measure up to more capable recent models such as Google's Gemini 2.5 Pro, Anthropic's Claude 3.7 Sonnet, and OpenAI's GPT-4.5.
Scout's strengths lie in tasks like document summarization and reasoning over large codebases. Uniquely, it has a very large context window: 10 million tokens. ("Tokens" refer to bits of raw text; the word "fantastic," for example, might be split into "fan," "tas," and "tic.")
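Splitting words into subword pieces like this is typically done by matching against a learned vocabulary. The toy tokenizer below is purely illustrative (the hand-picked vocabulary is an assumption; real models learn theirs with algorithms like BPE), but it shows the greedy longest-match idea behind the "fan"/"tas"/"tic" example:

```python
# Hand-picked toy vocabulary, chosen only to reproduce the article's example.
# Real tokenizers learn vocabularies of tens of thousands of pieces from data.
VOCAB = {"fan", "tas", "tic", "f", "a", "n", "t", "s", "i", "c"}

def tokenize(word: str) -> list[str]:
    """Greedily match the longest vocabulary piece at each position."""
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            piece = word[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character: emit it as-is
            i += 1
    return tokens

print(tokenize("fantastic"))  # ['fan', 'tas', 'tic']
```

A 10-million-token window therefore corresponds to several million words of input text, since each word usually becomes one or a few tokens.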
Scout can run on a single Nvidia H100 GPU, according to Meta's calculations, while Maverick requires an Nvidia H100 DGX system or equivalent.
Meta's unreleased Behemoth needs even beefier hardware. According to the company, Behemoth has 288 billion active parameters, 16 experts, and nearly two trillion total parameters. Meta's internal benchmarking has Behemoth outperforming GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro (but not 2.5 Pro) on several evaluations measuring STEM skills like math problem solving.
Notably, none of the Llama 4 models is a proper "reasoning" model along the lines of OpenAI's o1 and o3-mini. Reasoning models fact-check their answers and generally respond to questions more reliably, but as a consequence they take longer than traditional "non-reasoning" models to deliver answers.

Interestingly, Meta says it tuned all of the Llama 4 models to refuse to answer "contentious" questions less often. According to the company, Llama 4 responds to "debated" political and social topics that the previous crop of Llama models wouldn't. The company also says Llama 4 is "dramatically more balanced" in terms of which prompts it flat-out won't entertain.
"[Y]ou can count on [Llama 4] to provide helpful, factual responses without judgment," a Meta spokesperson told TechCrunch. "[W]e're continuing to make Llama more responsive so that it answers more questions, can respond to a variety of different viewpoints [...] and doesn't favor some views over others."
These tweaks come as some White House allies accuse AI chatbots of being politically "woke."
Many of President Donald Trump's close allies, including billionaire Elon Musk and crypto and AI "czar" David Sacks, have alleged that popular AI chatbots censor conservative views. Sacks has historically singled out OpenAI's ChatGPT as "programmed to be woke" and untruthful about political subjects.
In reality, bias in AI is an intractable technical problem. Musk's own AI company, xAI, has struggled to create a chatbot that doesn't endorse some political views over others.
That hasn't stopped companies, including OpenAI, from adjusting their AI models to answer more questions than they would have previously.