Meta Launches Llama 4: The Ultimate Multi-modal LLM
Tech giants rarely announce products on a weekend, yet Meta surprised everyone by introducing the Llama 4 model line last weekend. The series includes three versions: Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth.
Llama 4 Scout: "Monster" Performance on a Single GPU
Llama 4 Scout is the smallest model, a mixture-of-experts design with 17 billion active parameters and 16 experts. Meta claims Scout is the best multimodal model in its class, outperforming Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 on standard AI benchmarks. Notably, despite this performance, the model can run on a single NVIDIA H100 GPU. Scout also supports a context window of up to 10 million tokens, which Meta describes as an industry record, although its real-world performance at that length remains to be seen.
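The single-GPU claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is only an estimate and relies on figures that do not appear in this article: a total size of roughly 109 billion parameters for Scout (all experts must sit in memory even though only 17 billion parameters are active per token), Int4 quantization as described in Meta's announcement, and the 80 GB variant of the H100. The function and constant names are illustrative.

```python
# Rough weight-memory estimate for Llama 4 Scout.
# It counts weights only and ignores the KV cache, activations and runtime overhead.

GB = 10**9  # decimal gigabytes

SCOUT_TOTAL_PARAMS = 109e9  # assumption: total parameter count, not stated in the article
H100_MEMORY_GB = 80         # assumption: 80 GB H100 variant


def weight_memory_gb(total_params: float, bits_per_param: int) -> float:
    """Memory needed to hold the weights alone, in decimal GB."""
    return total_params * bits_per_param / 8 / GB


def fits_on_one_gpu(bits_per_param: int) -> bool:
    """True if the weights alone fit within a single GPU's memory."""
    return weight_memory_gb(SCOUT_TOTAL_PARAMS, bits_per_param) < H100_MEMORY_GB


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_memory_gb(SCOUT_TOTAL_PARAMS, bits)
        print(f"{bits:>2}-bit weights: {size:6.1f} GB, fits on one H100: {fits_on_one_gpu(bits)}")
```

Under these assumptions, only the 4-bit case leaves the weights below 80 GB, which suggests the single-GPU figure depends on quantization rather than on full-precision weights.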
Llama 4 Maverick: A Formidable Rival to GPT-4o
Llama 4 Maverick is the mainstream version. It also has 17 billion active parameters, but the number of experts rises to 128. Meta claims Maverick outperforms GPT-4o and Gemini 2.0 Flash on industry benchmarks. An experimental chat version of Maverick scored 1,417 points on LMArena, ranking second among the top LLMs at the time of the announcement.
Llama 4 Behemoth: The Largest Model, Still in Training
Llama 4 Behemoth, the largest model in the series, is still in training. With 288 billion active parameters and 16 experts, Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several standard AI benchmarks, according to Meta.
Availability
Llama 4 Scout and Llama 4 Maverick are now available for download on llama.com and Hugging Face. For general users, the models are being integrated into Meta AI on WhatsApp, Messenger, Instagram Direct, and the web.
Microsoft quickly followed, announcing support for Llama 4 Scout and Maverick on Azure AI Foundry as a managed compute offering. Developers can find them as:
With the Llama 4 series, Meta reaffirms its ambition to lead the AI race, especially in the multimodal segment, a field where it faces fierce competition from Google, OpenAI, and Anthropic.