Meta Releases Llama 4 — Scout (109B total) and Maverick (400B total) Now Available | Model Release
Meta's Llama 4 family (released Apr 5, 2025) includes Scout (17B active parameters, 16 experts, 109B total) and Maverick (17B active, 128 experts, 400B total).…
Published on MyPrivateClaw
Mar 31, 2026, 6:50 AM UTC
Coverage date
Apr 5, 2025
Last updated
Apr 4, 2026, 5:45 AM UTC
News summary
Meta released the first models in the Llama 4 family on April 5, 2025, introducing two open-weight multimodal models built on a Mixture-of-Experts (MoE) architecture. Llama 4 Scout has 17B active parameters and 16 experts (109B total parameters); it fits on a single NVIDIA H100 GPU with INT4 quantization and offers a 10-million-token context window, which Meta describes as industry-leading. Llama 4 Maverick has the same 17B active parameters but scales to 128 experts (400B total); it fits on a single H100 host and, according to Meta, outperforms GPT-4o and Gemini 2.0 Flash on a broad range of benchmarks. Both models are natively multimodal, accepting image and text inputs. Both were distilled from Llama 4 Behemoth, a teacher model with 288B active parameters that was still in training at launch. According to Meta, Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several STEM benchmarks. The Maverick…
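The hardware-fit claims in the summary can be sanity-checked with back-of-the-envelope arithmetic. The sketch below estimates weight memory only (no KV cache, activations, or runtime overhead) from the parameter counts quoted above. The 80 GB H100 capacity, the 8-GPU host size, and the 8-bit precision used for Maverick are assumptions made for illustration and are not stated in the summary; the helper name weight_memory_gb is hypothetical.

```python
def weight_memory_gb(total_params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in decimal GB."""
    total_bytes = total_params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Assumptions (not from the article): 80 GB per H100, 8 GPUs per host.
H100_MEMORY_GB = 80
GPUS_PER_HOST = 8
HOST_MEMORY_GB = H100_MEMORY_GB * GPUS_PER_HOST

# Scout: 109B total parameters at INT4 (4 bits per parameter).
scout_int4_gb = weight_memory_gb(109, 4)

# Maverick: 400B total parameters, assuming 8 bits per parameter.
maverick_8bit_gb = weight_memory_gb(400, 8)

# Share of parameters active per token in each MoE model.
scout_active_fraction = 17 / 109
maverick_active_fraction = 17 / 400

print(f"Scout INT4 weights:     {scout_int4_gb:.1f} GB "
      f"(fits one GPU: {scout_int4_gb < H100_MEMORY_GB})")
print(f"Maverick 8-bit weights: {maverick_8bit_gb:.1f} GB "
      f"(fits one host: {maverick_8bit_gb < HOST_MEMORY_GB})")
print(f"Active share per token: Scout {scout_active_fraction:.1%}, "
      f"Maverick {maverick_active_fraction:.1%}")
```

Under these assumptions, Scout's INT4 weights come to roughly 54.5 GB, leaving headroom on one 80 GB GPU, while Maverick's weights need multiple GPUs but stay within a single 8-GPU host. The active-share figures show why both models run at roughly the per-token cost of a 17B dense model despite their much larger total size.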