Meta Releases First Llama 4 Models with MoE Technology

Meta debuts Llama 4 Scout and Maverick with 'mixture of experts' tech, offering better performance, safer output, and multilingual support.

Meta has released its Llama 4 models, the first from the company to use mixture-of-experts (MoE) technology.

Two models were shared in the update:

  • Llama 4 Scout: 17B active parameters, 16 experts. It runs on one NVIDIA H100 GPU.

  • Llama 4 Maverick: 17B active parameters, 128 experts.

MoE works by routing each input to a small subset of specialized neural sub-networks, called experts. A gating network decides which experts handle each token, and their outputs are combined to produce the answer; because only a few experts run at a time, far fewer parameters are active per token than the model holds in total. Models such as DeepSeek-V3 and Mixtral 8x7B already use this method. OpenAI has not confirmed MoE use but has shown interest.
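The routing idea can be illustrated with a minimal sketch. This is not Meta's implementation, just a toy top-k MoE layer in NumPy: a gate scores all experts per token, only the top two run, and their outputs are blended by the (renormalized) gate weights. All names and sizes here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class MoELayer:
    """Toy mixture-of-experts layer: the gate scores every expert,
    but only the top-k experts actually run for each token."""

    def __init__(self, dim, n_experts, top_k=2):
        self.top_k = top_k
        self.gate = rng.standard_normal((dim, n_experts)) * 0.1
        # Each "expert" is just a single linear map in this sketch;
        # in a real LLM it would be a full feed-forward block.
        self.experts = [rng.standard_normal((dim, dim)) * 0.1
                        for _ in range(n_experts)]

    def forward(self, x):
        scores = softmax(x @ self.gate)  # (tokens, n_experts)
        out = np.zeros_like(x)
        for t in range(x.shape[0]):
            top = np.argsort(scores[t])[-self.top_k:]      # indices of best experts
            w = scores[t, top] / scores[t, top].sum()       # renormalize their weights
            for weight, idx in zip(w, top):
                out[t] += weight * (x[t] @ self.experts[idx])
        return out

layer = MoELayer(dim=8, n_experts=16, top_k=2)
tokens = rng.standard_normal((4, 8))
print(layer.forward(tokens).shape)  # (4, 8)
```

With 16 experts and top_k=2, each token activates only 2/16 of the expert parameters, which is why a model like Scout can have 17B *active* parameters while its total parameter count is much larger.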

Both models are built on Llama 4 Behemoth, a larger model with 288B active parameters, 16 experts, and nearly 2 trillion total parameters. Behemoth is still in training.

Meta trained Behemoth with FP8 precision on 32,000 GPUs, reaching 390 TFLOPS per GPU. Training used more than 30 trillion tokens of text, image, and video data, twice the size of Llama 3's training set.

Meta also created a new technique called MetaP, which sets key hyperparameters such as learning rates and initialization scales. The technique enabled training across 200 languages, more than 100 of which had over 1 billion tokens each.

Meta did not share which content was used to train Llama 4. This is important, as the company has been accused of using pirated data.

The company published benchmark scores showing the new models beating rivals on many tests.

Meta also worked to reduce political bias. Many LLMs are known to lean left, which Meta attributes to the internet data used for training. The new models were tuned to counter this and now give more neutral answers.

Meta's fact-checking program ends on Monday, to be replaced by a "Community Notes" system in which notes are written by users and added over time.

The new models respond to more prompts without refusing or lecturing. Meta says their rate of biased responses now matches Grok's and improves on Llama 3.3's.

Meta wants to push this rate lower still, so that the models skip fewer topics.

Meta also tested safety with a tool called GOAT, which simulates attacks on the models to expose weak spots, letting human testers focus on harder problems.

The models are now available to download from Meta and Hugging Face. Meta frames this as support for open source, but the license restricts some rights for EU users, leading critics to argue the models are not fully open source.


📌 Source: The Register