Meta Llama 4 Released: Open-Source AI Fights Back with Scout and Maverick. Can They Beat GPT-4?
2026-06-16 | WDSEGA
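Meta has released Llama 4, the biggest counterattack the open-source LLM camp has mounted so far.
Two versions, two positions: Scout is “lightweight and efficient,” Maverick is “top-tier capability.” Both are open source, both can be deployed locally, and both allow commercial use.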
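Llama 4 Scout: What a 10M Context Window Means
Scout’s biggest selling point is its 10 million token context window.
How big is 10 million? The complete Harry Potter series runs to roughly 1.1 million words in English, which is on the order of 1.5 million tokens. A 10-million-token window holds the whole series with room to spare. An entire codebase? For most projects, it fits comfortably.
But a large context window does not mean the model can use it well. The key question is whether, over a context that long, the model can “attend” evenly to information at any position, or whether it is only sensitive to the beginning and the end and hazy about the middle.
Meta’s figures say Scout does better than Gemini 1.5 Pro on long-document understanding tests, but there is no side-by-side comparison with Gemini 2.5 Pro. Treat that number with some skepticism.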
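Llama 4 Maverick: A Serious Take on Multimodal
Maverick is positioned as the top-capability model, aimed at GPT-4o and Gemini 2.0 Flash.
It accepts image, video, and text input and produces text output. In Meta’s own tests, Maverick’s multimodal benchmark scores exceed those of GPT-4o and Gemini 2.0 Flash.
Third-party results are more mixed: on code generation and mathematical reasoning, Maverick is roughly level with GPT-4o; on creative writing and nuanced conversation, it is clearly behind; and on consistency across multi-turn dialogue, it sometimes gives answers that miss the question.
Overall: Maverick is currently the strongest multimodal model among open-source models, but it still trails GPT-4o, and the gap is mainly in “sounding human.”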
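What Open Source Really Means Here
Llama 4 is released under the Llama 4 Community License Agreement. It allows commercial use and does not require you to share modified versions, with one exception: products with more than 700 million monthly active users need a separate license from Meta. That carve-out does not affect most developers, but it does affect companies at the scale of ByteDance or Baidu.
Why would Meta open-source its strongest model?
The answer is business logic, not charity: Meta needs an AI ecosystem, not an AI moat. Meta does not make its living selling AI API access; it lives on advertising. A more vibrant AI ecosystem, with more developers building on Llama, is a net positive for Meta’s ad business.
OpenAI and Anthropic make money from APIs, so they stay closed. Meta makes money from ads, so it opens up. This is not philosophy; it is business.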
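Impact on Developers Deploying Locally
Llama 4 Scout activates 17B parameters per token, but it is a mixture-of-experts (MoE) model with roughly 109B parameters in total, and all of them have to be loaded. Meta says an Int4-quantized Scout fits on a single H100; on consumer hardware such as an RTX 4090 (24 GB), expect to need aggressive quantization plus CPU offload or several cards. Maverick, a 400B-parameter MoE, needs multiple A100-class GPUs.
For most independent developers, Scout is within reach, while Maverick means renting cloud GPUs.
If you run local models with Ollama or LM Studio, Llama 4 Scout is currently the strongest open-source model you can run, and it is worth switching over to test.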
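One-Sentence Summary
Llama 4 is currently the strongest open-source large model. On efficiency and on some capabilities it can stand in for GPT-4o, but it still falls short on “sounding human.” For developers who need local deployment, don’t want to pay API fees, and don’t mind tuning things themselves, now is a good time to migrate from Llama 3 to Llama 4.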
This article is also published on my blog: wdsega.github.io
Meta Llama 4 Released: Open-Source AI Fights Back with Scout and Maverick
Meta dropped Llama 4. It’s the biggest counterattack the open-source AI camp has mounted to date.
Two models, two positions: Scout is “lightweight efficiency,” Maverick is “top-tier capability.” Both are open source, both support local deployment, both allow commercial use.
Llama 4 Scout: What 10M Context Actually Means
Scout’s headline feature is a 10 million token context window.
For scale: the complete Harry Potter series is roughly 1.1M words in English, on the order of 1.5M tokens. A 10M-token window fits the entire series with room to spare. An entire codebase? Fits.
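Whether a particular codebase really fits can be estimated before sending anything to a model. The sketch below assumes a rough rule of thumb of about 4 characters per token, which varies by tokenizer and by language; the function names and the file-extension filter are illustrative and not part of any Llama tooling.

```python
import os

# Assumption: ~4 characters per token is only a rule of thumb; real counts
# depend on the tokenizer and on the language of the text.
CHARS_PER_TOKEN = 4.0
# Scout's advertised context window, as quoted in the article.
CONTEXT_WINDOW = 10_000_000


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Rough token estimate for a string, based on character count."""
    return int(len(text) / chars_per_token)


def estimate_repo_tokens(root, extensions=(".py", ".md", ".txt")):
    """Walk a directory and sum the token estimates of matching text files."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total += estimate_tokens(handle.read())
    return total


def fits_in_window(tokens, window=CONTEXT_WINDOW, reserve=0):
    """True if `tokens` plus a reserve for the prompt and reply fit in `window`."""
    return tokens + reserve <= window
```

For an exact count, run the model's actual tokenizer over the files instead of relying on the character heuristic.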
The question isn’t whether the window is large. It’s whether the model can attend uniformly to information anywhere in that window, or whether it only reliably “reads” the beginning and end. Meta reports that Scout outperforms Gemini 1.5 Pro on long-document benchmarks, but notably avoids a comparison with Gemini 2.5 Pro. Take that with appropriate skepticism.
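One common way to probe this yourself is a “needle in a haystack” sweep: hide one fact at different depths of a long filler context and check whether the model can retrieve it. The following is a minimal sketch, not Meta’s evaluation method. `ask_model` is a hypothetical callable you would wire to your own model, and `edge_only_model` is a toy stand-in that only “reads” the first and last fifth of the context, so that the sweep has something to detect.

```python
def build_haystack_prompt(filler_sentences, needle, depth, question):
    """Insert `needle` at relative `depth` (0.0 = start, 1.0 = end) of the filler."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = round(depth * len(filler_sentences))
    sentences = filler_sentences[:index] + [needle] + filler_sentences[index:]
    context = " ".join(sentences)
    return f"{context}\n\nQuestion: {question}\nAnswer:"


def run_depth_sweep(ask_model, filler_sentences, needle, question, expected,
                    depths=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return {depth: True/False} for whether the expected answer was retrieved."""
    results = {}
    for depth in depths:
        prompt = build_haystack_prompt(filler_sentences, needle, depth, question)
        answer = ask_model(prompt)
        results[depth] = expected.lower() in answer.lower()
    return results


def edge_only_model(prompt):
    """Toy stand-in for a model that ignores the middle of its context."""
    context = prompt.split("\n\nQuestion:")[0]
    edge = len(context) // 5
    visible = context[:edge] + context[-edge:]
    return "7421" if "7421" in visible else "unknown"


filler = ["The sky was grey that day."] * 100
sweep = run_depth_sweep(
    edge_only_model,
    filler,
    needle="The secret code is 7421.",
    question="What is the secret code?",
    expected="7421",
)
print(sweep)
```

With the toy model, only depths 0.0 and 1.0 succeed. A model that attends uniformly would succeed at every depth, and a real test would repeat the sweep at several context lengths up to the advertised window.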
Llama 4 Maverick: Multimodal Taken Seriously
Maverick targets GPT-4o and Gemini 2.0 Flash. It accepts image, video, and text inputs; outputs text. Meta’s internal benchmarks show Maverick beating both GPT-4o and Gemini 2.0 Flash on multimodal tasks.
Third-party evaluations tell a more nuanced story: roughly on par with GPT-4o for coding and math; noticeably behind on creative writing and natural conversation; occasionally incoherent in long multi-turn exchanges.
Summary: Maverick is the strongest open-source multimodal model available. It still trails GPT-4o on “sounding human,” but the gap has narrowed significantly.
Why Meta Keeps Open-Sourcing Its Best Models
Llama 4 ships under the Llama 4 Community License Agreement: commercial use is allowed and there is no copyleft requirement, but products with over 700M monthly active users need a separate license from Meta.
The reason is business logic, not philosophy. Meta doesn’t sell AI APIs; it sells ads. A thriving AI ecosystem where developers build on Llama is good for Meta’s ad business. OpenAI and Anthropic need closed models because they sell API access. Meta needs an open ecosystem because it sells attention. Different revenue models, different strategies.
For Local Deployment Developers
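Llama 4 Scout activates 17B parameters per token, but as a mixture-of-experts (MoE) model it has about 109B parameters in total, all of which must be loaded. Meta says an Int4-quantized Scout fits on a single H100; on consumer GPUs (RTX 4090 class, 24 GB) expect aggressive quantization plus CPU offload or multiple cards. Maverick at 400B MoE requires multiple A100s.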
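A back-of-the-envelope way to check hardware claims like these is to estimate weight memory from the total parameter count and the quantization width. For MoE models the total count is what matters for memory, because every expert has to be resident even though only some are active per token. The parameter totals below (109B for Scout, 400B for Maverick) are Meta’s published figures; the 20% overhead for KV cache and activations is an assumption, and real overhead grows with context length.

```python
def weight_memory_gb(total_params_billions, bits_per_param):
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return total_params_billions * bits_per_param / 8


def fits_in_vram(total_params_billions, bits_per_param, vram_gb,
                 overhead_fraction=0.2):
    """Rough check: weights plus an assumed overhead must fit in `vram_gb`."""
    needed = weight_memory_gb(total_params_billions, bits_per_param)
    return needed * (1 + overhead_fraction) <= vram_gb


SCOUT_TOTAL_B = 109      # total parameters, in billions (17B active per token)
MAVERICK_TOTAL_B = 400   # total parameters, in billions

print(weight_memory_gb(SCOUT_TOTAL_B, 4))     # 4-bit Scout weights
print(fits_in_vram(SCOUT_TOTAL_B, 4, 24))     # one 24 GB consumer card
print(fits_in_vram(SCOUT_TOTAL_B, 4, 80))     # one 80 GB data-center card
print(fits_in_vram(MAVERICK_TOTAL_B, 4, 80))  # Maverick on one 80 GB card
```

Under these assumptions a 4-bit Scout needs about 54.5 GB for weights alone, which is why a single 24 GB card is not enough without offloading part of the model to system RAM.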
If you’re running local models with Ollama or LM Studio, Scout is now the strongest open-source model you can run. It’s a good time to upgrade from Llama 3.
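For a quick test against a local Ollama server, something like the following sketch should be enough. It uses Ollama’s `/api/generate` endpoint on the default port 11434 with streaming turned off. The model tag `llama4:scout` is an assumption; check `ollama list` for the tag actually installed on your machine.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "llama4:scout"  # assumption: confirm the exact tag with `ollama list`


def build_generate_request(prompt, model=MODEL_TAG, url=OLLAMA_URL):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model=MODEL_TAG, timeout=600):
    """Send the prompt and return the generated text (requires a running server)."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]


# Example (only works with Ollama running and the model pulled):
# print(generate("Summarize the Llama 4 license terms in one sentence."))
```

Keeping request construction separate from the network call makes it easy to swap in another model tag, such as a Llama 3 build, and compare answers to the same prompt side by side.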
Originally published at wdsega.github.io