Mark Zuckerberg, who must be feeling pretty gutted about having backed Trump, is drowning his sorrows in llama farming. Not the dumb llamas that spit as accurately as your mother-in-law, but digital llamas, namely Meta's famous LLMs, which could well revolutionize your personal projects with their new approach!
He has therefore just announced the release of Llama 4, their new multimodal model (it understands both text and images), which was trained on some 30 trillion tokens (roughly double Llama 3) and has a context window of 10 million tokens. Basically, it's like feeding it the equivalent of 8,000 pages of text, the complete Bible + the entire Lord of the Rings trilogy + your microwave's user manual, and it will remember all of it at any moment.
Available in 2 versions, it is the first Llama model to use the MoE architecture (Mixture of Experts). This is a mode of operation that splits a problem into sub-tasks, which allows only the necessary parts of the model to be activated for each specific task.
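To give you an idea of what that looks like, here is a tiny, purely illustrative sketch of top-1 expert routing in PyTorch (my own toy code with made-up sizes, not Meta's implementation):

import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=16, top_k=1):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)  # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        weights, picked = self.router(x).softmax(-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = picked[:, k] == e      # tokens routed to expert e
                if mask.any():                # only the chosen experts actually compute
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(8, 64)).shape)  # each token only ran 1 of the 16 experts

The whole trick is in the mask: each token only pays for the experts it was routed to, which is how a model with a huge total parameter count can cost only a fraction of that per token.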
The Scout version is the one I have just been testing. It has:
- 17 billion active parameters with 16 experts (MoE)
- 109 billion parameters in total
- It fits on a single NVIDIA H100 GPU (with Int4 quantization; see the quick back-of-the-envelope math after this list)
- Has a context window of 10 million tokens
- And it is heavily geared toward efficiency and optimized performance
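If you're wondering how 109 billion parameters can fit on a single card, here is the rough math (back-of-the-envelope only; the real footprint also depends on KV cache and runtime overhead):

# Int4 means 4 bits = 0.5 byte per parameter (ignoring runtime overhead)
params_total = 109e9
print(f"{params_total * 0.5 / 1e9:.0f} GB")  # ~55 GB, under the H100's 80 GB of HBM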
The other model, Maverick, is even beefier and is meant to compete head-on with models like GPT-4o or Gemini 2.0 Flash.
- 17 billion active parameters with 128 experts
- 400 billion parameters in total
- It also fits on a single NVIDIA H100 host (a full multi-GPU server this time, not a single card)
- And as I said, it has performance comparable to high-end models
These models are available for download from Meta and Hugging Face if you have a machine powerful enough to run them, but you can also try them via WhatsApp, Messenger, Instagram Direct Messages, or directly on Meta.ai. And it's free!
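If you go the download route, something like this should fetch the weights (a sketch only: the repo id is my assumption, so check the exact name on Hugging Face, and the repo is gated, so you have to accept Meta's license and log in first):

# Hypothetical repo id -- verify the exact name on Hugging Face first
from huggingface_hub import snapshot_download

# Gated model: run `huggingface-cli login` and accept Meta's license beforehand
path = snapshot_download("meta-llama/Llama-4-Scout-17B-16E-Instruct")
print(path)  # local folder containing the weights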
For my part, on my Mac Studio M4, I managed to run Scout at a speed of 39 tokens/second. I'm happy, it's pretty fast, and I'll be able to use this model in my dev projects! To give you an idea, I was able to generate a complete 569-word answer to a question about the history of the Super Nintendo in under 20 seconds. And that's on consumer hardware, not a supercomputer.
If you want to try this, you will have to install MLX-LM with pip and run this command:
mlx_lm.generate --model mlx-community/Llama-4-Scout-17B-16E-Instruct-4bit --prompt "What is 3+2?" --temp 1 -m 4096
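And if you prefer doing it from Python rather than the command line, the same thing looks roughly like this with MLX-LM's load/generate API (same assumed model id as above):

from mlx_lm import load, generate

# Same 4-bit community conversion as in the CLI example above
model, tokenizer = load("mlx-community/Llama-4-Scout-17B-16E-Instruct-4bit")
answer = generate(model, tokenizer, prompt="What is 3+2?", max_tokens=4096)
print(answer)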
I tried Maverick too, but in vain (that was predictable…).
What is really interesting about these new models is this Mixture of Experts (MoE) business, because it drastically reduces the computation needed for an answer. That lowers costs, gives you lower latency, and above all it will let Meta build bigger (read: smarter…) models in the future with the same resources. And that's beautiful!
With such a context window, you could have it analyze the entire source code of a complex project, process several scientific papers at once, or even summarize all your emails from the last 6 months in a single request. No more overpriced APIs and data flying off to who-knows-where. With Scout on your Mac (or a PC with enough under the hood), you keep all your sensitive data at home or on-premises while getting performance that rivals cloud solutions.
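For instance, here is a little sketch of the "analyze a whole project" idea (everything here is hypothetical: the my_project folder, the model id, and whether your RAM actually survives a prompt that big):

from pathlib import Path
from mlx_lm import load, generate

# Concatenate every Python file of a (hypothetical) project into one giant prompt
code = "\n\n".join(f"# {p}\n{p.read_text()}" for p in Path("my_project").rglob("*.py"))

model, tokenizer = load("mlx-community/Llama-4-Scout-17B-16E-Instruct-4bit")
print(generate(model, tokenizer,
               prompt=f"{code}\n\nSummarize this codebase's architecture.",
               max_tokens=1024))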
Besides, they are planning Llama 4 Behemoth soon, which will please the Satanists, and which will have 288 billion active parameters across 16 experts. That's nearly 2 trillion parameters in total! It's crazy! This model is even used to teach the other Llama 4 models through distillation techniques. On paper, it would therefore surpass GPT-4.5, Claude Sonnet 3.7 and Gemini 2.0 Pro (on STEM benchmarks).
Llama 4 is therefore exactly the kind of model that will let you write, code, or analyze images and documents 100% locally, as long as you have the appropriate hardware. On the other hand, Meta has put a restrictive license on its models, which means they are presented as open source when they really are not. Another subject of debate is that the training data for these models allegedly contained pirated works (books…). We'll also have to see whether it is politically neutral or whether Llama 4 supports Trump ^^.
In short, it's a good model in any case, because it gives you access to a powerful, intelligent LLM while going easy on hardware and energy resources. If you are a developer, I really encourage you to test it and tell me in the comments what you think.