Llama

Llama is Meta's open language model family, with weights freely downloadable under a bespoke community licence. The models can be run on the organisation's own infrastructure, which makes them a data-resident, internal language model. Llama is not open source in the strict sense, but open with conditions.

Open weights as the foundation of internal language models

A model behind a hosted cloud API is convenient, but it binds the data and the legal recourse to a foreign provider. Llama occupies the opposite position: a family of large language models whose weights can be downloaded and run on the organisation's own hardware. This page sets out what Llama covers, where the licence boundary runs, and what to watch for in the model sizes.

Provider and origin

Llama is developed by Meta, the US corporation behind Facebook, Instagram and WhatsApp. The first generation appeared in early 2023, followed by a fast release cadence. Unlike a pure API product, Meta publishes the model weights for download, so the model can run without an ongoing connection to Meta. That self-hosting is exactly why Llama plays a role in sovereign architectures even though the provider is a US corporation: once downloaded, the model runs on the organisation's own infrastructure, and the requests do not leave the organisation's own boundary.

The model family and the model sizes

Llama is not a single file but a family that has grown across several generations. The size of a model, measured in parameters, largely determines its hardware demand and its capabilities.

Generations. Llama 1 and Llama 2 (both 2023) were followed in 2024 by Llama 3, 3.1, 3.2 and 3.3. Llama 4 arrived in April 2025. Some older generations are now considered superseded; what matters is the current line.
Model sizes. The range runs from compact models of about one billion parameters (Llama 3.2 at 1B and 3B) that run on modest hardware, through mid sizes such as 8B and 70B, up to very large models such as Llama 3.1 with 405 billion parameters. Smaller models are frugal and fast; larger ones can do more but need far more memory and compute.
Llama 4 and mixture of experts. Llama 4 uses a mixture-of-experts architecture, where only part of the model is active per request. The Scout model combines around 17 billion active with 109 billion total parameters, Maverick around 17 billion active with 400 billion total parameters. The generation is multimodal, so it processes text and images.

The concrete model range changes quickly. Which models are currently available and which size fits the available hardware should be checked directly with Meta before any deployment.

The licence boundary and the term open source

With Llama the most important point lies not in the technology but in the licence. The models do not sit under a recognised open-source licence, but under a bespoke Llama Community License that Meta issues per generation.

Open weights, bespoke licence. The licence permits downloading, adapting and commercially running the weights independently. In that sense Llama is markedly more open than a pure API model. But it is tied to conditions and therefore not unrestrictedly free.
The main conditions. Anyone redistributing Llama must include the licence and an attribution notice. One well-known clause concerns very large providers: organisations with more than 700 million monthly active users need a separate licence from Meta. On top of that, an Acceptable Use Policy excludes certain purposes.
Not open source in the strict sense. The Open Source Initiative, which guards the official definition, does not classify the Llama licence as open source, because it contains restrictions on purpose and on the user base that the definition rules out. The precise term for Llama is therefore open weights, that is open weights with licence conditions, not open source.

This distinction is not pedantry. It decides whether a model may be used without legal concern in any context, and belongs checked before any production use.

Llama as the internal model in a sovereign stack

The value of open weights is that the model leaves the provider's platform. A Llama model can run on the organisation's own hardware or in a Swiss cloud with a local inference server, so no request goes to an external service. That makes Llama a possible building block of an architecture for sovereign AI, in which open weights and local inference produce a language model with no data outflow. Compared with the European provider Mistral, the difference lies mainly in the licence: Mistral places part of its models under the recognised Apache 2.0 licence, whereas Llama stays on its own community licence with conditions. Both share the decisive trait, however, that the weights can be self-hosted. Which model may run on which data is ultimately a question of governance, not of the model choice alone.

References

Open Source Initiative Meta's Llama license is still not Open Source. The reasoning for why the Llama licence does not meet the Open Source Definition. (18.02.2025). opensource.org/blog/metas-llama-license-is-still-not-open-source
Meta Llama 4 Community License Agreement. The current licence with the redistribution condition, the 700 million user clause and the reference to the Acceptable Use Policy. (2025). www.llama.com/llama4/license/
Meta Llama model overview and download. The official home of the model family with generations and sizes. (2026). www.llama.com/

Ask AI

These links open external AI services, the conversation and its content are sent to their providers.