LLaMA (Large Language Model Meta AI) is a foundational large language model (LLM) released by Meta AI. LLaMA is designed to help researchers advance the field of AI by providing access to smaller, more performant models that do not require large amounts of infrastructure and computing power.
Meta first released LLaMA in February 2023. Unlike language models from OpenAI/Microsoft and Google, which are conversational chatbots, LLaMA is not a system users can talk to; it is a tool to help researchers working in the field. Meta released LLaMA under a noncommercial license focused on research use cases, with access granted to groups like universities, NGOs, and industry labs. Like other large language models, LLaMA takes a sequence of words as input and predicts the next word, recursively generating text. LLaMA was trained using text from twenty languages, focusing on those with Latin and Cyrillic alphabets. In the announcement of LLaMA, Meta stated:
Models such as LLaMA enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratizing access in this important, fast-changing field.
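The next-word prediction loop mentioned above is the core of any autoregressive language model: predict a word, append it, and feed the longer sequence back in. A minimal sketch of that loop, with a toy bigram frequency table standing in for LLaMA's transformer (the counts-based predictor is purely illustrative, not Meta's method):

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count word-to-next-word transitions in a toy corpus."""
    model = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def generate(model, prompt, max_new_words=5):
    """Recursively predict the most likely next word, append it, and
    feed the extended sequence back in -- the same loop a large
    language model runs over tokens instead of whole words."""
    words = prompt.split()
    for _ in range(max_new_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # no known continuation for the last word
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

corpus = "the model predicts the next word in the sequence"
model = train_bigram(corpus)
print(generate(model, "the model", max_new_words=3))
```

Real models condition on the entire preceding sequence rather than just the last word, and sample from a probability distribution over a subword vocabulary, but the recursive generate-and-append structure is the same.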
As the size of the model impacts the computing power and resources required to test new approaches, Meta is making LLaMA available in four sizes:
- 7 billion parameters
- 13 billion parameters
- 33 billion parameters
- 65 billion parameters
The release came alongside a paper with more details on the model titled "LLaMA: Open and Efficient Foundation Language Models." In the paper, Meta claims the 13-billion-parameter model (LLaMA-13B) performs better than OpenAI’s popular GPT-3 model on most benchmarks, while the largest model, LLaMA-65B, is “competitive with the best models,” such as DeepMind’s Chinchilla-70B and Google’s PaLM-540B.
A week after the first announcement of LLaMA, on March 3, 2023, the model was leaked: a downloadable torrent of the system was posted on 4chan before spreading to other online AI communities. On March 6, 2023, Meta announced it would continue to release its AI tools to approved researchers despite the leak to unauthorized users. In a statement, the company said:
While the model is not accessible to all, and some have tried to circumvent the approval process, we believe the current release strategy allows us to balance responsibility and openness.
On July 18, 2023, Meta and Microsoft introduced the Llama 2 family of models, a group of open-source LLMs free for both research and commercial use. Llama 2 represents an expansion of the partnership between Microsoft and Meta, with availability through the Azure AI model catalog and optimization for running locally on Windows. Llama 2 is also available via AWS, Hugging Face, and other providers. Meta opened access to Llama 2 with the support of a broad set of companies and researchers across tech, academia, and policy who believe in open innovation of AI technologies.
The release includes model weights and starting code for pre-trained and fine-tuned Llama models, ranging from 7B to 70B parameters. Llama 2 was trained on 40 percent more data than the original LLaMA. Llama 2 was pre-trained on publicly available online data, totaling over 2 trillion tokens. The fine-tuned version of the model (for dialogue use), Llama-2-chat, uses reinforcement learning from human feedback (RLHF), leveraging publicly available instruction datasets and over 1 million human annotations.
As part of the Llama 2 release, each model comes with the following:
- Model code
- Model weights
- User guide
- Responsible use guide
- License
- Acceptable use policy
- Model card
Llama 2 models have a context window of 4,096 tokens. While Llama 2 does not reach the same performance as GPT-4, Meta research shows it performs well against other open-source models. Use cases for Llama 2 focus on commercial and research applications in English. Tuned models are intended for assistant-like chatbots, while pre-trained models can be adapted for a wide range of natural language tasks.
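The 4,096-token context window means any input longer than that must be trimmed before the model can attend to it. A minimal sketch of keeping only the most recent tokens; the whitespace split below is a stand-in assumption for Llama 2's actual subword tokenizer, which produces more tokens than there are words:

```python
def truncate_to_context(text, max_tokens=4096):
    """Keep only the most recent tokens that fit the context window.
    A whitespace split stands in for a real subword tokenizer, so
    this under-counts tokens relative to Llama 2's own tokenizer."""
    tokens = text.split()
    if len(tokens) <= max_tokens:
        return text
    # Drop the oldest tokens, keeping the most recent max_tokens.
    return " ".join(tokens[-max_tokens:])

# A 5,000-"token" prompt exceeds the window and gets trimmed to 4,096.
long_prompt = " ".join(f"word{i}" for i in range(5000))
trimmed = truncate_to_context(long_prompt, max_tokens=4096)
print(len(trimmed.split()))  # 4096
```

Keeping the most recent tokens is the common choice for chat use, where the latest turns matter most; other strategies (summarizing older turns, keeping a system prompt pinned at the front) trade off differently.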