Mistral 7B

Mistral 7B is a 7.3 billion parameter large language model, the first foundational model developed by the French AI company Mistral AI.

mistral.ai/product/
Is a: Product

Product attributes

Industry: Generative AI; Artificial Intelligence (AI)
Launch Date: September 27, 2023
Product Parent Company: Mistral AI
Competitors: Gemma (Google); LLaMA
Technologies Used: Transformer

Other attributes

Announcement URL: mistral.ai/news/anno...istral-7b/

Overview

Mistral 7B is a 7.3 billion parameter large language model (LLM), the first foundational model developed by the French AI company Mistral AI. Mistral-7B-v0.1 was released on September 27, 2023, under the Apache 2.0 license and can be used without restrictions. A technical paper followed, submitted on October 10, 2023. The company describes Mistral 7B as a "small, yet powerful model adaptable to many use-cases", including text summarisation, classification, text completion, and code completion. The model has language and coding capabilities and an 8k context length, and it can be customized.

Mistral AI has stated that Mistral 7B:

  • Outperforms Llama 2 13B on all benchmarks
  • Outperforms Llama 1 34B on many benchmarks
  • Approaches CodeLlama 7B performance on code

Mistral 7B uses grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to handle sequences of arbitrary length at a reduced inference cost. In SWA, each layer attends only to a fixed window of preceding tokens, but because transformer layers are stacked, higher layers have access to information further in the past than the attention pattern of a single layer suggests. The fixed attention span also means the model can limit its key-value cache to the size of the sliding window, using a rotating buffer. According to Mistral AI, this saves half of the cache memory for inference on a sequence length of 8192, without impacting model quality.
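
The sliding window and rotating buffer described above can be sketched in plain Python. This is an illustrative sketch, not Mistral AI's reference implementation: the names RollingKVCache, sliding_window_mask, and cache_fraction are hypothetical, and the window of 4096 tokens used in the usage note is an assumption chosen because it is consistent with the stated halving of cache memory at a sequence length of 8192.

```python
class RollingKVCache:
    """Fixed-size cache (hypothetical helper): position i lives in slot i % window."""

    def __init__(self, window: int):
        self.window = window
        self.slots = [None] * window
        self.seen = 0  # total number of tokens appended so far

    def append(self, item) -> None:
        # Once the buffer is full, the oldest entry is overwritten.
        self.slots[self.seen % self.window] = item
        self.seen += 1

    def contents(self) -> list:
        """Return the cached items in oldest-to-newest order."""
        n = min(self.seen, self.window)
        start = self.seen - n
        return [self.slots[p % self.window] for p in range(start, self.seen)]


def sliding_window_mask(seq_len: int, window: int) -> list:
    """mask[i][j] is True when query i may attend to key j.

    Attention is causal (j <= i) and limited to the last `window` positions.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def cache_fraction(seq_len: int, window: int) -> float:
    """Cache size with a rolling buffer, relative to caching every position."""
    return min(seq_len, window) / seq_len
```

With a window of 4 and ten appended tokens, contents() holds only the last four positions, so memory stays constant as the sequence grows. Under the assumed window of 4096, cache_fraction(8192, 4096) evaluates to 0.5, which matches the halving described above.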

To demonstrate how Mistral 7B can be fine-tuned, the company also released Mistral 7B Instruct, trained on instruction datasets publicly available on Hugging Face. Mistral AI states that the resulting model outperforms all 7B models on MT-Bench and is comparable to 13B chat models.

Mistral 7B can be downloaded and run anywhere, including locally, using the reference implementation and its accompanying documentation. The model can be deployed on any cloud using the vLLM inference server and SkyPilot. Mistral 7B is also available on Hugging Face.
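
When planning a local run, a rough lower bound on the memory needed to hold the weights follows from the 7.3 billion parameter count. The sketch below is back-of-the-envelope arithmetic only: the helper name weight_memory_gb and the dtype table are illustrative, and the result excludes the key-value cache, activations, and framework overhead.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16": 2.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>8}: {weight_memory_gb(7.3e9, dtype):.2f} GB")
```

At 16-bit precision this gives about 14.6 GB for the weights alone, which is a useful first check against the memory of the target GPU or machine.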

Performance

Mistral AI released a comparison of Mistral 7B to the Llama 2 family of models, re-running all model evaluations itself. The benchmarks used for evaluation include the following:

  • Commonsense reasoning: 0-shot average of HellaSwag, WinoGrande, PIQA, SIQA, OpenBookQA, ARC-Easy, ARC-Challenge, and CommonsenseQA.
  • World knowledge: 5-shot average of NaturalQuestions and TriviaQA.
  • Reading comprehension: 0-shot average of BoolQ and QuAC.
  • Math: average of 8-shot GSM8K with maj@8 and 4-shot MATH with maj@4.
  • Code: average of 0-shot HumanEval and 3-shot MBPP.
  • Popular aggregated results: 5-shot MMLU, 3-shot BBH, and 3-5-shot AGIEval (English multiple-choice questions only).
Results comparing Mistral 7B performance to three Llama models.
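
The maj@k scoring and per-category averaging named in the benchmark list can be sketched as follows. This is a minimal sketch of the general technique, not the evaluation code Mistral AI used: the function names are hypothetical, answers are compared by exact string match, and ties are broken by first occurrence, which is an assumption.

```python
from collections import Counter


def majority_answer(samples):
    """Return the most frequent answer; ties go to the one sampled first."""
    if not samples:
        return None
    counts = Counter(samples)
    best = max(counts.values())
    for answer in samples:
        if counts[answer] == best:
            return answer


def maj_at_k(problems, k):
    """Fraction of problems whose majority vote over k samples is correct.

    `problems` is a list of (sampled_answers, reference_answer) pairs.
    """
    correct = 0
    for samples, reference in problems:
        if majority_answer(samples[:k]) == reference:
            correct += 1
    return correct / len(problems)


def category_score(task_scores):
    """Unweighted mean of per-task scores, as in 'average of X and Y'."""
    return sum(task_scores.values()) / len(task_scores)
```

For example, a problem with sampled answers ["18", "18", "20", "18"] and reference "18" counts as correct under maj@4, because "18" wins the vote even though one sample is wrong.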

Further Resources

Title: Mistral 7B
Author: Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed
Link: https://arxiv.org/abs/2310.06825
Date: October 10, 2023
