Mixtral 8x7B is a sparse mixture-of-experts (SMoE) model developed by Mistral AI. Mixtral has 46.7 billion total parameters but only uses 12.9 billion parameters per token. This approach increases the model's parameter count while controlling cost and latency: the model processes inputs and generates outputs at the same speed and cost as a 12.9 billion parameter model. Mixtral has a 32k-token context window and handles multiple languages (English, French, Italian, German, and Spanish). The model shows strong performance in code generation as well as language tasks. Mistral AI states that Mixtral outperforms Llama 2 70B on most benchmarks with 6x faster inference, making it the "strongest open-weight model with a permissive license," and that it matches or outperforms GPT-3.5 on most standard benchmarks.
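The gap between total and active parameters follows from the architecture: only two of the eight feed-forward experts in each layer run for a given token, while the attention layers and embeddings are always used. A back-of-the-envelope sketch, assuming the commonly reported Mixtral hyperparameters (hidden size 4096, 32 layers, feed-forward size 14336, grouped-query attention with a 1024-dimensional key/value projection, 32000-token vocabulary) and ignoring small terms such as norms and router weights, roughly reproduces the quoted figures:

```python
# Rough parameter count for a Mixtral-style sparse MoE decoder.
# The hyperparameters below are assumptions based on commonly reported
# Mixtral 8x7B values; norms and router weights are ignored.
hidden = 4096        # model (embedding) dimension
layers = 32          # number of decoder layers
ffn = 14336          # feed-forward (expert) hidden dimension
n_experts = 8        # experts per layer
top_k = 2            # experts activated per token
kv_dim = 1024        # key/value projection width under grouped-query attention
vocab = 32000

expert_params = 3 * hidden * ffn                       # gate, up, down matrices
attn_params = 2 * hidden * hidden + 2 * hidden * kv_dim  # Q, O, K, V projections
embed_params = 2 * vocab * hidden                      # input embedding + output head

total = layers * (n_experts * expert_params + attn_params) + embed_params
active = layers * (top_k * expert_params + attn_params) + embed_params

print(f"total  ~= {total / 1e9:.1f}B parameters")   # ~46.7B
print(f"active ~= {active / 1e9:.1f}B per token")   # ~12.9B
```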
Results published by Mistral AI show that Mixtral matches or outperforms Llama 2 70B and GPT-3.5 on most benchmarks.
Mixtral is a sparse mixture-of-experts network. It is a decoder-only model that picks from a set of eight distinct groups of parameters, giving it the designation "8x7B." At each layer, for every token, a router network chooses two of these groups (the "experts") and combines their outputs additively. The model is pre-trained on data extracted from the open Web, with experts and routers trained simultaneously. CoreWeave and Scaleway provided technical support during the training of Mixtral.
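The routing step can be illustrated with a short sketch. The code below is a simplified, hypothetical top-2 mixture-of-experts layer, not Mistral AI's implementation: a linear router scores each token, keeps the two highest-scoring experts, and sums their outputs weighted by a softmax over the two selected scores. The layer sizes in the usage example are deliberately small.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    """Simplified sparse mixture-of-experts layer with top-2 routing."""

    def __init__(self, hidden=4096, ffn=14336, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # One feed-forward "expert" per distinct group of parameters
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden, ffn), nn.SiLU(), nn.Linear(ffn, hidden))
            for _ in range(n_experts)
        ])
        # Router: one score per expert for each token
        self.router = nn.Linear(hidden, n_experts, bias=False)

    def forward(self, x):                        # x: (tokens, hidden)
        scores = self.router(x)                  # (tokens, n_experts)
        top_scores, top_idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(top_scores, dim=-1)  # renormalise over the chosen two
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e     # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

# Example: route 4 tokens through a small instance of the layer
layer = Top2MoELayer(hidden=64, ffn=128, n_experts=8)
tokens = torch.randn(4, 64)
print(layer(tokens).shape)  # torch.Size([4, 64])
```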
Mixtral can be fine-tuned into an instruction-following model, and Mistral AI released Mixtral 8x7B Instruct alongside the original model. Mixtral 8x7B Instruct has gone through supervised fine-tuning and direct preference optimisation (DPO) for instruction following. It reaches a score of 8.30 on MT-Bench. Mistral AI states this score makes it "the best open-source model, with a performance comparable to GPT-3.5."
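Direct preference optimisation trains on preference pairs directly: for a prompt with a preferred and a rejected response, the loss pushes the policy to assign relatively more probability to the preferred response than a frozen reference model does. The sketch below shows the standard DPO objective, not Mistral AI's training code; the log-probability inputs and the `beta` temperature are placeholders.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO objective on per-sequence log-probabilities.

    Each argument is a tensor of shape (batch,) holding log p(response | prompt)
    under either the policy being trained or the frozen reference model.
    """
    # Log-ratio of policy vs. reference for preferred and rejected responses
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # Maximise the margin between the two ratios, scaled by beta
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy usage with made-up log-probabilities
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
print(loss)
```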
The model was released on December 11, 2023, with open weights, and Mixtral 8x7B is licensed under Apache 2.0. Users can access Mixtral 8x7B through Mistral AI's "mistral-small" endpoint, available in beta, or download the weights from the Hugging Face repository. Mixtral can be deployed with a fully open-source stack: Mistral AI has submitted changes to the vLLM project that integrate Megablocks CUDA kernels for efficient inference.
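As an illustration of the open-source route, the snippet below serves the instruction-tuned weights from Hugging Face with vLLM. It is a sketch under stated assumptions: the repository id `mistralai/Mixtral-8x7B-Instruct-v0.1`, the `[INST]` prompt format, and the sampling settings are taken as given, and running the full model requires substantial GPU memory.

```python
# Sketch: serving Mixtral 8x7B Instruct locally with vLLM.
# Assumes the Hugging Face repo id below and enough GPU memory for the weights.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mixtral-8x7B-Instruct-v0.1", tensor_parallel_size=2)
params = SamplingParams(temperature=0.7, max_tokens=256)

# The instruct model expects prompts wrapped in [INST] ... [/INST] tags
prompt = "[INST] Explain what a sparse mixture-of-experts model is. [/INST]"
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```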