Mistral AI recently announced the release of Mistral Large 2, the latest iteration of its flagship model, which promises significant advancements over its predecessor. The new model excels in code generation, mathematics, and reasoning, and it offers enhanced multilingual support and advanced function-calling capabilities. Mistral Large 2 is designed to be cost-efficient, fast, and high-performing, and it is available on “la Plateforme” with new features that facilitate the development of innovative AI applications. Users can try Mistral Large 2 today on “la Plateforme” under the name mistral-large-2407 and test it on le Chat.
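As a rough illustration of what access on “la Plateforme” looks like, the sketch below sends a single chat request to mistral-large-2407 with Mistral’s Python SDK. The exact client interface depends on the installed mistralai package version, so treat it as an outline rather than an official quick-start.

```python
# Minimal sketch: querying mistral-large-2407 on "la Plateforme".
# Assumes the `mistralai` Python SDK (v1-style interface) and an API key
# exported as MISTRAL_API_KEY; adjust to the client version you have installed.
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="mistral-large-2407",  # API name given in the announcement
    messages=[
        {"role": "user", "content": "Summarize the key features of Mistral Large 2."}
    ],
)
print(response.choices[0].message.content)
```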
Mistral Large 2 has a 128k context window and supports dozens of languages, including French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, along with more than 80 programming languages such as Python, Java, C, C++, JavaScript, and Bash. The model is designed for single-node inference with long-context applications in mind; its 123 billion parameters allow it to run at high throughput on a single node. It is released under the Mistral Research License for research and non-commercial use, while commercial use requires a Mistral Commercial License.
The model sets a new standard in performance and cost efficiency, achieving 84.0% accuracy on the MMLU benchmark and establishing a new frontier for open models. Experience gained from training previous models such as Codestral 22B and Codestral Mamba contributed to Mistral Large 2’s superior performance in code generation and reasoning. It outperforms its predecessor and is competitive with leading models such as GPT-4o, Claude 3 Opus, and Llama 3 405B.
During Mistral Large 2’s training, a significant focus was placed on enhancing its reasoning capabilities and minimizing the generation of factually incorrect or irrelevant information. The model was fine-tuned to provide accurate and reliable outputs, an effort reflected in its improved performance on popular mathematical benchmarks. Mistral Large 2 has also been trained to acknowledge when it cannot find a solution or lacks sufficient information to provide a confident answer, which helps keep it reliable and trustworthy.
The new model also showcases remarkable improvements in instruction-following and conversational capabilities. It performs exceptionally well on benchmarks like MT-Bench, Wild Bench, and Arena Hard, demonstrating its proficiency in handling precise instructions and long multi-turn conversations. Despite the tendency of some benchmarks to favor lengthy responses, Mistral Large 2 is designed to generate concise and cost-effective outputs whenever possible, which is crucial for many business applications.
One of Mistral Large 2’s standout features is its multilingual prowess. While many models are predominantly English-centric, Mistral Large 2 was trained on a significant proportion of multilingual data. It excels in English, French, German, Spanish, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean, Arabic, and Hindi, making it suitable for various business use cases involving multilingual documents.
In addition to its language capabilities, Mistral Large 2 is equipped with enhanced function-calling and retrieval skills. It has been trained to proficiently execute both parallel and sequential function calls, making it a powerful engine for complex business applications. The model is released as version 24.07, and its API name is mistral-large-2407. Weights for the instruct model are also hosted on HuggingFace.
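To make the function-calling capability concrete, here is a minimal sketch in which the model is offered a single tool and decides whether to call it. The tool name get_exchange_rate and its schema are purely illustrative, and the v1-style mistralai client interface is again assumed.

```python
# Sketch of function calling with mistral-large-2407 (hypothetical tool schema).
# Assumes the v1-style `mistralai` client; the tool-call fields mirror the
# OpenAI-style format used by Mistral's chat API.
import json
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_exchange_rate",  # illustrative tool, not part of any real API
            "description": "Return the exchange rate between two currencies.",
            "parameters": {
                "type": "object",
                "properties": {
                    "base": {"type": "string"},
                    "quote": {"type": "string"},
                },
                "required": ["base", "quote"],
            },
        },
    }
]

response = client.chat.complete(
    model="mistral-large-2407",
    messages=[{"role": "user", "content": "How many yen is one euro right now?"}],
    tools=tools,
    tool_choice="auto",  # let the model decide whether to call the tool
)

# If the model chose to call the tool, the call arrives as structured arguments.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```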
Mistral AI is consolidating its offerings on “la Plateforme” around two general-purpose models, Mistral Nemo and Mistral Large, and two specialist models, Codestral and Embed. As older models are progressively deprecated, all Apache models will remain available for deployment and fine-tuning using the SDKs mistral-inference and mistral-finetune. Fine-tuning capabilities are now extended to Mistral Large, Mistral Nemo, and Codestral.
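For readers who want to experiment with the instruct weights hosted on HuggingFace, below is a rough loading sketch that uses the Hugging Face transformers library rather than the mistral-inference SDK named above. The repository id mistralai/Mistral-Large-Instruct-2407 is assumed from the release naming, the weights are gated under the Mistral Research License, and a 123-billion-parameter model requires several high-memory GPUs.

```python
# Rough local-inference sketch via Hugging Face transformers (not the
# mistral-inference SDK). Assumes the gated repo id below and enough GPU
# memory for a 123B-parameter model (sharded across GPUs with device_map="auto").
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mistralai/Mistral-Large-Instruct-2407"  # assumed HuggingFace repo name

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard the weights across available GPUs
)

messages = [{"role": "user", "content": "Write a short Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```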
In conclusion, Mistral AI has expanded its partnerships with leading cloud service providers to bring Mistral Large 2 to a global audience. The collaboration with Google Cloud Platform has been extended to make Mistral AI’s models available on Vertex AI via a Managed API. Mistral AI’s best models are now accessible on Vertex AI, Azure AI Studio, Amazon Bedrock, and IBM watsonx.ai, broadening their availability and impact in the AI landscape.
Check out the Model Card and Details. All credit for this research goes to the researchers of this project.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.