Name: mistralai/Mistral-Nemo-Instruct-2407 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: mistralai

Mistral-Nemo-Instruct-2407: A Powerful Instruction-Tuned LLM

Mistral-Nemo-Instruct-2407 is a 12 billion parameter instruction-tuned large language model, a collaborative effort between Mistral AI and NVIDIA. It is built upon the Mistral-Nemo-Base-2407 and is designed to offer strong performance, often outperforming models of similar or smaller scale.

Key Capabilities & Features

Architecture: Transformer model with 40 layers, 5,120 hidden dimensions, and Grouped-Query Attention (GQA) with 8 KV-heads.
Context Window: Features a substantial 128k context window, enabling processing of longer inputs and maintaining conversational coherence.
Multilingual & Code Data: Trained on a significant proportion of multilingual and code data, enhancing its versatility across different languages and programming tasks.
Instruction Following: Specifically fine-tuned for instruction following, making it highly effective for chat and command-based interactions.
Benchmarks: Achieves competitive scores on various benchmarks, including 68.0% on MMLU (5-shot), 83.5% on HellaSwag (0-shot), and strong multilingual MMLU scores (e.g., 62.3% French, 62.7% German).
Tool Use/Function Calling: Supports advanced function calling capabilities, allowing integration with external tools and APIs.
Licensing: Released under the permissive Apache 2 License.

Good For

General-purpose conversational AI: Its instruction-tuned nature makes it suitable for chatbots and interactive applications.
Multilingual applications: Strong performance on multilingual benchmarks indicates its utility for global use cases.
Code-related tasks: Training on code data suggests proficiency in code generation, understanding, and related functions.
Developers seeking a powerful, open-source alternative: Positioned as a drop-in replacement for models like Mistral 7B, offering enhanced capabilities.

Overview

Mistral-Nemo-Instruct-2407: A Powerful Instruction-Tuned LLM

Key Capabilities & Features

Good For

Full Model Card (README)