SillyTilly/mistralai_Mistral-Nemo-Instruct-2407

Text Generation · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Context Length: 32k · Published: Jul 18, 2024 · License: apache-2.0 · Architecture: Transformer

Mistral-Nemo-Instruct-2407 is an instruct fine-tuned large language model developed jointly by Mistral AI and NVIDIA, fine-tuned from Mistral-Nemo-Base-2407. It features a 128k context window and was trained on a large proportion of multilingual and code data. The model is designed as a drop-in replacement for Mistral 7B, offering strong performance across benchmarks including MMLU (68.0%) and HellaSwag (83.5%). Its primary strengths are instruction following and multilingual support.


Mistral-Nemo-Instruct-2407: An Overview

Mistral-Nemo-Instruct-2407 is an instruct fine-tuned Large Language Model (LLM) developed collaboratively by Mistral AI and NVIDIA. It is based on Mistral-Nemo-Base-2407 and is released under the Apache 2.0 license. The model is notable for robust performance, often outperforming other models of similar or smaller scale.

Key Capabilities & Features

  • Extensive Context Window: Trained with a substantial 128k context window, allowing for processing longer inputs and maintaining coherence over extended conversations.
  • Multilingual & Code Proficiency: Benefits from training on a significant proportion of multilingual and code data, enhancing its versatility across different languages and programming tasks.
  • Strong Benchmark Performance: Achieves competitive scores on various benchmarks, including 68.0% on MMLU (5-shot), 83.5% on HellaSwag (0-shot), and 76.8% on Winogrande (0-shot). It also demonstrates solid multilingual MMLU scores across French, German, Spanish, and other languages.
  • Architectural Details: Features a transformer architecture with 40 layers, a hidden dimension of 5,120, and a vocabulary size of approximately 128k, utilizing Grouped Query Attention (GQA) with 8 KV-heads.
  • Instruction Following & Function Calling: Designed for effective instruction following and supports function calling, making it suitable for interactive applications and tool integration.
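The GQA figures above make serving memory easy to estimate: fewer KV-heads mean a smaller KV cache. A minimal back-of-the-envelope sketch, assuming a head dimension of 128 (not stated on this card) and 1 byte per element for FP8:

```python
def kv_cache_bytes(seq_len, n_layers=40, n_kv_heads=8, head_dim=128, bytes_per_elem=1):
    """Estimate the KV-cache size of a GQA transformer.

    n_layers and n_kv_heads come from the architectural details above;
    head_dim=128 and the 1-byte FP8 element size are assumptions made
    for illustration. The factor of 2 covers the separate key and value
    caches stored per layer.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# A full 128k-token context at FP8 works out to about 10 GiB of KV cache
print(kv_cache_bytes(128 * 1024) / 2**30)
```

This is why GQA matters at long context: with the full 32 query heads also caching K/V, the same estimate would be four times larger.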

When to Use This Model

  • Instruction-tuned applications: Ideal for tasks requiring precise instruction following.
  • Multilingual use cases: Strong performance across multiple languages makes it suitable for global applications.
  • Code-related tasks: Training on a significant share of code data gives it solid code generation and comprehension capabilities.
  • As a Mistral 7B replacement: Positioned as a direct, enhanced replacement for Mistral 7B, offering improved performance and features.
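For instruction-tuned use, prompts follow the Mistral `[INST]` chat format. The sketch below is an approximation of that format for illustration only; in practice the authoritative template ships with the model's tokenizer and should be applied via `tokenizer.apply_chat_template` (exact spacing and special tokens may differ):

```python
def build_prompt(turns):
    """Approximate the Mistral [INST] chat format for a list of
    (user, assistant) turns, where the final turn's assistant reply is
    None because the model is expected to generate it.

    Illustrative only: real code should use apply_chat_template from
    the transformers tokenizer rather than hand-rolling the template.
    """
    prompt = "<s>"
    for user, assistant in turns:
        prompt += f"[INST]{user}[/INST]"
        if assistant is not None:
            # Completed assistant turns are closed with the EOS token
            prompt += f"{assistant}</s>"
    return prompt

# A single-turn prompt awaiting the model's reply
print(build_prompt([("Who are you?", None)]))
```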

Popular Sampler Settings

Featherless users most often tune the following sampler parameters for this model:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
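As a rough illustration of what these knobs do, here is a minimal plain-Python sketch of how top-k, top-p, and min-p filtering restrict the candidate token set before sampling, with temperature scaling the logits first. The filtering order shown (k, then p, then min-p) is one common convention, and this is an illustrative reimplementation, not Featherless's actual sampling code; the penalty parameters, which down-weight already-generated tokens, are omitted for brevity.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def filter_candidates(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Return the token ids that survive top-k / top-p / min-p filtering,
    sorted from most to least likely."""
    scaled = [x / temperature for x in logits]
    probs = softmax(scaled)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top-k: keep only the k most likely tokens (0 disables the filter)
    if top_k > 0:
        order = order[:top_k]

    # top-p (nucleus): keep the smallest prefix whose cumulative mass
    # reaches top_p
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # min-p: drop tokens whose probability falls below min_p times the
    # probability of the most likely token
    threshold = min_p * probs[order[0]]
    return [i for i in kept if probs[i] >= threshold]
```

Lower temperature and tighter top_p/top_k make outputs more deterministic; min_p adapts the cutoff to how confident the model is at each step.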