allura-org/Mistral-Small-24b-Sertraline-0304

24B parameters · FP8 · 32768-token context · License: apache-2.0

Model Overview

allura-org/Mistral-Small-24b-Sertraline-0304 is a 24-billion-parameter instruction-tuned model built on the Mistral Small 3 architecture. It is intended as a capable, reliable option for instruction-following tasks, billed by its authors as an "actually decent" SFT (supervised fine-tuning) of the base model.

Key Capabilities

  • Instruction Following: Fine-tuned with the v7-Tekken instruct template used by the original Mistral instruct models, so prompts should follow that format for best adherence (see the loading sketch after this list).
  • Context Length: Supports a 32768-token context window, allowing the model to process long inputs and maintain conversational coherence.
  • Reasoning Support: Tested with Claude-like system prompts, with a specific recommendation to strengthen reasoning by forcing the model's thought process into <think> tags (a prefill sketch follows the loading example below).
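
As a minimal sketch of basic usage, the snippet below loads the model with Hugging Face transformers and relies on the tokenizer's built-in chat template to render the v7-Tekken format. The dtype, device mapping, and sampling values are illustrative assumptions, not settings confirmed by the model card.

```python
# Minimal sketch: load the model and chat with it via transformers.
# Assumes the repo ships a chat template in its tokenizer config
# (standard for Mistral-derived instruct models); dtype and sampling
# values below are illustrative, not recommendations from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allura-org/Mistral-Small-24b-Sertraline-0304"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits your hardware
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this model's key capabilities."},
]

# apply_chat_template renders the messages into the model's expected
# instruct format (v7-Tekken for Mistral Small 3 derivatives).
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```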
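
To force the reasoning behavior described above, one common approach is to prefill the assistant turn with an opening <think> tag so the model continues inside it. The sketch below assumes the model and tokenizer loaded in the previous example; the system prompt wording is an assumption, not the exact prompt the authors tested.

```python
# Sketch: force reasoning inside <think> tags by prefilling the
# assistant turn. Builds on the model/tokenizer loaded above.
reasoning_messages = [
    {
        "role": "system",
        # Assumption: a Claude-like system prompt; the card does not
        # publish the exact wording used in testing.
        "content": (
            "You are a thoughtful assistant. Before answering, reason "
            "step by step inside <think>...</think> tags, then give "
            "your final answer."
        ),
    },
    {"role": "user", "content": "A train leaves at 3:40 PM and the trip takes 95 minutes. When does it arrive?"},
]

prompt = tokenizer.apply_chat_template(
    reasoning_messages, add_generation_prompt=True, tokenize=False
)
prompt += "<think>\n"  # prefill: the model continues its reasoning here

# add_special_tokens=False avoids inserting a second BOS token, since
# the chat template already includes one.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```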

Training Details

This model was trained on the allura-org/inkstructmix-v0.2.1 dataset via supervised fine-tuning (SFT), which is the source of its instruction-following proficiency. A short sketch for inspecting the public dataset follows.
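
Since the training data is public, it can be inspected directly. The split name below is an assumption about the dataset layout, which the card does not document; the schema is printed rather than hard-coded for the same reason.

```python
# Sketch: inspect the public SFT dataset used for training.
# Assumption: the dataset exposes a "train" split; field names are
# not documented in the model card, so print them instead of
# hard-coding any.
from datasets import load_dataset

ds = load_dataset("allura-org/inkstructmix-v0.2.1", split="train")
print(ds.column_names)  # discover the actual schema
print(ds[0])            # peek at one training example
```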

Recommended Use Cases

This model is suited to general-purpose AI assistant applications that require reliable instruction following and solid baseline performance. Its support for structured reasoning prompts also makes it a candidate for tasks that benefit from explicit step-by-step thinking.