decem/Dionysus-Mistral-m3-v5

TEXT GENERATION

  • Concurrency Cost: 1
  • Model Size: 7B
  • Quant: FP8
  • Ctx Length: 8k
  • Published: Dec 30, 2023
  • License: cc-by-4.0
  • Architecture: Transformer
  • Availability: Open Weights (Cold)

The decem/Dionysus-Mistral-m3-v5 is a 7 billion parameter language model developed by DECEM, fine-tuned using Supervised Fine-Tuning (SFT) on the Mistral architecture. This English-language model is designed for general language tasks, achieving an average score of 63.14 on the Open LLM Leaderboard, with notable performance in reasoning and common sense benchmarks. It is suitable for applications requiring robust language understanding and generation within an 8192 token context window.


Model Overview

The decem/Dionysus-Mistral-m3-v5 is a 7 billion parameter language model developed by DECEM. It has been fine-tuned using Supervised Fine-Tuning (SFT) based on the Mistral architecture, primarily for English language tasks. The model operates with an 8192 token context length, making it suitable for processing moderately long inputs and generating coherent responses.

Key Capabilities & Performance

Evaluated on the Open LLM Leaderboard, Dionysus-Mistral-m3-v5 demonstrates a balanced performance across various benchmarks, achieving an overall average score of 63.14.

  • Reasoning: Scores 59.56 on AI2 Reasoning Challenge (25-Shot) and 51.02 on GSM8k (5-shot).
  • Common Sense: Achieves 80.99 on HellaSwag (10-Shot) and 75.14 on Winogrande (5-shot).
  • General Knowledge: Scores 61.18 on MMLU (5-Shot).
  • Truthfulness: Records 50.93 on TruthfulQA (0-shot).
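The reported 63.14 average is simply the arithmetic mean of the six benchmark scores above, which can be checked directly:

```python
# Open LLM Leaderboard scores listed above for Dionysus-Mistral-m3-v5.
scores = {
    "ARC (25-shot)": 59.56,
    "HellaSwag (10-shot)": 80.99,
    "MMLU (5-shot)": 61.18,
    "TruthfulQA (0-shot)": 50.93,
    "Winogrande (5-shot)": 75.14,
    "GSM8k (5-shot)": 51.02,
}

# Mean of the six benchmarks, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 63.14
```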

Prompting

This model is designed to be prompted using an Alpaca-style instruction format:

### Instruction:
<prompt>

### Response:
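A minimal helper for filling this template might look like the following; the function name and example instruction are illustrative, not part of the model card:

```python
# Alpaca-style template expected by this model, per the model card.
ALPACA_TEMPLATE = """### Instruction:
{instruction}

### Response:
"""

def build_prompt(instruction: str) -> str:
    # Insert the user's instruction into the template; the model's
    # completion is expected to follow the "### Response:" header.
    return ALPACA_TEMPLATE.format(instruction=instruction)

print(build_prompt("Summarize the plot of Hamlet in two sentences."))
```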

Good For

  • General-purpose text generation and understanding.
  • Applications requiring reasoning and common sense capabilities.
  • Tasks benefiting from a 7B parameter model with an 8K context window.

Popular Sampler Settings

Featherless users commonly tune the following sampler parameters for this model:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
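As a sketch, these parameters would typically be passed in an OpenAI-style completions payload. The values below are illustrative defaults, not the actual Featherless user presets (which are not listed here):

```python
# Example request payload; all sampler values are hypothetical.
payload = {
    "model": "decem/Dionysus-Mistral-m3-v5",
    "prompt": "### Instruction:\nWrite a haiku about wine.\n\n### Response:\n",
    "max_tokens": 256,
    # Sampler parameters from the list above (illustrative values):
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}
print(sorted(payload))
```

Lower `temperature` and `top_p` values make output more deterministic; `repetition_penalty` and `min_p` help curb loops and low-probability tokens in longer generations.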