anthracite-org/magnum-v3-27b-kto

27B parameters · FP8 · 32,768-token context · Released Sep 6, 2024 · License: gemma

Model Overview

anthracite-org/magnum-v3-27b-kto is a 27-billion-parameter language model from Anthracite, built on IntervitensInc/gemma-2-27b-chatml (a ChatML-tokenized variant of Gemma 2 27B). It is the 12th release in a series aimed at emulating the prose quality of the Claude 3 models, Sonnet and Opus.
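
For orientation, the model loads like any other Hugging Face causal LM. The following is a minimal sketch using the transformers library (not part of the original card); it assumes a recent transformers release with Gemma 2 support and enough GPU memory for the 27B weights.

    # Minimal loading sketch; the model id is the real repository name.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "anthracite-org/magnum-v3-27b-kto"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,   # full weights; quantize to fit smaller GPUs
        device_map="auto",            # shard across available devices
    )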

Key Methodologies

This model is the result of an initial Supervised Fine-Tuning (SFT) run followed by multiple KTO (Kahneman-Tversky Optimization) re-runs (the objective is sketched below); various SFT/KTO ratios and merge techniques were tried before settling on the released weights. The base SFT model, R1, was fine-tuned on a diverse set of datasets:

  • anthracite-org/stheno-filtered-v1.1
  • anthracite-org/kalo-opus-instruct-22k-no-refusal
  • anthracite-org/nopm_claude_writing_fixed
  • Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned
  • Epiculous/SynthRP-Gens-v1.1-Filtered-n-Cleaned
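
Unlike DPO, KTO trains on unpaired examples labeled only as desirable or undesirable, scoring each with a Kahneman-Tversky-style value function around a reference point. The sketch below is not Anthracite's training code; it is a schematic PyTorch rendering of the per-example loss from the KTO paper (Ethayarajh et al., 2024), with the reference point z0 approximated by the batch-mean reward rather than the paper's mismatched-pair KL estimate.

    import torch

    def kto_loss(policy_logps, ref_logps, is_desirable,
                 beta=0.1, lambda_d=1.0, lambda_u=1.0):
        """Schematic batch KTO loss.

        policy_logps / ref_logps: summed log-probs of each completion under
        the policy and the frozen reference model (1-D tensors).
        is_desirable: bool tensor giving each example's label.
        """
        # Implicit reward: log-ratio between policy and reference.
        rewards = policy_logps - ref_logps
        # Reference point z0 (simplified: the paper estimates a KL over
        # unrelated prompt/completion pairs, clamped at zero).
        z0 = rewards.mean().clamp(min=0).detach()
        # Kahneman-Tversky value: gains saturate, losses are penalized.
        sig = torch.where(is_desirable,
                          torch.sigmoid(beta * (rewards - z0)),
                          torch.sigmoid(beta * (z0 - rewards)))
        lam = torch.where(is_desirable,
                          torch.full_like(rewards, lambda_d),
                          torch.full_like(rewards, lambda_u))
        # Minimizing lambda * (1 - value) maximizes the expected value.
        return (lam * (1.0 - sig)).mean()

In practice this recipe does not have to be hand-rolled; libraries such as TRL ship a ready-made KTOTrainer that consumes prompt/completion/label triples.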

Capabilities and Use Cases

  • Prose Generation: Specifically optimized to produce high-quality prose, mirroring the style and coherence found in Claude 3 models.
  • Instruction Following: Instruction-tuned with ChatML formatting, making it suitable for conversational AI and interactive applications (see the prompt sketch after this list).
  • Roleplay and Creative Writing: Includes specific templates for platforms like SillyTavern, indicating its suitability for detailed character-driven interactions and creative narrative generation.
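
Because the model is ChatML-tuned, each turn should be wrapped in <|im_start|>/<|im_end|> tokens. A minimal prompt-building sketch (the system and user text here are illustrative, not from the card):

    # Hand-built ChatML prompt; tokenizer.apply_chat_template gives the
    # same layout when the repository ships a chat template.
    def chatml_prompt(system: str, user: str) -> str:
        return (
            f"<|im_start|>system\n{system}<|im_end|>\n"
            f"<|im_start|>user\n{user}<|im_end|>\n"
            f"<|im_start|>assistant\n"   # generation continues from here
        )

    prompt = chatml_prompt(
        "You are a vivid, careful prose writer.",            # hypothetical system prompt
        "Describe a lighthouse at dusk in two paragraphs.",  # hypothetical user turn
    )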

Training Details

The model underwent two epochs of full-parameter fine-tuning on 8x NVIDIA H100 GPUs, with compute sponsored by Recursal AI / Featherless AI. Training used the Axolotl framework.

Performance Metrics

On the Open LLM Leaderboard, the model achieved an average score of 28.90. Notable scores include 56.75 on IFEval (0-shot) and 35.98 on MMLU-PRO (5-shot).