Magnum-v1-72b: Claude 3 Prose Replication

Magnum-v1-72b, developed by Anthracite, is a 72.7 billion parameter model built upon the Qwen-2 72B Instruct architecture. Its primary objective is to emulate the high prose quality observed in Claude 3 models, specifically Sonnet and Opus.

Key Capabilities

Advanced Prose Generation: Fine-tuned with 55 million tokens of high-quality roleplay (RP) data, the model is optimized for generating nuanced and sophisticated text.
Claude 3 Style Emulation: Designed to replicate the distinctive prose characteristics of Claude 3 models.
Instruction-Tuned: Utilizes ChatML formatting for instruction-tuned interactions, ensuring responsive and contextually appropriate outputs.

Training Details

The model underwent full-parameter fine-tuning over 1.5 epochs, leveraging 8x AMD Instinct\u2122 MI300X Accelerators. The training process focused on high-quality RP datasets to enhance its creative and conversational writing abilities.

Performance Metrics

Evaluations on the Open LLM Leaderboard show an average score of 42.21, with notable performance in IFEval (76.06) and BBH (57.65). Detailed results are available on the Open LLM Leaderboard.

Ideal Use Cases

Creative Writing: Generating stories, dialogues, and descriptive passages.
Roleplay Scenarios: Creating immersive and detailed character interactions.
Sophisticated Conversational AI: Applications requiring human-like and high-quality textual responses.

Overview

Magnum-v1-72b: Claude 3 Prose Replication

Key Capabilities

Training Details

Performance Metrics

Ideal Use Cases

Full Model Card (README)