Infermatic/magnum-v4-72b-FP8-Dynamic

Text Generation · Concurrency Cost: 4 · Model Size: 72.7B · Quant: FP8 · Ctx Length: 32k · Published: Oct 21, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Infermatic/magnum-v4-72b-FP8-Dynamic is a 72.7 billion parameter language model, dynamically quantized to FP8 from anthracite-org's magnum-v4-72b. The underlying model is fine-tuned on top of Qwen2.5-72B-Instruct with the goal of replicating the prose quality of the Claude 3 models (Sonnet and Opus). It is optimized for generating high-quality, nuanced text, making it suitable for advanced conversational AI and creative writing applications.

Infermatic/magnum-v4-72b-FP8-Dynamic Overview

This model is a 72.7 billion parameter language model, dynamically quantized to FP8 using AutoFP8, derived from anthracite-org/magnum-v4-72b. The underlying model is fine-tuned on top of Qwen2.5-72B-Instruct with the primary objective of replicating the prose quality of the Claude 3 models, specifically Sonnet and Opus.
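
The exact quantization recipe for this checkpoint is not published here, but a dynamic FP8 pass with the AutoFP8 library typically looks like the following sketch; the output directory name and arguments are illustrative assumptions.

```python
# Sketch of dynamic FP8 quantization with AutoFP8 (github.com/neuralmagic/AutoFP8).
# The actual recipe for this checkpoint is not documented here; the output
# path and arguments below are assumptions for illustration.
from auto_fp8 import AutoFP8ForCausalLM, BaseQuantizeConfig

pretrained_model_dir = "anthracite-org/magnum-v4-72b"  # FP16 source weights
quantized_model_dir = "magnum-v4-72b-FP8-Dynamic"      # assumed output directory

# "dynamic" computes activation scales at runtime, so no calibration data
# is required (unlike the "static" scheme).
quantize_config = BaseQuantizeConfig(
    quant_method="fp8",
    activation_scheme="dynamic",
)

model = AutoFP8ForCausalLM.from_pretrained(pretrained_model_dir, quantize_config)
model.quantize([])  # empty calibration list: the dynamic scheme needs no samples
model.save_quantized(quantized_model_dir)
```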

Key Capabilities & Features

  • Claude 3 Prose Quality: Specifically designed and fine-tuned to emulate the high-quality, nuanced prose style of Claude 3 Sonnet and Opus.
  • Dynamic FP8 Quantization: Utilizes dynamic FP8 quantization for efficient inference while maintaining performance.
  • Base Model: Built upon the robust Qwen2.5-72B-Instruct architecture.
  • Extensive Training Data: Fine-tuned using a diverse set of datasets, including anthracite-org/c2_logs_32k_llama3_qwen2_v1.2, anthracite-org/kalo-opus-instruct-22k-no-refusal, and others, focusing on conversational and instructional data.
  • ChatML Prompting: Supports the ChatML format for structured conversations, including system, user, and assistant roles (see the example after this list).
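
For reference, a ChatML prompt can be built with the Hugging Face transformers chat template, assuming the model repository ships one (standard for Qwen2.5 derivatives); the system prompt below is only an example.

```python
# Building a ChatML prompt via the tokenizer's chat template.
# The system prompt is an illustrative placeholder, not a recommended default.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Infermatic/magnum-v4-72b-FP8-Dynamic")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write the opening paragraph of a mystery novel."},
]

# add_generation_prompt appends the "<|im_start|>assistant" header so the
# model continues in the assistant role.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
# Expected shape of the rendered prompt (ChatML):
# <|im_start|>system
# You are a helpful assistant.<|im_end|>
# <|im_start|>user
# Write the opening paragraph of a mystery novel.<|im_end|>
# <|im_start|>assistant
```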

Ideal Use Cases

  • Advanced Conversational AI: Excellent for chatbots and virtual assistants requiring sophisticated and human-like dialogue generation.
  • Creative Writing & Roleplay: Well-suited for applications demanding high-quality prose, storytelling, and character-driven interactions.
  • Prose Generation: Any task where generating text with a refined and nuanced style is critical.
  • Resource-Efficient Deployment: The FP8 quantization makes it a strong candidate for deployment scenarios where memory and computational efficiency are important for a 72B model (a serving sketch follows this list).
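
As a sketch of such a deployment, an FP8-aware engine such as vLLM can typically load the FP8-Dynamic checkpoint directly; the tensor-parallel degree and sampling settings below are assumptions to adapt to your hardware.

```python
# Minimal vLLM serving sketch for the FP8-Dynamic checkpoint.
# tensor_parallel_size and sampling settings are assumptions; size them to
# your GPUs (a 72B FP8 model needs roughly 72+ GB for weights alone).
from vllm import LLM, SamplingParams

llm = LLM(
    model="Infermatic/magnum-v4-72b-FP8-Dynamic",
    tensor_parallel_size=2,  # e.g. 2x 80 GB GPUs; adjust for your setup
    max_model_len=32768,     # matches the 32k context length listed above
)

params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=512)

# LLM.chat applies the model's ChatML template before generating.
outputs = llm.chat(
    [{"role": "user", "content": "Describe a rainy city street at dusk."}],
    params,
)
print(outputs[0].outputs[0].text)
```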