Name: olaverse/MIST-Mini-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: olaverse

MIST-1-8B: A Fast and Capable Llama 3.1 Blend

MIST-1-8B, formerly MIST-Mini, is an 8 billion parameter model from the MIST model family by olaverse. It merges four specialized Llama 3.1 8B models using the DARE+TIES method, aiming to balance strong performance with high speed. It is the fastest model in its family, averaging 63 tokens/second on an H200 GPU, which makes it well suited to real-time applications.
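
For readers unfamiliar with DARE+TIES, the idea can be sketched on flat weight lists. This is a simplified illustration, not the recipe used for MIST-1-8B: the source does not give the drop rate, per-model weights, or tooling, so `drop_prob`, `seed`, and all function names here are assumptions. DARE randomly drops entries of each model's difference from the base and rescales the survivors; TIES then elects a sign per parameter and averages only the entries that agree with it.

```python
import random

def dare(delta, drop_prob, rng):
    """DARE step: drop each entry with probability drop_prob, rescale survivors."""
    scale = 1.0 / (1.0 - drop_prob)  # assumes drop_prob < 1
    return [d * scale if rng.random() >= drop_prob else 0.0 for d in delta]

def ties_merge(deltas):
    """TIES step: elect a sign per parameter, then average the agreeing entries."""
    merged = []
    for column in zip(*deltas):
        total = sum(column)
        sign = (total > 0) - (total < 0)  # elected sign: +1, -1, or 0
        agreeing = [d for d in column
                    if d != 0 and sign != 0 and (d > 0) == (sign > 0)]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged

def dare_ties(base, finetuned, drop_prob=0.5, seed=0):
    """Merge several fine-tuned weight vectors back onto a shared base."""
    rng = random.Random(seed)
    deltas = [dare([w - b for w, b in zip(model, base)], drop_prob, rng)
              for model in finetuned]
    return [b + m for b, m in zip(base, ties_merge(deltas))]

# Hypothetical toy weights: two "fine-tuned" models over a zero base.
base = [0.0, 0.0, 0.0]
models = [[1.0, -1.0, 2.0], [3.0, 1.0, -2.0]]
merged = dare_ties(base, models, drop_prob=0.0)
```

With `drop_prob=0.0` nothing is dropped, so only the sign election matters: the first parameter averages to 2.0, while the two conflicting parameters cancel to 0.0.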

Key Strengths

Exceptional Speed: Delivers ~63 tok/s on an H200 GPU, ideal for applications requiring rapid responses.
Strong Reasoning: Benefits from DeepSeek R1 distillation, enhancing its logical processing capabilities.
Clean Code Generation: Produces production-ready code with comments.
Accurate Math: Excels in step-by-step mathematical problem-solving.
Helpful & Low Refusal: Designed to be highly cooperative with a low refusal rate.
Lightweight: Requires about 15GB VRAM in bfloat16 or 6GB at 4-bit, so the 4-bit version fits on consumer-grade GPUs such as the RTX 3060 and above.
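
The VRAM figures can be sanity-checked with a back-of-the-envelope estimate of weight storage alone. The sketch below assumes exactly 8 billion parameters and ignores the KV cache, activations, and quantization metadata, which is likely why the quoted 4-bit figure is higher than the raw weight size. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Memory needed for the weights alone, in GiB (2**30 bytes)."""
    return num_params * bits_per_param / 8 / 2**30

NUM_PARAMS = 8e9  # assumption: "8 billion" taken literally

bf16_gib = weight_memory_gib(NUM_PARAMS, 16)  # roughly 14.9 GiB
int4_gib = weight_memory_gib(NUM_PARAMS, 4)   # roughly 3.7 GiB

print(f"bfloat16 weights: {bf16_gib:.1f} GiB")
print(f"4-bit weights:    {int4_gib:.1f} GiB")
```

The bfloat16 estimate lines up with the 15GB quoted above; the 4-bit estimate leaves a couple of gigabytes of headroom under the quoted 6GB for runtime overhead.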

Use Cases

MIST-1-8B is well-suited for everyday use cases where speed and accuracy are critical, such as real-time conversational AI, code assistance, and educational tools requiring precise mathematical solutions. Its lightweight nature also makes it a strong candidate for deployment on more accessible hardware.
