olaverse/MIST-Mini-8B
MIST-Mini-8B, developed by olaverse, is the smallest and fastest model in the MIST family. It is built by merging four specialized Llama 3.1 8B models with the DARE+TIES technique. The 8-billion-parameter model is optimized for speed, reaching approximately 63 tokens/second on an H200, while offering strong reasoning, clean code generation, and accurate mathematical problem-solving. It needs roughly 15GB of VRAM at bfloat16 precision, which makes it suitable for real-time applications and consumer-grade GPUs.
MIST-Mini-8B: Fast and Capable 8B Model
MIST-Mini-8B, now known as MIST-1-8B, is the most compact and fastest model in olaverse's MIST family. It is built by merging four specialized Llama 3.1 8B models with the DARE+TIES technique, aiming for a balance of performance and speed.
Key Capabilities
- Exceptional Speed: Achieves approximately 63 tokens/second on an H200, making it highly suitable for real-time applications.
- Strong Reasoning: Benefits from DeepSeek R1 distillation, contributing to robust reasoning abilities.
- Code Generation: Produces clean, production-ready code with comments.
- Mathematical Proficiency: Capable of accurate, step-by-step mathematical problem-solving.
- Helpful & Lightweight: Exhibits a low refusal rate and needs only about 15GB of VRAM in bfloat16, so it runs on consumer GPUs such as the RTX 3090/4090.
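As a rough sanity check on the 15GB figure, weight memory can be estimated as parameter count times bytes per parameter. The sketch below assumes exactly 8 billion parameters and counts weights only; activations, the KV cache, and framework overhead come on top, so real usage will be somewhat higher. The function name is illustrative and not part of any library.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Estimate memory for model weights alone, in GiB (2**30 bytes)."""
    total_bytes = n_params * bits_per_param / 8
    return total_bytes / 2**30


N_PARAMS = 8e9  # assumption: a round 8 billion parameters

bf16_gib = weight_memory_gib(N_PARAMS, 16)  # bfloat16 stores 16 bits per weight
int4_gib = weight_memory_gib(N_PARAMS, 4)   # 4-bit quantization, ignoring scale metadata

print(f"bfloat16 weights: {bf16_gib:.1f} GiB")
print(f"4-bit weights:    {int4_gib:.1f} GiB")
```

Under these assumptions the bfloat16 weights come to just under 15 GiB, which is in line with the figure quoted in the list above.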
Hardware Requirements
The model runs in bfloat16 precision with 16GB of VRAM (e.g., RTX 3090/4090), which leaves a little headroom over the roughly 15GB figure quoted above, or with 4-bit quantization in 6GB of VRAM (e.g., RTX 3060 and above).
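The two configurations above could be selected and loaded roughly as follows. This is a minimal sketch, not the author's documented procedure. It assumes the weights are published under the repo id from the page title and load through Hugging Face `transformers`, and that 4-bit loading uses `bitsandbytes`; the source names neither the loading library nor the quantization method. The helper names `pick_precision` and `load_model` are hypothetical.

```python
MODEL_ID = "olaverse/MIST-Mini-8B"  # assumption: repo id from the page title; it may have been renamed


def pick_precision(free_vram_gb: float) -> str:
    """Map available VRAM to a loading mode using the thresholds quoted above."""
    if free_vram_gb >= 16:
        return "bf16"
    if free_vram_gb >= 6:
        return "4bit"
    raise ValueError("The quoted requirements need at least 6GB of VRAM")


def load_model(mode: str):
    """Load tokenizer and model in the requested mode ('bf16' or '4bit')."""
    # Heavy dependencies are imported here so the helper above stays usable without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    kwargs = {"device_map": "auto"}  # let accelerate place layers on the available GPU(s)
    if mode == "4bit":
        # Assumption: bitsandbytes NF4 quantization; the source does not name a method.
        kwargs["quantization_config"] = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.bfloat16,
        )
    elif mode == "bf16":
        kwargs["torch_dtype"] = torch.bfloat16
    else:
        raise ValueError(f"Unknown mode: {mode!r}")

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **kwargs)
    return tokenizer, model
```

On a 24GB card, `tokenizer, model = load_model(pick_precision(24))` would take the bfloat16 path; on an 8GB card, `pick_precision(8)` selects the 4-bit path instead.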