Name: olaverse/MIST-1-70B API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: olaverse

MIST-1-70B Overview

MIST-1-70B is a 70-billion-parameter model developed by olaverse as part of the MIST model family. It is built by merging four high-performing Llama 3.1 70B models with the DARE+TIES method: DARE prunes redundant weight changes, and TIES resolves sign conflicts between the models, so that their strongest capabilities can be combined. The resulting model is optimized for structured, detailed, production-ready output.
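To make the merge idea concrete, here is a minimal toy sketch of a DARE+TIES-style merge on flat lists of numbers. It is not olaverse's actual recipe: the listing does not state the drop rate, per-model weights, or tooling, so the function names, the `drop_prob` default, and the seed are all illustrative assumptions.

```python
import random


def dare_prune(delta, drop_prob, rng):
    """DARE-style step: randomly zero entries, rescale survivors by 1/(1-p)."""
    scale = 1.0 / (1.0 - drop_prob)
    return [0.0 if rng.random() < drop_prob else d * scale for d in delta]


def ties_merge(deltas):
    """TIES-style step: elect a sign per parameter, average agreeing entries."""
    merged = []
    for values in zip(*deltas):
        # The elected sign is the sign of the summed deltas for this parameter.
        sign = 1.0 if sum(values) >= 0 else -1.0
        # Keep only nonzero entries whose sign matches the elected one.
        agreeing = [v for v in values if v * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


def dare_ties(base, finetuned_models, drop_prob=0.5, seed=0):
    """Merge fine-tuned weight lists into one, relative to a shared base."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    # Each delta is the change a fine-tune made relative to the base weights.
    deltas = [[w - b for w, b in zip(model, base)] for model in finetuned_models]
    pruned = [dare_prune(d, drop_prob, rng) for d in deltas]
    merged_delta = ties_merge(pruned)
    return [b + d for b, d in zip(base, merged_delta)]
```

With `drop_prob=0.0` nothing is dropped, so only the sign-election step is visible: for a parameter whose two deltas are -2.0 and 1.0, the elected sign is negative and the merged delta is -2.0 rather than the plain average.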

Key Capabilities

Strong Reasoning: Achieved through DeepSeek R1 distillation at the 70B scale.
High Helpfulness: Built upon Nemotron, which ranks highly in helpfulness benchmarks.
Coding Proficiency: Generates clean, documented, and production-ready code.
Mathematical Problem Solving: Provides step-by-step, structured solutions with verification.
Multilingual Support: Handles 8+ languages.
Long Context Window: Supports a 128K-token context window.
Unrestricted Responses: Designed to follow instructions without excessive refusals.

Usage and Hardware

The model can be run in bfloat16 precision (requiring 140 GB of VRAM) or 4-bit quantized precision (requiring 40 GB of VRAM). Users are strongly advised to format prompts with the tokenizer's apply_chat_template function. The model's tokenizer follows the Llama 3.1 format despite the model's mixed training heritage, so prompts built by hand in another format can cause incorrect behavior, such as <|im_end|> token leakage.
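The prompt-formatting advice can be sketched as follows, assuming the Hugging Face transformers library is used. The helper names `build_prompt` and `load_tokenizer` are hypothetical, and the repository id "olaverse/MIST-1-70B" is inferred from the listing name rather than confirmed.

```python
def build_prompt(tokenizer, user_message, system_message=None):
    """Format a conversation with the tokenizer's own chat template."""
    messages = []
    if system_message:
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": user_message})
    # tokenize=False returns the formatted prompt string instead of token ids;
    # add_generation_prompt=True appends the header that cues the reply.
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )


def load_tokenizer(repo_id="olaverse/MIST-1-70B"):
    """Load the tokenizer; the default repo id is an assumption."""
    # Imported here so build_prompt stays usable without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)
```

A typical call would be `build_prompt(load_tokenizer(), "Summarize this log file.")`. Because the template comes from the tokenizer itself, the special tokens match what the model expects, which is the point of the advice above.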

Full Model Card (README)