Name: Michael-Kozu/Deimos-A4 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Michael-Kozu

Overview

Michael-Kozu's Deimos-A4 is a 4.66 billion parameter language model, fine-tuned from Qwen3.5-4B, specifically designed for complex reasoning and mathematical problem-solving. Its core innovation lies in an internal "terse, concise" chain-of-thought process, where the model generates compact reasoning within <think>...</think> blocks before producing a polished, user-facing response. This approach leads to significant efficiency gains, with approximately 60% fewer tokens and 36% faster inference compared to the base Qwen3.5-4B model, while also boosting accuracy on challenging math problems by over 40 points.

Key Capabilities

Optimized for Hard Math: Achieves substantial accuracy improvements on benchmarks like AIME, MATH-hard, and minerva_math500, making it suitable for multi-step proofs and algebraic chains.
Efficient Reasoning: Employs a unique internal concise reasoning mechanism, reducing token usage and accelerating inference speed.
Configurable Thinking Mode: Features an enable_thinking toggle in its chat template, allowing users to control the generation of internal reasoning traces based on task requirements.

Use Cases

Hard Math & Reasoning: Ideal for problems requiring deep, multi-step logical deduction, such as advanced algebra, geometry, and number theory.
Code Generation: While not explicitly detailed, the training pipeline includes code prompts, suggesting potential for code-related reasoning tasks.

Limitations

Specialist Model: Deimos-A4 is a specialist and not a general-purpose replacement for its base model. It shows some knowledge regression on general knowledge (MMLU) and strict instruction-following tasks (IFEval) where its concise reasoning style can conflict with rigid formatting rules.
Internal Reasoning: The compact reasoning fragments are intended to remain internal; stripping the chat template or forcing generation outside the <think> block may lead to suboptimal results.

Training Details

Deimos-A4 was trained via length-biased rejection sampling on 4,338 verified "shortest-correct" traces, curated from its predecessor, Deimos-A1. This process taught the model to eliminate filler while maintaining logical structure. The model is released under the Apache 2.0 License.

Overview

Overview

Key Capabilities

Use Cases

Limitations

Training Details

Full Model Card (README)