Michael-Kozu/Deimos-A1

Modality: Vision
Concurrency Cost: 1
Model Size: 4.5B
Quant: BF16
Ctx Length: 32k
Tool Calling: Supported
Published: Apr 25, 2026
License: MIT
Architecture: Transformer
Open Weights

Michael-Kozu/Deimos-A1 is a 4.5 billion parameter language model, fine-tuned from Qwen3.5-4B by Michael-Kozu. It specializes in concise chain-of-thought (CCoT) reasoning, producing dense, stepwise reasoning blocks that average ~1/8 the tokens of its base model while improving accuracy on reasoning benchmarks. The shorter traces make it well suited to reasoning tasks where inference speed matters, with full benchmark runs completing roughly 6 times faster than on the base model.


Deimos A1: Concise Chain-of-Thought Reasoning

Deimos A1, developed by Michael-Kozu, is a 4.5 billion parameter model fine-tuned from Qwen3.5-4B. Its main contribution is concise chain-of-thought (CCoT) reasoning: it generates dense, stepwise <think> blocks that are approximately 1/8 the token length of the base model's. Shorter traces mean less time spent generating, and full benchmark runs complete ~6 times faster in wall-clock terms.
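To check reasoning length on your own prompts, it helps to separate the <think> block from the final answer. The following is a minimal sketch, not part of the model's tooling: it assumes the completion wraps its reasoning in a single <think>...</think> pair (the closing tag is an assumption based on the usual Qwen convention), and the helper names `split_reasoning` and `rough_token_count` are hypothetical. The whitespace-based count is only a crude proxy; use the model's own tokenizer for real measurements.

```python
import re

# Assumed format: one <think>...</think> block followed by the final answer.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) for a completion with at most one <think> block."""
    match = THINK_RE.search(text)
    if match is None:
        # No reasoning block found: treat the whole completion as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


def rough_token_count(text: str) -> int:
    """Crude proxy: counts whitespace-separated chunks, not real tokens."""
    return len(text.split())


if __name__ == "__main__":
    sample = "<think>12*3=36; 36+4=40</think>\nThe answer is 40."
    reasoning, answer = split_reasoning(sample)
    print(reasoning)
    print(answer)
    print(rough_token_count(reasoning))
```

Running the same prompts through the base model and comparing the reasoning counts gives a rough sense of the compression on your own workload.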

Key Capabilities & Features

  • Efficient Reasoning: Produces highly compressed reasoning traces, reducing output token count by ~89% per problem compared to its base model.
  • Improved Accuracy: Improves accuracy on reasoning benchmarks despite the much shorter reasoning traces.
  • Optimized Training: Trained on the Michael-Kozu/Quark dataset, a 4,919-row CCoT SFT dataset with compressed reasoning traces.
  • Qwen3.5 Architecture: Leverages the Qwen3.5-4B base model's architecture, including Gated DeltaNet and sparse attention.
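
The card quotes the compression two ways: reasoning traces at ~1/8 the base model's length, and an ~89% reduction in output tokens per problem. Both figures are approximate, and they are measured on slightly different things (reasoning blocks versus total output), so they need not match exactly. A quick conversion, sketched below with hypothetical helper names, shows they agree to within rounding.

```python
def reduction_from_ratio(ratio: float) -> float:
    """Fraction of tokens saved when output shrinks to `ratio` of the original."""
    return 1.0 - ratio


def ratio_from_reduction(reduction: float) -> float:
    """Remaining fraction of tokens after a given fractional reduction."""
    return 1.0 - reduction


if __name__ == "__main__":
    # A 1/8 length ratio corresponds to an 87.5% reduction.
    print(f"{reduction_from_ratio(1 / 8):.1%}")
    # An 89% reduction corresponds to roughly a 1/9 length ratio.
    print(f"1/{1 / ratio_from_reduction(0.89):.1f}")
```

The ~6x wall-clock figure is smaller than either token ratio, which is unsurprising: costs that do not depend on output length, such as prompt processing, are not reduced by shorter reasoning.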

When to Use Deimos A1

  • Resource-Constrained Environments: Ideal for applications where token efficiency and inference speed are critical.
  • Reasoning-Intensive Tasks: Suited for tasks requiring structured, step-by-step reasoning where concise outputs are beneficial.
  • English-Only Applications: Currently optimized for English-language tasks only.

Comprehensive accuracy evaluations are still ongoing; the initial results above mainly establish the model's token-efficiency and speed advantages. The model is released under the MIT License.