EphAsad/Atem-0.6B

TEXT GENERATIONConcurrency Cost:1Model Size:0.8BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 21, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

Atem-0.6B by EphAsad is a 0.6 billion parameter reasoning model built on Qwen/Qwen3-0.6B, fine-tuned via multi-source knowledge distillation. It specializes in providing concise, directly-formatted answers for tasks like code explanation, mathematical problem-solving, and analytical reasoning. This model is optimized for lightweight, open-ended reasoning where speed and low compute cost are prioritized over deep, multi-step chain-of-thought processes.

Loading preview...

Overview

Atem-0.6B is a 0.6 billion parameter reasoning model developed by EphAsad, fine-tuned from Qwen/Qwen3-0.6B using LoRA. It was trained on approximately 120,000 distilled examples from multiple frontier teacher models, with a focus on producing clean, directly-formatted final answers by suppressing explicit chain-of-thought traces. This model represents Stage 1 of a multi-stage training series, laying a foundation for more complex reasoning capabilities in future iterations.

Key Capabilities

  • Direct Reasoning: Provides concise and structured answers for various analytical tasks.
  • Code Assistance: Excels in code explanation, implementation, and debugging.
  • Mathematical Problem Solving: Capable of solving mathematical problems with working shown, demonstrating a notable gain on GSM8K benchmarks due to its direct answer formatting.
  • Analytical Tasks: Suitable for analytical reasoning, hypothesis evaluation, and concept explanation.

Intended Use Cases

Atem-0.6B is designed for scenarios requiring efficient, low-compute reasoning, where direct and structured outputs are beneficial. It is particularly well-suited for:

  • Lightweight, open-ended reasoning tasks.
  • Applications where speed and a small footprint are more critical than deep, multi-step reasoning on complex problems.
  • Tasks that benefit from suppressed thinking traces, leading to more direct responses.

Limitations

As a Stage 1 model, Atem-0.6B deliberately suppresses thinking traces, which can lead to reduced accuracy on multi-step problems where the base model's exposed reasoning might self-correct. Its 0.6B parameter count also means a smaller capability ceiling compared to larger models, and it may exhibit mathematical precision issues on complex calculations without a scratchpad.