EphAsad/Atem-3B

TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 8, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

EphAsad/Atem-3B is a 3.1 billion parameter language model, a Stage 1 supervised fine-tune of Qwen2.5-3B-Instruct, developed by EphAsad. It is specifically designed for direct reasoning, mathematics, code, and general instruction following, prioritizing structured, considered responses without explicit chain-of-thought traces. This model excels in producing well-structured answers and shows a modest improvement in mathematical reasoning over its base model.

Loading preview...

Overview of Atem-3B

Atem-3B is the initial release in the 3 billion parameter series of the Atem models, developed by EphAsad. It is a Stage 1 supervised fine-tune (SFT) of the Qwen2.5-3B-Instruct base model, trained on approximately 120,000 high-quality examples. The training data spans diverse domains including mathematics, code, general reasoning, and instruction following.

Key Capabilities and Design Philosophy

  • Direct Reasoning: Atem-3B is engineered to provide direct, well-structured, and considered responses. Unlike some models, it does not produce visible chain-of-thought traces (<think> tags), as these were deliberately stripped from the training data. This design channels its reasoning capacity into the final output.
  • Enhanced Mathematical Reasoning: Leveraging a stronger 3B base and specialized mathematical distillation datasets, Atem-3B demonstrates improved performance in mathematical reasoning, as evidenced by a 2.3% gain in flexible-extract GSM8K scores compared to its base model.
  • Code and General Instruction Following: The model's training corpus includes significant portions of code and general instruction data, making it proficient in these areas.
  • Scalable Methodology: It applies the successful data curation methodology from the 1.5B Atem series to a larger parameter count, providing a more robust foundation for complex tasks.

When to Use Atem-3B

  • Applications requiring direct, structured answers: Ideal for scenarios where explicit step-by-step reasoning traces are not desired in the output, but the underlying reasoning is crucial.
  • Mathematical problem-solving: Suitable for tasks involving mathematical reasoning where accurate, structured answers are paramount.
  • Code generation and understanding: Effective for generating code and following programming-related instructions.
  • General instruction following: Can be used for a wide range of instruction-based tasks where clear and concise responses are needed.