a-m-team/AM-Thinking-v1
Text Generation · Open Weights · Model Size: 32B · Quant: FP8 · Context Length: 32k · Concurrency Cost: 2 · Published: May 10, 2025 · License: apache-2.0 · Architecture: Transformer

AM-Thinking-v1 is a 32 billion parameter dense language model developed by a-m-team, built upon the Qwen 2.5-32B-Base architecture. This model is specifically optimized for advanced reasoning tasks, demonstrating performance comparable to much larger Mixture-of-Experts (MoE) models while remaining deployable on a single high-end GPU. It excels in areas such as code generation, logical problem-solving, and creative writing, making it suitable for applications requiring strong analytical and generative capabilities.


AM-Thinking-v1: A 32B Reasoning Powerhouse

AM-Thinking-v1, developed by a-m-team, is a 32 billion parameter dense language model engineered to push the boundaries of reasoning capabilities. Built on the open-source Qwen 2.5-32B-Base, this model achieves strong performance on complex reasoning benchmarks, rivaling larger MoE models like DeepSeek-R1 and Qwen3-235B-A22B, despite being significantly smaller.
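A back-of-the-envelope estimate shows why a 32B dense model fits comfortably on a single A100-80GB at the FP8 quantization listed above. This is a sketch, not official figures: it assumes roughly 1 byte per parameter for FP8 weights, and Qwen2.5-32B-like dimensions (64 layers, 8 KV heads of head size 128, FP8 KV cache) for the cache at the full 32k context.

```python
# Rough serving-memory estimate for a 32B dense model in FP8.
# Assumptions (not vendor-verified): 1 byte/param FP8 weights;
# KV cache with 64 layers, 8 KV heads x 128 head_dim, 1 byte/entry.

GiB = 1024**3

params = 32e9                      # 32B parameters
weight_bytes = params * 1          # FP8 ~ 1 byte per parameter

layers, kv_heads, head_dim = 64, 8, 128
ctx = 32_000                       # full 32k context window
# Per token, the cache stores K and V for every layer:
kv_per_token = 2 * layers * kv_heads * head_dim
kv_bytes = kv_per_token * ctx      # one full-length sequence

total_gib = (weight_bytes + kv_bytes) / GiB
print(f"weights ~ {weight_bytes / GiB:.1f} GiB, "
      f"32k KV cache ~ {kv_bytes / GiB:.1f} GiB, "
      f"total ~ {total_gib:.1f} GiB")
```

Under these assumptions the weights come to roughly 30 GiB and a full-length KV cache adds only a few more, leaving ample headroom on an 80 GiB card.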

Key Differentiators & Capabilities

  • Exceptional Reasoning at 32B Scale: Outperforms DeepSeek-R1 on AIME’24/’25 & LiveCodeBench and approaches Qwen3-235B-A22B, demonstrating flagship-level reasoning from a dense model.
  • Efficient Deployment: Designed to fit on a single A100-80GB GPU with deterministic latency, avoiding the overhead of MoE routing.
  • Advanced Post-Training Pipeline: Utilizes a sophisticated dual-stage RL (Reinforcement Learning) scheme, including pass-rate-aware data curation, to enhance its "think-then-answer" behavioral pattern.
  • Open-Source Foundation: Fully based on publicly available components, including Qwen 2.5-32B-Base and RL training queries.
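The pass-rate-aware curation step can be illustrated with a small sketch. The function and thresholds below are hypothetical (the card does not publish the actual pipeline); the idea is that queries the base model always or never solves carry weak RL signal, so curation keeps queries in a mid-range pass-rate band.

```python
# Illustrative sketch of pass-rate-aware data curation.
# `queries` pairs each training query with the fraction of sampled
# completions that passed its verifier; the thresholds are hypothetical.

def curate_by_pass_rate(queries, low=0.1, high=0.9):
    """Keep queries whose pass rate falls strictly between low and high:
    always-solved or never-solved queries provide weak RL signal."""
    return [(q, r) for q, r in queries if low < r < high]

sampled = [
    ("trivial arithmetic", 1.0),   # always solved -> dropped
    ("AIME-style geometry", 0.4),  # informative   -> kept
    ("unsolved conjecture", 0.0),  # never solved  -> dropped
    ("hard coding task", 0.15),    # informative   -> kept
]

kept = curate_by_pass_rate(sampled)
print([q for q, _ in kept])  # ['AIME-style geometry', 'hard coding task']
```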

Ideal Use Cases

  • Code Generation: Produces complex, working scripts; the model's examples include a bouncing-ball simulation with collision detection.
  • Logical Problem Solving: Excels in tasks requiring intricate logical deduction.
  • Creative Writing: Shows proficiency in generating coherent and contextually relevant text.

Limitations

AM-Thinking-v1 is not yet trained for structured function-calling or tool-use workflows, which limits its use in agent-style systems that interact with external tools. Its safety alignment is also at an early stage and requires further rigorous red-teaming.
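Because of the "think-then-answer" pattern, downstream code typically strips the reasoning trace before consuming the answer. A minimal sketch, assuming the model wraps its reasoning in `<think>…</think>` tags (a convention of similar reasoning models, not confirmed by this card):

```python
import re

# Split a reasoning-model completion into (reasoning, answer).
# Assumes a <think>...</think> block precedes the final answer;
# if no block is found, the whole output is treated as the answer.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_think_answer(text):
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

sample = "<think>2+2: pair the twos, giving 4.</think>\nThe answer is 4."
reasoning, answer = split_think_answer(sample)
print(answer)  # The answer is 4.
```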

Popular Sampler Settings

The three parameter combinations most used by Featherless users for this model tune the following samplers:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
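These samplers are typically applied in sequence at each decoding step. A minimal sketch of how temperature, top_k, top_p, and min_p interact (illustrative ordering and default values, not Featherless's actual configs; the penalty samplers, which adjust logits based on previously generated tokens, are omitted for brevity):

```python
import math
import random

def sample_next_token(logits, temperature=0.7, top_k=40,
                      top_p=0.95, min_p=0.05, rng=random):
    """Filter a logit vector with common samplers, then draw a token id.
    Order used here: temperature -> top_k -> top_p -> min_p."""
    # Temperature: rescale logits, then softmax to probabilities.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)

    probs = probs[:top_k]              # top_k: keep the k most likely tokens

    kept, cum = [], 0.0                # top_p: smallest prefix (nucleus)
    for p, i in probs:                 # whose cumulative mass reaches top_p
        kept.append((p, i))
        cum += p
        if cum >= top_p:
            break

    floor = min_p * kept[0][0]         # min_p: drop tokens whose probability
    kept = [(p, i) for p, i in kept if p >= floor]  # is < min_p * max prob

    z = sum(p for p, _ in kept)        # renormalize and sample
    r = rng.random() * z
    for p, i in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][1]

# With one dominant logit, sampling is effectively deterministic:
print(sample_next_token([10.0, 0.0, -1.0, 0.5]))  # 0
```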