LumosJiang/Qwen3-8B-Base-SFT-AM-Thinking-v1-Distilled-Code-600steps

Text Generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Concurrency cost: 1 · Published: Apr 22, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

LumosJiang/Qwen3-8B-Base-SFT-AM-Thinking-v1-Distilled-Code-600steps is an 8-billion-parameter Qwen3-Base model fine-tuned by LumosJiang. It specializes in code generation and reasoning, having been trained on a distilled code subset of the AM-Thinking-v1 dataset. The model is optimized for generating Python code and follows a `<think>...</think>` reasoning protocol, making it suitable for complex coding tasks that benefit from an explicit thought process.

Model Overview

This model, LumosJiang/Qwen3-8B-Base-SFT-AM-Thinking-v1-Distilled-Code-600steps, is an 8 billion parameter variant of the Qwen3-Base architecture. It has been specifically fine-tuned for code generation and reasoning capabilities. The training utilized a highly curated code subset (approximately 300,000 samples with verify_score ≥ 0.9) from the AM-Thinking-v1-Distilled dataset.
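If you wanted to reproduce that curation step, a filter along the following lines might apply. This is a hypothetical sketch: the dataset id, split, and the presence of a per-sample `verify_score` field are assumptions inferred from the model card, not verified against the actual dataset schema.

```python
# Hypothetical sketch of the quality filter described above. Dataset id,
# split, and the `verify_score` field are assumptions, not confirmed facts.
from datasets import load_dataset

ds = load_dataset("a-m-team/AM-Thinking-v1-Distilled", split="train")  # assumed id/split
curated = ds.filter(lambda ex: ex.get("verify_score", 0.0) >= 0.9)     # keep high-quality samples
print(f"kept {len(curated):,} of {len(ds):,} samples")
```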

Key Capabilities

  • Specialized Code Generation: Optimized for producing high-quality code, particularly Python, reflecting the composition of its training data.
  • Reasoning Protocol: Incorporates a unique <think>...</think> reasoning protocol, allowing the model to explicitly output its thought process before generating the final code block. This can be valuable for debugging and understanding the model's approach.
  • Qwen3 Chat Template: Utilizes the standard Qwen3 chat template for interaction, ensuring compatibility and ease of use within the Qwen ecosystem (see the loading sketch after this list).
  • Extended Context Window: Supports a maximum sequence length of 32768 tokens, enabling it to handle larger codebases or more complex problem descriptions.
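
Below is a minimal loading-and-generation sketch using Hugging Face Transformers, assuming the checkpoint works with the standard `AutoModelForCausalLM`/`AutoTokenizer` classes and ships the Qwen3 chat template. The prompt, generation length, and `device_map="auto"` (which requires `accelerate`) are illustrative choices, not settings taken from the model card.

```python
# Minimal usage sketch, assuming the standard Qwen3 chat template and the
# <think>...</think> reasoning protocol described above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LumosJiang/Qwen3-8B-Base-SFT-AM-Thinking-v1-Distilled-Code-600steps"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}
]
# apply_chat_template renders the Qwen3 prompt format, including the
# assistant turn the model is expected to complete.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=2048)
# Decode only the newly generated tokens; the reasoning trace, if any,
# appears inside <think>...</think> before the final code block.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```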

When to Use This Model

This model is ideal for developers and researchers focused on:

  • Automated Code Generation: Generating Python functions or code snippets based on natural language prompts.
  • Code Understanding and Explanation: Leveraging the explicit reasoning output to gain insights into the model's problem-solving steps (a parsing sketch follows this list).
  • Integration into Coding Assistants: Building tools that require a robust code generation backend with a focus on logical reasoning.
  • Tasks requiring a Qwen3-based model with strong coding aptitude.
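
When consuming the model's output programmatically, you may want to separate the reasoning trace from the final answer. The sketch below assumes the model emits literal `<think>...</think>` tags as described above; the helper name and demo string are illustrative.

```python
# Small sketch for splitting the reasoning trace from the final answer,
# assuming literal <think>...</think> tags in the decoded output.
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no tags are found."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

demo = "<think>Reverse the string and compare.</think>\ndef is_pal(s): return s == s[::-1]"
thought, code = split_reasoning(demo)
print(thought)  # Reverse the string and compare.
print(code)     # def is_pal(s): return s == s[::-1]
```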