LumosJiang/Qwen3-8B-Base-SFT-AM-Thinking-v1-Distilled-Code-600steps

Text Generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Concurrency cost: 1 · Published: Apr 22, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

LumosJiang/Qwen3-8B-Base-SFT-AM-Thinking-v1-Distilled-Code-600steps is an 8-billion-parameter Qwen3-Base model fine-tuned by LumosJiang. It specializes in code generation and reasoning, having been trained on a distilled code subset of the AM-Thinking-v1 dataset. The model is optimized for generating Python code and follows a `<think>...</think>` reasoning protocol, making it suitable for complex coding tasks that benefit from an explicit thought process.

Model Overview

This model, LumosJiang/Qwen3-8B-Base-SFT-AM-Thinking-v1-Distilled-Code-600steps, is an 8 billion parameter variant of the Qwen3-Base architecture. It has been specifically fine-tuned for code generation and reasoning capabilities. The training utilized a highly curated code subset (approximately 300,000 samples with verify_score ≥ 0.9) from the AM-Thinking-v1-Distilled dataset.
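If you wanted to reproduce that curation step, a filter along the following lines might apply. This is a hypothetical sketch: the dataset id, split, and the presence of a per-sample `verify_score` field are assumptions inferred from the model card, not verified against the actual dataset schema.

```python
# Hypothetical sketch of the quality filter described above. Dataset id,
# split, and the `verify_score` field are assumptions, not confirmed facts.
from datasets import load_dataset

ds = load_dataset("a-m-team/AM-Thinking-v1-Distilled", split="train")  # assumed id/split
curated = ds.filter(lambda ex: ex.get("verify_score", 0.0) >= 0.9)     # keep high-quality samples
print(f"kept {len(curated):,} of {len(ds):,} samples")
```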

Key Capabilities

  • Specialized Code Generation: Optimized for producing high-quality code, particularly Python, reflecting the composition of its training data.
  • Reasoning Protocol: Incorporates a unique <think>...</think> reasoning protocol, allowing the model to explicitly output its thought process before generating the final code block. This can be valuable for debugging and understanding the model's approach.
  • Qwen3 Chat Template: Utilizes the standard Qwen3 chat template for interaction, ensuring compatibility and ease of use within the Qwen ecosystem (see the loading sketch after this list).
  • Extended Context Window: Supports a maximum sequence length of 32768 tokens, enabling it to handle larger codebases or more complex problem descriptions.
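
Below is a minimal loading-and-generation sketch using Hugging Face Transformers, assuming the checkpoint works with the standard `AutoModelForCausalLM`/`AutoTokenizer` classes and ships the Qwen3 chat template. The prompt, generation length, and `device_map="auto"` (which requires `accelerate`) are illustrative choices, not settings taken from the model card.

```python
# Minimal usage sketch, assuming the standard Qwen3 chat template and the
# <think>...</think> reasoning protocol described above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LumosJiang/Qwen3-8B-Base-SFT-AM-Thinking-v1-Distilled-Code-600steps"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}
]
# apply_chat_template renders the Qwen3 prompt format, including the
# assistant turn the model is expected to complete.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=2048)
# Decode only the newly generated tokens; the reasoning trace, if any,
# appears inside <think>...</think> before the final code block.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```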

When to Use This Model

This model is ideal for developers and researchers focused on:

  • Automated Code Generation: Generating Python functions or code snippets based on natural language prompts.
  • Code Understanding and Explanation: Leveraging the explicit reasoning output to gain insights into the model's problem-solving steps (a parsing sketch follows this list).
  • Integration into Coding Assistants: Building tools that require a robust code generation backend with a focus on logical reasoning.
  • Tasks requiring a Qwen3-based model with strong coding aptitude.
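
When consuming the model's output programmatically, you may want to separate the reasoning trace from the final answer. The sketch below assumes the model emits literal `<think>...</think>` tags as described above; the helper name and demo string are illustrative.

```python
# Small sketch for splitting the reasoning trace from the final answer,
# assuming literal <think>...</think> tags in the decoded output.
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no tags are found."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

demo = "<think>Reverse the string and compare.</think>\ndef is_pal(s): return s == s[::-1]"
thought, code = split_reasoning(demo)
print(thought)  # Reverse the string and compare.
print(code)     # def is_pal(s): return s == s[::-1]
```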