laion/CoderForge-Preview-v6-1000-axolotl__Qwen3-8B-v8

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Apr 25, 2026 · Architecture: Transformer

laion/CoderForge-Preview-v6-1000-axolotl__Qwen3-8B-v8 is an 8 billion parameter causal language model fine-tuned from Qwen/Qwen3-8B. The model is optimized for code generation and reasoning tasks, using a training format that injects `<think>` reasoning blocks into assistant turns and renders tool calls as native OpenHands XML. With a context length of 32768 tokens, it is suited to complex coding tasks that require extensive context.


Model Overview

laion/CoderForge-Preview-v6-1000-axolotl__Qwen3-8B-v8 is an 8 billion parameter language model, fine-tuned from the base Qwen/Qwen3-8B model. It was developed using the Axolotl framework (version 0.16.0.dev0) and trained on a specialized dataset, laion/CoderForge-Preview-v6-1000, which includes coderforge-preview_v6_316.jsonl.

Key Capabilities & Training Insights

  • Code Generation & Reasoning: The model is specifically designed for code-related tasks, with its training data structured to include <think>REASONING</think> blocks within assistant turns. This approach aims to align with Qwen3's post-training prior, preserving long-context coherence and improving reasoning capabilities in code generation.
  • Tool Call Integration: Tool calls are rendered as native OpenHands XML (<function=NAME><parameter=K>V</parameter></function>), indicating a focus on structured output and potential for integration with external tools.
  • Extended Context Length: Trained with a sequence_len of 32768, this model is capable of handling extensive codebases and complex problem descriptions.
  • Training Configuration: The model was trained for 12 epochs with a learning rate of 1e-5, using an adamw_torch optimizer and a cosine learning rate scheduler. Gradient accumulation steps were set to 8, with a micro batch size of 1.
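The hyperparameters listed above can be expressed as an Axolotl configuration. The sketch below is a reconstruction, not the published training config: only the values stated in this card (base model, dataset, sequence length, epochs, learning rate, optimizer, scheduler, batch settings) are grounded, and any fields omitted here were not disclosed.

```yaml
# Hypothetical Axolotl config reconstructing the stated setup;
# only values mentioned in this model card are grounded.
base_model: Qwen/Qwen3-8B
sequence_len: 32768

datasets:
  - path: laion/CoderForge-Preview-v6-1000

num_epochs: 12
learning_rate: 1.0e-5
optimizer: adamw_torch
lr_scheduler: cosine
gradient_accumulation_steps: 8
micro_batch_size: 1
```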
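A consumer of the model's completions needs to extract structured tool calls from the OpenHands XML format described above. The following is a minimal, illustrative parser sketch; the function name `parse_tool_call` and the example tool (`read_file` with a `path` parameter) are assumptions, not part of the model's published tooling.

```python
import re
from typing import Dict, Tuple

# Regexes for the OpenHands-style tool-call format:
#   <function=NAME><parameter=K>V</parameter>...</function>
FUNCTION_RE = re.compile(
    r"<function=(?P<name>[\w.\-]+)>(?P<body>.*?)</function>", re.DOTALL
)
PARAM_RE = re.compile(
    r"<parameter=(?P<key>[\w.\-]+)>(?P<value>.*?)</parameter>", re.DOTALL
)

def parse_tool_call(text: str) -> Tuple[str, Dict[str, str]]:
    """Extract the first tool call (function name and parameter dict)
    from a completion rendered in OpenHands XML."""
    match = FUNCTION_RE.search(text)
    if match is None:
        raise ValueError("no tool call found in completion")
    name = match.group("name")
    params = {
        m.group("key"): m.group("value").strip()
        for m in PARAM_RE.finditer(match.group("body"))
    }
    return name, params

# A completion in the format the model was trained on: a <think> block
# followed by a native-XML tool call (tool name is hypothetical).
completion = (
    "<think>The user wants the file contents, so read it first.</think>\n"
    "<function=read_file><parameter=path>src/main.py</parameter></function>"
)
name, params = parse_tool_call(completion)
print(name, params)  # read_file {'path': 'src/main.py'}
```

Because the `<think>` block precedes the tool call, a real agent loop would typically strip or log the reasoning span before dispatching the parsed call to its tool runtime.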

Intended Use Cases

This model is particularly well-suited for:

  • Advanced Code Generation: Its specialized training on the CoderForge dataset, incorporating reasoning blocks and structured tool calls, suggests strong performance in generating complex and logical code.
  • Code Understanding and Analysis: The emphasis on reasoning within the training data could make it effective for tasks requiring an understanding of code logic and problem-solving.
  • Developer Tooling: The native OpenHands XML rendering for tool calls indicates potential for integration into developer workflows that leverage external APIs or functions.