laion/CoderForge-Preview-v3-316-axolotl__Qwen3-8B
laion/CoderForge-Preview-v3-316-axolotl__Qwen3-8B is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained with axolotl on the laion/CoderForge-Preview-v3-316 dataset and supports a 32768-token context length. The model is optimized for code-related tasks and leverages a pre-tokenized dataset for efficient training.
Model Overview
laion/CoderForge-Preview-v3-316-axolotl__Qwen3-8B is an 8-billion-parameter language model fine-tuned from the base Qwen/Qwen3-8B architecture. It was developed with the axolotl framework (version 0.16.0.dev0) and trained on a pre-tokenized dataset.
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen3-8B.
- Parameter Count: 8 billion parameters.
- Context Length: Supports a substantial context length of 32768 tokens, matching the truncation length used in the SERA v3 configuration.
- Training Dataset: Trained on the `laion/CoderForge-Preview-v3-316` dataset, which is pre-tokenized, allowing axolotl to bypass chat template rendering for efficiency.
- Optimization: Training hyperparameters, including a learning rate of `1e-5` and the `adamw_torch` optimizer with a `cosine` LR scheduler, were configured to match upstream SERA settings for consistent comparisons.
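A minimal loading sketch with Hugging Face transformers is given below, assuming a standard causal-LM setup. The model id comes from this card; the bf16 dtype and flash-attention choice mirror the training configuration described in the next section, and `flash_attention_2` additionally assumes the flash-attn package and a compatible GPU are available.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/CoderForge-Preview-v3-316-axolotl__Qwen3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# bf16 and flash attention mirror the training setup; flash_attention_2
# requires the flash-attn package and a supported GPU (assumption).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)
```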
Training Details
The model was trained for 9 steps with a total batch size of 32 (micro batch size 1, gradient accumulation steps 8, across 4 GPUs). Training used bf16 precision and flash attention for optimized performance.
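The reported total batch size follows directly from the per-GPU settings; a quick sanity check:

```python
micro_batch_size = 1        # per-GPU micro batch size
grad_accum_steps = 8        # gradient accumulation steps
num_gpus = 4                # GPUs used for training
total_batch_size = micro_batch_size * grad_accum_steps * num_gpus
assert total_batch_size == 32  # matches the figure reported above
```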
Intended Use
Specific intended uses and limitations are not explicitly documented. However, training on the CoderForge dataset suggests a strong focus on code generation, code understanding, and related programming tasks, and the 32768-token context window makes the model suitable for handling extensive codebases or complex programming problems.
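As an illustration only, a code-completion call might look like the following; `model` and `tokenizer` are the objects from the loading sketch above, and the prompt is a placeholder rather than an official example.

```python
# Placeholder prompt; model and tokenizer come from the loading sketch above.
prompt = 'def fibonacci(n: int) -> int:\n    """Return the n-th Fibonacci number."""\n'
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```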