Model Overview
This model, amphora/qwen3-8b-base-30k, is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B-Base. It was trained on the combined_reasoning_sft_lt30k dataset, suggesting a specialization in tasks requiring strong reasoning capabilities.
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen3-8B-Base.
- Parameter Count: 8 billion parameters.
- Context Length: Supports a substantial context window of 32768 tokens, enabling the processing of long and complex inputs.
- Training Data: Fine-tuned on the combined_reasoning_sft_lt30k dataset, indicating a focus on reasoning-oriented tasks.
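Since the card lists no usage snippet, here is a minimal inference sketch using the standard transformers API. The prompt and generation settings are illustrative assumptions, not values from the card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "amphora/qwen3-8b-base-30k"

# Load the tokenizer and model; device_map="auto" places weights on
# available accelerators, and torch_dtype="auto" uses the checkpoint's dtype.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)

# Illustrative reasoning-style prompt (not from the card).
prompt = "If all squares are rectangles and this shape is a square, then"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# max_new_tokens is an arbitrary illustrative choice.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

As a base-model fine-tune, the checkpoint may work best with plain-text completion prompts like the one above rather than a chat template.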
Training Details
The model was trained with a learning rate of 4e-05, a per-device batch size of 8 (effective batch size of 128 via gradient accumulation), and the ADAMW_TORCH_FUSED optimizer. Training ran for 3 epochs with a cosine learning-rate scheduler. The training environment included Transformers 5.2.0, PyTorch 2.11.0+cu130, Datasets 4.0.0, and Tokenizers 0.22.2.
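The reported settings can be collected into a single configuration for reference. This is a hypothetical reconstruction: the gradient accumulation figure assumes a single training device (128 effective / 8 per-device = 16), which the card does not state:

```python
# Reconstruction of the reported training hyperparameters.
# gradient_accumulation_steps is an assumption: it is derived as
# effective batch (128) / per-device batch (8) under a single device.
hyperparams = {
    "learning_rate": 4e-05,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 16,  # assumed, not stated in the card
    "effective_train_batch_size": 128,
    "num_train_epochs": 3,
    "lr_scheduler_type": "cosine",
    "optim": "adamw_torch_fused",
}

# Sanity check: per-device batch * accumulation steps = effective batch.
effective = (
    hyperparams["per_device_train_batch_size"]
    * hyperparams["gradient_accumulation_steps"]
)
print(effective)  # 128
```

If training used multiple GPUs, the accumulation steps would shrink proportionally (e.g. 8 GPUs would give 2 accumulation steps for the same effective batch of 128).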
Potential Use Cases
Given its fine-tuning on a reasoning-focused dataset and large context window, this model is likely well-suited for applications requiring:
- Complex problem-solving.
- Logical inference and deduction.
- Understanding and generating coherent, extended narratives or arguments.