amphora/qwen3-8b-tr: Reasoning-Optimized Qwen3-8B
This model is a specialized fine-tune of Qwen/Qwen3-8B-Base, an 8-billion-parameter base model from the Qwen team. It retains the base model's substantial 32,768-token context window, making it capable of handling detailed and lengthy inputs.
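A minimal loading sketch using the transformers library, assuming the checkpoint is published under this ID on the Hugging Face Hub (the `device_map="auto"` option additionally requires the accelerate package):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model identifier as given in this card.
model_id = "amphora/qwen3-8b-tr"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place weights on available GPU(s)
)
```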
Key Capabilities & Training Focus
Fine-tuning was performed on the combined_reasoning_sft_tr dataset, which targets tasks requiring:
- Logical Deduction: Processing information to arrive at conclusions.
- Problem Solving: Addressing complex scenarios through analytical thought.
- Analytical Tasks: Breaking down information and identifying relationships.
Training Details
The model was trained with specific hyperparameters to achieve its specialized performance:
- Learning Rate: 4e-05
- Batch Size: effective batch size of 128 (per-device batch size 4 × gradient accumulation steps 4 × 8 devices)
- Optimizer: adamw_torch_fused (PyTorch's fused AdamW implementation)
- Scheduler: Cosine learning rate decay
- Epochs: 3
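These hyperparameters map directly onto a Hugging Face TrainingArguments configuration. A hypothetical reconstruction is sketched below; only the values listed above are documented, so anything else (warmup, weight decay, output path) is an assumption:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-8b-tr",            # hypothetical output path
    learning_rate=4e-05,
    per_device_train_batch_size=4,       # train_batch_size: 4
    gradient_accumulation_steps=4,       # 4 * 4 * 8 devices = effective batch of 128
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    optim="adamw_torch_fused",
)
```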
Intended Use Cases
Given its fine-tuning on reasoning data, amphora/qwen3-8b-tr is particularly well-suited for applications where robust logical processing and analytical capabilities are paramount. This includes tasks such as:
- Complex question answering
- Data analysis and interpretation
- Scientific or technical reasoning
- Educational tools requiring logical problem-solving
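For tasks like these, generation follows the standard transformers workflow. The sketch below assumes the tokenizer ships a chat template (as Qwen3 tokenizers do; if this fine-tune of the base model does not include one, fall back to plain-text prompts), and the question shown is just an illustrative example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amphora/qwen3-8b-tr"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Hypothetical multi-step reasoning prompt; adjust to your task.
messages = [{
    "role": "user",
    "content": "A train departs at 9:40 and arrives at 13:05. How long is the trip?",
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```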