ferrazzipietro/Qwen3-8B-reas-int-065-only-loss-noprompt-3epoch-baseline is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained for 3 epochs with a cosine learning-rate schedule and a total batch size of 64. The fine-tuning dataset and intended uses are not documented, so what distinguishes it from the base model is currently unclear.
Model Overview
This model, ferrazzipietro/Qwen3-8B-reas-int-065-only-loss-noprompt-3epoch-baseline, is an 8-billion-parameter language model derived from the Qwen/Qwen3-8B base architecture. It was fine-tuned for 3 epochs on a multi-GPU setup with 2 devices and a total training batch size of 64, using the AdamW optimizer and a cosine learning-rate scheduler with a 0.1 warmup ratio (full hyperparameters below).
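For anyone who wants to try the checkpoint, a minimal loading sketch with the Hugging Face transformers library might look like the following. This is not taken from the model card; it assumes the checkpoint is published on the Hub in standard transformers format, and the prompt is an arbitrary placeholder.

```python
# Hypothetical usage sketch -- assumes the checkpoint is hosted on the
# Hugging Face Hub in standard transformers format.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ferrazzipietro/Qwen3-8B-reas-int-065-only-loss-noprompt-3epoch-baseline"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the dtype stored in the checkpoint
    device_map="auto",    # place layers across available devices
)

prompt = "Explain the difference between a list and a tuple in Python."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that loading an 8B-parameter model requires substantial GPU memory; quantized loading (e.g. via bitsandbytes) may be needed on smaller cards.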
Key Training Details
- Base Model: Qwen/Qwen3-8B
- Parameters: 8 Billion
- Epochs: 3
- Learning Rate: 5e-06
- Optimizer: AdamW (adamw_torch; betas=(0.9, 0.95), epsilon=1e-12)
- Scheduler: Cosine with 0.1 warmup ratio
- Total Batch Size: 64 (train and eval)
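The learning-rate settings above combine into a single per-step schedule: linear warmup over the first 10% of steps up to the 5e-06 peak, then cosine decay to zero. A minimal sketch of that standard schedule (this is illustrative, not code from the training run; the step counts are placeholders):

```python
import math

def lr_at(step, total_steps, peak_lr=5e-6, warmup_ratio=0.1):
    """Cosine learning-rate schedule with linear warmup.

    peak_lr and warmup_ratio default to the hyperparameters listed above;
    total_steps depends on the (undocumented) dataset size and is a placeholder.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Example over a hypothetical 1000-step run:
# step 0 -> 0.0, step 100 (end of warmup) -> 5e-06, step 1000 -> ~0.0
```

In a transformers training setup, the equivalent behavior comes from `lr_scheduler_type="cosine"` together with `warmup_ratio=0.1` in `TrainingArguments`.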
Current Limitations
Details of the fine-tuning dataset, the model's intended uses, and its limitations are not currently available, so its primary differentiators or optimized capabilities relative to the base Qwen3-8B model are not explicitly stated. Without this information, users should evaluate the model's suitability for their specific tasks themselves.