Name: boradorish/qwen3-0.6b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: boradorish

Overview

The boradorish/qwen3-0.6b model is a language model fine-tuned from the Qwen/Qwen3-0.6B base model. It is listed with 0.8 billion parameters and a context length of 32768 tokens, which suits tasks that depend on long-range context.

Key Capabilities

Reasoning Focus: The model was fine-tuned on the sunny_reasoning dataset, which suggests a focus on logical inference and problem-solving tasks.
Extended Context: The 32768-token context window lets the model process and generate longer text sequences, which helps with complex queries and document analysis.
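
As a rough illustration of what the 32768-token window means in practice, the sketch below checks whether a prompt plus a requested generation length fits within it. The function names `fits_in_context` and `max_generation_budget` are hypothetical helpers, not part of any library, and the sketch assumes the window is shared between prompt tokens and generated tokens, which the model card does not state explicitly.

```python
CONTEXT_LENGTH = 32768  # context window stated in the model card


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Return True if the prompt plus the requested generation fits in the window.

    Assumption: prompt and generated tokens share a single window.
    """
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= context_length


def max_generation_budget(prompt_tokens: int,
                          context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens remain for generation after the prompt (never negative)."""
    return max(0, context_length - prompt_tokens)
```

For example, a 30000-token document leaves `max_generation_budget(30000)` = 2768 tokens for the response under this assumption.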

Training Details

Fine-tuning used the following hyperparameters:

Learning Rate: 4e-05
Batch Size: Total training batch size of 64 (train_batch_size 4 per device × gradient_accumulation_steps 8 × 2 GPUs).
Optimizer: ADAMW_TORCH_FUSED with default betas and epsilon.
Scheduler: Cosine learning rate scheduler with a warmup of 0.1 (presumably a fraction of total training steps rather than a step count), trained for 3 epochs.
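
The batch-size arithmetic and the learning-rate schedule listed above can be sketched in plain Python. This is only an illustration under stated assumptions: it treats the 0.1 warmup value as a fraction of total optimizer steps, and it uses the common shape of linear warmup followed by cosine decay to zero, which may differ in detail from the scheduler actually used. The `total_steps` value of 1000 in the usage lines is hypothetical, since the model card does not state the dataset size.

```python
import math

PEAK_LR = 4e-05          # learning rate from the model card
PER_DEVICE_BATCH = 4     # train_batch_size
GRAD_ACCUM_STEPS = 8     # gradient_accumulation_steps
NUM_DEVICES = 2          # number of GPUs
WARMUP_FRACTION = 0.1    # assumption: 0.1 is a fraction of total steps


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int) -> int:
    """Samples contributing to one optimizer update."""
    return per_device * grad_accum * num_devices


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR,
              warmup_fraction: float = WARMUP_FRACTION) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; depends on dataset size and the 3 epochs
    print(effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS, NUM_DEVICES))  # 64
    for step in (0, 50, 100, 550, 1000):
        print(step, cosine_lr(step, total_steps))
```

With these assumptions the rate climbs linearly to 4e-05 over the first 10% of steps and then falls along a half cosine, reaching zero at the final step.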

Good For

Applications requiring strong reasoning capabilities.
Tasks that benefit from processing and generating long text sequences.

Full Model Card (README)