boradorish/qwen3-0.6b-fc

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.8BQuant:BF16Ctx Length:32kPublished:May 14, 2026License:otherArchitecture:Transformer Warm

The boradorish/qwen3-0.6b-fc model is a fine-tuned version of the Qwen3-0.6B architecture, featuring 0.8 billion parameters and a 32768-token context length. This model has been specifically fine-tuned on the sunny_reasoning dataset, suggesting an optimization for reasoning tasks. It is intended for applications requiring a compact yet capable language model with enhanced reasoning abilities.

Loading preview...

Model Overview

The boradorish/qwen3-0.6b-fc is a fine-tuned language model based on the Qwen/Qwen3-0.6B architecture. With approximately 0.8 billion parameters and a substantial 32768-token context window, this model is designed for efficient processing of longer sequences.

Key Characteristics

  • Base Model: Qwen/Qwen3-0.6B
  • Parameter Count: 0.8 billion
  • Context Length: 32768 tokens
  • Fine-tuning Dataset: sunny_reasoning

Training Details

The model was trained using the following hyperparameters:

  • Learning Rate: 4e-05
  • Batch Size: 4 (train), 8 (eval)
  • Gradient Accumulation: 8 steps, leading to a total effective batch size of 64
  • Optimizer: AdamW (fused) with betas=(0.9, 0.999) and epsilon=1e-08
  • LR Scheduler: Cosine with 0.1 warmup steps
  • Epochs: 3.0

Intended Use

While specific intended uses and limitations require more detailed information, the fine-tuning on the sunny_reasoning dataset suggests its primary strength lies in tasks that involve logical deduction, problem-solving, and understanding complex relationships within text. Developers looking for a compact model with enhanced reasoning capabilities for specific applications may find this model suitable.