boradorish/qwen3-8b-finetuned-train

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quantization: FP8 · Context Length: 32k · Published: Apr 11, 2026 · License: other · Architecture: Transformer

boradorish/qwen3-8b-finetuned-train is an 8-billion-parameter language model fine-tuned by boradorish from the Qwen/Qwen3-8B base model. It was trained on the sunny_reasoning dataset and is aimed at reasoning tasks: logical deduction and problem-solving scenarios.


Model Overview

This model keeps the Qwen/Qwen3-8B architecture unchanged; the fine-tuning pass on the sunny_reasoning dataset is what differentiates it from the base model, targeting tasks that require multi-step logical reasoning rather than general-purpose text generation.

Key Training Details

This model was trained using the following hyperparameters:

  • Learning Rate: 2e-05
  • Batch Size: 1 (train), 8 (eval)
  • Gradient Accumulation: 4 steps, resulting in a total effective training batch size of 8.
  • Optimizer: ADAMW_TORCH_FUSED with default betas and epsilon.
  • Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1.
  • Epochs: 3.0
  • Frameworks: Transformers 5.2.0, PyTorch 2.9.1+cu130, Datasets 4.0.0, Tokenizers 0.22.2.
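Note that a per-device batch of 1 with 4 gradient-accumulation steps alone gives an effective batch of 4, not the stated 8; the figures are consistent with training on 2 devices, though the card does not say so. A minimal sketch of the arithmetic (the device count is an assumption):

```python
# Effective batch size = per-device batch × gradient-accumulation steps × device count.
# The card lists per-device batch 1, 4 accumulation steps, and an effective size of 8,
# which works out if 2 devices were used (an assumption; not stated on the card).

def effective_batch_size(per_device: int, grad_accum: int, num_devices: int) -> int:
    return per_device * grad_accum * num_devices

print(effective_batch_size(1, 4, 2))  # → 8
```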

Intended Use Cases

Given its fine-tuning on a reasoning-focused dataset, this model is likely best suited for applications where logical inference, analytical thinking, and structured problem-solving are critical, such as logical deduction puzzles, multi-step analysis, or chain-of-thought question answering. For general-purpose chat or creative writing, the base Qwen/Qwen3-8B model may be a better fit.
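The model card gives no inference recipe, so the following is a hedged sketch of loading the model through the standard Transformers chat-template flow; the repo id comes from this card, while the system prompt, generation settings, and helper function are illustrative assumptions.

```python
# Hedged usage sketch for boradorish/qwen3-8b-finetuned-train.
# The system prompt and generation parameters below are assumptions,
# not recommendations from the model card.

def build_messages(question: str) -> list:
    """Wrap a reasoning question in the chat-message format
    consumed by tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": "You are a careful step-by-step reasoner."},
        {"role": "user", "content": question},
    ]

if __name__ == "__main__":
    # Requires `pip install transformers torch` plus network access to download weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "boradorish/qwen3-8b-finetuned-train"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages = build_messages(
        "If all bloops are razzies and all razzies are lazzies, are all bloops lazzies?"
    )
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The model-loading portion is kept under the `__main__` guard so the prompt-building helper can be reused or tested without downloading the 8B weights.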