boradorish/qwen3-4b-new-prompt
The boradorish/qwen3-4b-new-prompt is a 4 billion parameter language model fine-tuned from Qwen/Qwen3-4B. This model is specifically optimized for reasoning tasks, having been trained on the sunny_reasoning dataset. It aims to enhance logical inference and problem-solving capabilities within its 32768 token context window. This fine-tuned variant is best suited for applications requiring improved analytical and reasoning performance.
Loading preview...
Model Overview
The boradorish/qwen3-4b-new-prompt is a 4 billion parameter language model, fine-tuned from the base Qwen/Qwen3-4B architecture. Its primary differentiation lies in its specialized training on the sunny_reasoning dataset, indicating an optimization for tasks that require strong reasoning and analytical capabilities.
Key Training Details
The model was trained using the following hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 1 (train), 8 (eval) with 32 gradient accumulation steps, resulting in a total effective batch size of 64.
- Optimizer: ADAMW_TORCH_FUSED
- Scheduler: Cosine learning rate scheduler with 0.1 warmup steps.
- Epochs: 3.0
Intended Use Cases
Given its fine-tuning on a reasoning-focused dataset, this model is particularly well-suited for:
- Logical Inference: Tasks requiring the model to draw conclusions from given information.
- Problem Solving: Applications where the model needs to analyze scenarios and propose solutions.
- Analytical Tasks: Use cases demanding structured thought and deduction.
This model leverages a 32768 token context length, providing ample capacity for complex reasoning prompts.