boradorish/qwen3-4b-new

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:May 15, 2026License:otherArchitecture:Transformer Warm

boradorish/qwen3-4b-new is a 4 billion parameter language model fine-tuned from Qwen/Qwen3-4B, developed by boradorish. This model is specifically optimized for reasoning tasks, having been fine-tuned on the sunny_reasoning dataset. It offers a context length of 32768 tokens, making it suitable for applications requiring enhanced logical inference and problem-solving capabilities.

Loading preview...

boradorish/qwen3-4b-new: Reasoning-Optimized Qwen3-4B

This model, developed by boradorish, is a fine-tuned variant of the Qwen/Qwen3-4B base model. It leverages the 4 billion parameter architecture of Qwen3 and is specifically enhanced for reasoning tasks.

Key Capabilities

  • Reasoning Focus: Fine-tuned on the sunny_reasoning dataset, indicating an optimization for logical inference and problem-solving.
  • Base Model: Built upon the robust Qwen3-4B architecture, providing a strong foundation for language understanding and generation.
  • Context Window: Supports a substantial context length of 32768 tokens, allowing for processing and understanding longer inputs relevant to complex reasoning scenarios.

Training Details

The model was trained with a learning rate of 2e-05, a total batch size of 64, and for 3 epochs. It utilized a multi-GPU setup with AdamW optimizer and a cosine learning rate scheduler.

Good For

  • Applications requiring enhanced logical reasoning.
  • Tasks benefiting from a model specifically trained on reasoning datasets.
  • Scenarios where a 4 billion parameter model with a large context window is advantageous for processing complex prompts.