boradorish/qwen3-4b-new
boradorish/qwen3-4b-new is a 4 billion parameter language model fine-tuned from Qwen/Qwen3-4B, developed by boradorish. This model is specifically optimized for reasoning tasks, having been fine-tuned on the sunny_reasoning dataset. It offers a context length of 32768 tokens, making it suitable for applications requiring enhanced logical inference and problem-solving capabilities.
Loading preview...
boradorish/qwen3-4b-new: Reasoning-Optimized Qwen3-4B
This model, developed by boradorish, is a fine-tuned variant of the Qwen/Qwen3-4B base model. It leverages the 4 billion parameter architecture of Qwen3 and is specifically enhanced for reasoning tasks.
Key Capabilities
- Reasoning Focus: Fine-tuned on the
sunny_reasoningdataset, indicating an optimization for logical inference and problem-solving. - Base Model: Built upon the robust Qwen3-4B architecture, providing a strong foundation for language understanding and generation.
- Context Window: Supports a substantial context length of 32768 tokens, allowing for processing and understanding longer inputs relevant to complex reasoning scenarios.
Training Details
The model was trained with a learning rate of 2e-05, a total batch size of 64, and for 3 epochs. It utilized a multi-GPU setup with AdamW optimizer and a cosine learning rate scheduler.
Good For
- Applications requiring enhanced logical reasoning.
- Tasks benefiting from a model specifically trained on reasoning datasets.
- Scenarios where a 4 billion parameter model with a large context window is advantageous for processing complex prompts.