boradorish/qwen3-4b-base-prompt

Task: Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: May 22, 2026 | License: other | Architecture: Transformer

boradorish/qwen3-4b-base-prompt is a 4 billion parameter language model, fine-tuned from Qwen/Qwen3-4B by boradorish on the 'sunny_reasoning' dataset, which points to a focus on reasoning tasks. It supports a 32768 token context length and reports a loss of 0.0087 on its evaluation set, which indicates a close fit to the fine-tuning data.

Model Overview

boradorish/qwen3-4b-base-prompt is a 4 billion parameter language model fine-tuned from Qwen/Qwen3-4B. boradorish trained it on the 'sunny_reasoning' dataset, whose name suggests a specialization in logical inference and problem solving. The model has a context window of 32768 tokens, so the prompt and the generated text together can span long sequences.
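Because the 32768 token window is shared between the prompt and the generated text, it helps to cap generation by what is left after the prompt. The following is a minimal sketch, not the developer's documented usage. It assumes the model is published under this repository id on the Hugging Face Hub and loads with the standard `transformers` auto classes. The helper names `generation_budget` and `generate` are hypothetical.

```python
MODEL_ID = "boradorish/qwen3-4b-base-prompt"  # assumed Hub repository id
CONTEXT_LENGTH = 32768  # context window stated on this page


def generation_budget(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many new tokens fit after a prompt of the given length."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and complete a prompt (downloads weights on first call)."""
    # Imported lazily so the budget helper works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the quantization listed on this page.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_tokens = inputs["input_ids"].shape[1]

    # Cap generation so prompt plus output stays inside the context window.
    budget = min(max_new_tokens, generation_budget(prompt_tokens))
    if budget == 0:
        raise ValueError("prompt already fills the context window")

    output_ids = model.generate(**inputs, max_new_tokens=budget)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output_ids[0][prompt_tokens:], skip_special_tokens=True)
```

Calling `generate("Premise: ... Conclusion:")` would download the weights and return a completion. The prompt is passed as plain text because this page does not document a prompt or chat format.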

Key Training Details

  • Base Model: Qwen/Qwen3-4B
  • Fine-tuning Dataset: sunny_reasoning
  • Evaluation Loss: 0.0087 on the evaluation set, which shows the model fit the fine-tuning data closely. It is not a measure of performance on other reasoning benchmarks.
  • Hyperparameters: Learning rate of 2e-05, total batch size of 32 (reached with gradient accumulation), 3 epochs, AdamW optimizer, and a cosine learning rate scheduler.
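The training figures above can be made concrete with a little arithmetic. This is a rough sketch under stated assumptions, not the developer's training script. The page does not give a warmup length, so the schedule below decays from the peak rate with no warmup. It does not give the dataset size either, so any example count passed to `total_optimizer_steps` is hypothetical. The perplexity conversion assumes the reported loss is a mean per-token cross-entropy in nats.

```python
import math

LEARNING_RATE = 2e-5     # peak learning rate from this page
TOTAL_BATCH_SIZE = 32    # effective batch size, including gradient accumulation
EPOCHS = 3
EVAL_LOSS = 0.0087


def cosine_lr(step: int, total_steps: int, peak_lr: float = LEARNING_RATE) -> float:
    """Cosine decay from peak_lr at step 0 to zero at total_steps (no warmup assumed)."""
    progress = step / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


def total_optimizer_steps(num_examples: int,
                          total_batch_size: int = TOTAL_BATCH_SIZE,
                          epochs: int = EPOCHS) -> int:
    """Optimizer updates over all epochs, keeping the last partial batch of each epoch."""
    return math.ceil(num_examples / total_batch_size) * epochs


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss)


# Hypothetical dataset size, used only to illustrate the arithmetic.
steps = total_optimizer_steps(3200)
print(steps, cosine_lr(steps // 2, steps), perplexity(EVAL_LOSS))
```

Under these assumptions, a loss of 0.0087 corresponds to a perplexity only slightly above 1, which means the model predicts the evaluation tokens almost deterministically.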

Potential Use Cases

Because it was fine-tuned on a reasoning-focused dataset, this model is likely best suited to applications that require:

  • Logical Deduction: Tasks involving drawing conclusions from given premises.
  • Problem Solving: Scenarios where the model needs to analyze information and propose solutions.
  • Structured Question Answering: Answering questions that require more than simple fact retrieval, potentially involving multi-step reasoning.

Specific intended uses and limitations are not documented here; those details would need to come from the model developer.