hmdmahdavi/olympiad-curated-qwen3-4b-thinking-distill-30b-5ep-ablation

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Feb 25, 2026Architecture:Transformer Warm

The hmdmahdavi/olympiad-curated-qwen3-4b-thinking-distill-30b-5ep-ablation model is a 4 billion parameter language model, fine-tuned from Qwen/Qwen3-4B-Instruct-2507. Developed by hmdmahdavi, this model leverages a 32768 token context length and was trained using the TRL framework. It is designed for general text generation tasks, building upon its Qwen3 base with specific fine-tuning for improved performance.

Loading preview...

Model Overview

This model, olympiad-curated-qwen3-4b-thinking-distill-30b-5ep-ablation, is a 4 billion parameter language model developed by hmdmahdavi. It is a fine-tuned variant of the Qwen3-4B-Instruct-2507 base model, utilizing a substantial 32768 token context window. The fine-tuning process was conducted using the TRL (Transformer Reinforcement Learning) framework, indicating a focus on enhancing its conversational and instruction-following capabilities.

Key Capabilities

  • Instruction Following: As a fine-tuned instruction model, it is designed to respond effectively to user prompts and questions.
  • Text Generation: Capable of generating coherent and contextually relevant text based on input.
  • Qwen3 Architecture: Benefits from the robust architecture of the Qwen3 series, known for its general language understanding.

Training Details

The model underwent Supervised Fine-Tuning (SFT), a common method for adapting pre-trained language models to specific tasks or instruction sets. The training process utilized TRL version 0.12.0, Transformers 4.57.6, Pytorch 2.5.1, Datasets 4.5.0, and Tokenizers 0.22.2. Further details on the training run can be visualized via its Weights & Biases project.