Model Overview
Hyeongwon/P2-split3_prob_Qwen3-4B-Base_0312-01 is a 4-billion-parameter language model fine-tuned from Hyeongwon/Qwen3-4B-Base. The fine-tuning was performed with the TRL (Transformer Reinforcement Learning) library using Supervised Fine-Tuning (SFT); the specific downstream task it targets is not documented.
Key Capabilities
- Base Model Fine-tuning: Built upon Hyeongwon/Qwen3-4B-Base, inheriting its foundational language understanding.
- SFT Training: Fine-tuned via TRL's Supervised Fine-Tuning pipeline for unspecified probabilistic tasks; no benchmark results are reported.
- Context Length: Supports a 32768-token context window, allowing long documents to be processed and generated in a single pass (see the loading sketch after this list).
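A minimal loading-and-generation sketch is shown below, assuming the checkpoint is downloadable from the Hugging Face Hub under its repository ID; the prompt is a placeholder.

```python
# Minimal sketch: load the model and generate from a long prompt.
# Assumes the checkpoint resolves on the Hugging Face Hub under this ID.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Hyeongwon/P2-split3_prob_Qwen3-4B-Base_0312-01"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",   # place weights on available GPU(s); requires accelerate
)

prompt = "Summarize the following report:\n..."  # may span up to 32768 tokens
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```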
Training Details
The model was trained with SFT using the TRL framework; an illustrative sketch of such a run follows the version list. The development environment included:
- TRL: 0.25.1
- Transformers: 4.57.3
- PyTorch: 2.6.0
- Datasets: 3.6.0
- Tokenizers: 0.22.2
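As a rough illustration of such a run, the sketch below pairs the stated base model with TRL's SFTTrainer. The dataset (trl-lib/Capybara) is a stand-in, since the actual training data is not documented, and all hyperparameters beyond those shown are defaults.

```python
# Illustrative sketch only: the real dataset and hyperparameters for this
# checkpoint are undocumented. trl-lib/Capybara is a placeholder corpus.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

train_dataset = load_dataset("trl-lib/Capybara", split="train")  # placeholder

config = SFTConfig(
    output_dir="P2-split3_prob_Qwen3-4B-Base_0312-01",
    max_length=32768,   # match the model's context window
    report_to="wandb",  # log to Weights & Biases, as noted below
)

trainer = SFTTrainer(
    model="Hyeongwon/Qwen3-4B-Base",  # the stated base model
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```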
Metrics and further details from the training run can be viewed on Weights & Biases.
Good For
This model is suited to developers who want a 4B-parameter model with a long context window and an SFT-tuned variant of Qwen3-4B-Base. Since the fine-tuning objective is undocumented, it is best evaluated directly on the intended workload before adoption.