Hyeongwon/P2-split2_weighted_answer_Qwen3-4B-Base_lr2e5_ep3_as1
Model Overview
Hyeongwon/P2-split2_weighted_answer_Qwen3-4B-Base_lr2e5_ep3_as1 is a 4 billion parameter language model, fine-tuned from the base model Hyeongwon/Qwen3-4B-Base. This model was developed using the TRL (Transformer Reinforcement Learning) framework, specifically through a Supervised Fine-Tuning (SFT) process.
Key Capabilities
- Question Answering: The model is fine-tuned to generate responses to open-ended questions, as illustrated by its quick start example, which answers a question about a hypothetical scenario.
- Base Model: Built upon the Qwen3-4B-Base architecture, providing a robust foundation for language understanding and generation.
- Context Length: Supports a substantial context window of 32768 tokens, enabling it to process and generate text based on longer input prompts.
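The capabilities above can be exercised with the standard `transformers` text-generation pipeline. This is a minimal sketch: the model ID comes from this card, but the question, the `build_messages` helper, and the sampling settings are illustrative assumptions, not part of the released card.

```python
from transformers import pipeline

# Model ID from this card; everything else below is illustrative.
MODEL_ID = "Hyeongwon/P2-split2_weighted_answer_Qwen3-4B-Base_lr2e5_ep3_as1"


def build_messages(question: str) -> list:
    """Wrap an open-ended question in the chat format the pipeline expects."""
    return [{"role": "user", "content": question}]


# Loading the pipeline downloads the 4-billion-parameter weights on first use.
generator = pipeline("text-generation", model=MODEL_ID)

# A hypothetical open-ended question, in the spirit of the quick start.
messages = build_messages(
    "If you could redesign one everyday object, what would it be and why?"
)
result = generator(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])
```

Because the model supports a 32768-token window, longer prompts can be passed the same way without manual truncation in most cases.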
Training Details
The model's training involved Supervised Fine-Tuning (SFT) using the TRL library. The development environment included TRL 0.25.1, Transformers 4.57.3, PyTorch 2.9.1, Datasets 3.6.0, and Tokenizers 0.22.2. Further details on the training procedure can be visualized via the provided Weights & Biases link.