Hyeongwon/P2-split1_prob_Qwen3-4B-Base_0312-01

Text Generation · Model Size: 4B · Quantization: BF16 · Context Length: 32k · Published: Mar 12, 2026 · Architecture: Transformer

Hyeongwon/P2-split1_prob_Qwen3-4B-Base_0312-01 is a 4-billion-parameter causal language model developed by Hyeongwon, fine-tuned from Hyeongwon/Qwen3-4B-Base. It was trained with Supervised Fine-Tuning (SFT) using the TRL framework and is intended for general text generation tasks.


Overview

P2-split1_prob_Qwen3-4B-Base_0312-01 is a fine-tuned version of Hyeongwon/Qwen3-4B-Base, a 4-billion-parameter model built on the Qwen3 architecture. Fine-tuning was performed with the TRL library using Supervised Fine-Tuning (SFT).

Key Characteristics

  • Base Model: Fine-tuned from Hyeongwon/Qwen3-4B-Base.
  • Training Framework: Fine-tuned with the TRL (Transformer Reinforcement Learning) library.
  • Training Method: Supervised Fine-Tuning (SFT); see the sketch after this list.
  • Parameter Count: Features 4 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a context length of 32768 tokens.
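
The card does not document the training data or hyperparameters. As a rough illustration of how a TRL SFT run is typically set up, the sketch below uses a placeholder dataset (`trl-lib/Capybara`) and default `SFTConfig` values; neither reflects the actual training recipe for this model.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: the real fine-tuning data is not documented in this card.
dataset = load_dataset("trl-lib/Capybara", split="train")

# Default SFT settings; the actual hyperparameters for this model are unknown.
training_args = SFTConfig(output_dir="P2-split1_prob_Qwen3-4B-Base_0312-01")

trainer = SFTTrainer(
    model="Hyeongwon/Qwen3-4B-Base",  # base model named in this card
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```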

Usage

This model is suitable for general text generation tasks, building on the foundational capabilities of Qwen3-4B-Base. It can be loaded with the transformers library; a quick-start example follows.
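
A minimal quick-start sketch with transformers is shown below. The prompt is illustrative, and bfloat16 plus `device_map="auto"` (which requires the accelerate package) are reasonable defaults rather than documented requirements.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Hyeongwon/P2-split1_prob_Qwen3-4B-Base_0312-01"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 weights listed above
    device_map="auto",           # requires the accelerate package
)

prompt = "Explain supervised fine-tuning in one paragraph."  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```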