Hyeongwon/P2-split2_prob_rg_v2_Qwen3-4B-Base-0415

Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Apr 15, 2026 | Architecture: Transformer

Hyeongwon/P2-split2_prob_rg_v2_Qwen3-4B-Base-0415 is a 4-billion-parameter language model developed by Hyeongwon, fine-tuned from the Qwen3-4B-Base checkpoint. It was trained with Supervised Fine-Tuning (SFT) using the TRL framework and targets text generation tasks. Its 32,768-token (32K) context window makes it suitable for applications that require extensive contextual understanding.


Model Overview

Hyeongwon/P2-split2_prob_rg_v2_Qwen3-4B-Base-0415 is a 4-billion-parameter fine-tune of Qwen3-4B-Base by Hyeongwon. Its 32,768-token context window lets it process and generate coherent text over long inputs.

Key Capabilities

  • Text Generation: Optimized for generating responses to given prompts, including conversational questions (see the quick-start sketch after this list).
  • Fine-tuned Performance: Developed through Supervised Fine-Tuning (SFT) using the TRL library, indicating a focus on specific task performance rather than broad pre-training.
  • Extensive Context Window: Benefits from a 32K context length, allowing for more detailed and contextually aware outputs.
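
A minimal quick-start sketch, assuming the checkpoint loads through the standard Hugging Face transformers API and ships a chat template from its SFT run; the prompt and generation settings here are illustrative, not taken from the card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Hyeongwon/P2-split2_prob_rg_v2_Qwen3-4B-Base-0415"

# Load in bfloat16 to match the BF16 weights listed above
# (`dtype` requires transformers >= 4.56; use `torch_dtype` on older versions).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    dtype=torch.bfloat16,
    device_map="auto",
)

# Illustrative conversational prompt.
messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Decode only the newly generated tokens, skipping the prompt.
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```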

Training Details

The model was trained using SFT, with training runs tracked in Weights & Biases. The training used the following framework versions (a minimal sketch of the setup follows the list):

  • TRL: 0.25.1
  • Transformers: 4.57.3
  • PyTorch: 2.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.22.2
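
A minimal sketch of such an SFT run, assuming TRL's standard SFTTrainer API at version 0.25; the dataset shown is a public placeholder, since the card does not name the actual training data:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset; the actual SFT data is not named in the card.
dataset = load_dataset("trl-lib/Capybara", split="train")

config = SFTConfig(
    output_dir="P2-split2_prob_rg_v2_Qwen3-4B-Base-0415",
    max_length=32768,    # match the model's 32K context window
    report_to="wandb",   # the card notes runs were tracked with Weights & Biases
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B-Base",  # base checkpoint named in the card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```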

Recommended Use Cases

This model is well-suited for applications requiring robust text generation, particularly those that benefit from long input contexts. Its fine-tuned nature suggests potential for specialized conversational agents or content-creation tools.