Hyeongwon/P9-split1_prob_Qwen3-4B-Base_0317-01

Text generation · Model size: 4B parameters · Quantization: BF16 · Context length: 32K · Published: Mar 16, 2026 · Architecture: Transformer

Hyeongwon/P9-split1_prob_Qwen3-4B-Base_0317-01 is a 4-billion-parameter language model fine-tuned from Hyeongwon/Qwen3-4B-Base using supervised fine-tuning (SFT) with the TRL library. It targets general text generation and leverages a 32K-token context length for processing longer inputs, making it a moderately sized, fine-tuned base suitable for a range of natural language processing applications.


Model Overview

Hyeongwon/P9-split1_prob_Qwen3-4B-Base_0317-01 is a 4-billion-parameter language model fine-tuned by Hyeongwon from the base model Hyeongwon/Qwen3-4B-Base. It was trained with supervised fine-tuning (SFT) using the TRL library.
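
A minimal usage sketch with the Transformers library is shown below; the prompt and sampling parameters are illustrative assumptions, not settings published with this model.

```python
# Minimal generation sketch for this model using Hugging Face Transformers.
# The prompt and sampling parameters are illustrative, not published defaults.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Hyeongwon/P9-split1_prob_Qwen3-4B-Base_0317-01"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
    device_map="auto",           # place layers on available accelerator(s)
)

prompt = "Explain supervised fine-tuning in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```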

Key Capabilities

  • Text Generation: Optimized for generating coherent and contextually relevant text based on user prompts.
  • Fine-tuned Performance: Benefits from SFT, suggesting improved performance on tasks aligned with its training data compared to its base model.
  • Context Length: Features a 32,768-token context window, enabling it to process and generate longer sequences of text; see the budgeting sketch after this list.
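
As a rough illustration, the sketch below shows one way to budget that window: count the prompt's tokens, then cap generation so prompt plus output stay within 32,768. The constant and placeholder input are assumptions for illustration.

```python
# Sketch: budgeting the 32,768-token context window before generation.
from transformers import AutoTokenizer

model_id = "Hyeongwon/P9-split1_prob_Qwen3-4B-Base_0317-01"
tokenizer = AutoTokenizer.from_pretrained(model_id)

MAX_CONTEXT = 32_768  # context length listed on this card

long_document = "..."  # placeholder: substitute a long input passage
prompt_tokens = len(tokenizer(long_document)["input_ids"])

# Whatever the prompt doesn't use is the headroom left for generation.
max_new_tokens = max(0, MAX_CONTEXT - prompt_tokens)
print(f"{prompt_tokens} prompt tokens; up to {max_new_tokens} tokens of generation headroom")
```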

Training Details

The model was trained with SFT using TRL 0.25.1, Transformers 4.57.3, PyTorch 2.6.0, Datasets 3.6.0, and Tokenizers 0.22.2. The training run can be visualized via Weights & Biases; an illustrative SFT setup is sketched below.
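
For reference, here is a hedged sketch of how an SFT run with TRL's SFTTrainer is typically set up. The dataset, hyperparameters, and output directory are illustrative assumptions; the actual training data and configuration were not published with this card.

```python
# Illustrative SFT setup with TRL's SFTTrainer (TRL >= 0.25 API).
# The dataset and hyperparameters are assumptions, not this model's actual config.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Example public conversational dataset; the model's real training data is unknown.
dataset = load_dataset("trl-lib/Capybara", split="train")

config = SFTConfig(
    output_dir="sft-qwen3-4b-example",  # hypothetical output path
    max_length=32_768,                  # align packing with the 32K context window
    per_device_train_batch_size=1,
    report_to="wandb",                  # the card notes logging to Weights & Biases
)

trainer = SFTTrainer(
    model="Hyeongwon/Qwen3-4B-Base",  # base model named on this card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```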

Good For

  • General Text Generation: Suitable for a wide range of applications requiring text completion, question answering, or creative writing.
  • Research and Development: Provides a fine-tuned 4B parameter model for further experimentation or as a base for domain-specific adaptations.
  • Moderate-Context Applications: The 32K context length makes it effective for tasks where understanding and generating longer passages is crucial.