Hyeongwon/P19-split3-prob-9x-bs512-lr2e5-zero3-ep3

Hugging Face
Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: May 7, 2026 | Architecture: Transformer | Warm

Hyeongwon/P19-split3-prob-9x-bs512-lr2e5-zero3-ep3 is a 4 billion parameter language model fine-tuned from Hyeongwon/Qwen3-4B-Base. This model was trained using Supervised Fine-Tuning (SFT) with the TRL library. It is designed for general text generation tasks, building upon the base capabilities of the Qwen3 architecture. The model has a context length of 32768 tokens.


Model Overview

Hyeongwon/P19-split3-prob-9x-bs512-lr2e5-zero3-ep3 is a 4 billion parameter language model fine-tuned from the Hyeongwon/Qwen3-4B-Base checkpoint. Training used Supervised Fine-Tuning (SFT) as implemented in the TRL library.

Key Capabilities

  • Text Generation: Designed for general text generation tasks, building on the foundational capabilities of the Qwen3-4B-Base model.
  • Fine-tuned Performance: Produced by a supervised fine-tuning run whose training progress can be inspected via its Weights & Biases run.
  • Context Length: Supports a context window of 32768 tokens, which allows longer inputs and longer generated outputs.
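In a decoder-only model of this kind, the prompt and the generated tokens typically share the same context window, so the room left for generation shrinks as the prompt grows. A minimal sketch of that arithmetic, assuming the prompt has already been tokenized and its length is known; the helper name `remaining_generation_budget` is hypothetical and not part of any library:

```python
CONTEXT_LENGTH = 32768  # context length stated on this model card


def remaining_generation_budget(prompt_tokens: int,
                                context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many new tokens still fit after a prompt of `prompt_tokens` tokens."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room at all.
    return max(context_length - prompt_tokens, 0)


print(remaining_generation_budget(30000))  # 2768
```

A value like this could be passed as an upper bound on `max_new_tokens` when generating.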

Training Details

The model was trained using Supervised Fine-Tuning (SFT) with the following framework versions:

  • TRL: 0.25.1
  • Transformers: 4.57.3
  • PyTorch: 2.9.1
  • Datasets: 3.6.0
  • Tokenizers: 0.22.2
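To see how a local environment compares with the versions listed above, something like the following could be used. It is a sketch that relies only on the standard library; the mapping from framework names to PyPI distribution names (for example `torch` for PyTorch) is an assumption, and the helper `compare_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Versions listed on this model card, keyed by assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "trl": "0.25.1",
    "transformers": "4.57.3",
    "torch": "2.9.1",
    "datasets": "3.6.0",
    "tokenizers": "0.22.2",
}


def compare_versions(expected: dict) -> dict:
    """Map each package name to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        installed: Optional[str]
        try:
            # Drop local build suffixes such as "+cu128" before comparing.
            installed = version(name).split("+")[0]
        except PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions(EXPECTED_VERSIONS).items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

Matching versions are not required for inference; the listing records what was used during training.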

Usage

The model can be loaded for text generation with the Hugging Face transformers library.
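A minimal sketch of such usage, assuming `transformers` and PyTorch are installed and the weights can be downloaded from the Hub. The prompt, the `max_new_tokens` value, and the helper names `build_generator` and `generate` are illustrative assumptions, not taken from the model card. Precision and device placement are left at the library defaults, and a plain string prompt is used because the card does not state whether the tokenizer ships a chat template.

```python
MODEL_ID = "Hyeongwon/P19-split3-prob-9x-bs512-lr2e5-zero3-ep3"


def build_generator(model_id: str = MODEL_ID):
    """Create a text-generation pipeline; weights are downloaded on first use."""
    # Imported lazily so this file can be imported without loading transformers.
    from transformers import pipeline

    return pipeline("text-generation", model=model_id)


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a continuation for `prompt` and return only the new text."""
    generator = build_generator()
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        return_full_text=False,  # omit the prompt from the returned text
    )
    return outputs[0]["generated_text"]


if __name__ == "__main__":
    # Illustrative prompt; any text works.
    print(generate("Write a short paragraph about the ocean."))
```

For repeated calls it would be cheaper to call `build_generator()` once and reuse the returned pipeline rather than rebuilding it for every prompt.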