Hyeongwon/P12-split5-one-sided-bs64-lr2e5-zero3-ep3

Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: May 22, 2026 | Architecture: Transformer | Warm

Hyeongwon/P12-split5-one-sided-bs64-lr2e5-zero3-ep3 is a 4-billion-parameter language model fine-tuned from Hyeongwon/Qwen3-4B-Base, with a context length of 32768 tokens. It was trained with Supervised Fine-Tuning (SFT) using the TRL library and is intended for general text generation tasks.


Model Overview

Hyeongwon/P12-split5-one-sided-bs64-lr2e5-zero3-ep3 is a 4-billion-parameter language model fine-tuned from Hyeongwon/Qwen3-4B-Base. It supports a context length of 32768 tokens, which allows it to process and generate long sequences of text.
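As a rough illustration of what the 32768-token limit means in practice, the sketch below computes how many tokens remain for generation once a prompt of a given length has been placed in the window. It assumes that prompt tokens and generated tokens share the same context window, as is usual for decoder-only models; the helper name `remaining_budget` is hypothetical and not part of any library.

```python
# Context length stated in this model card.
CONTEXT_LENGTH = 32768


def remaining_budget(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens are left for generation after the prompt.

    Assumption: prompt and generated tokens share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Clamp at zero: a prompt longer than the window leaves no room at all.
    return max(context_length - prompt_tokens, 0)
```

The prompt length itself would come from the model's tokenizer; a prompt that already exceeds the window needs to be truncated or split before generation.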

Key Capabilities

  • Text Generation: Capable of generating coherent and contextually relevant text based on given prompts.
  • Supervised Fine-Tuning (SFT): The model was trained with SFT, meaning it learned from example texts in a training dataset rather than from a reward signal. The dataset is not described here.
  • TRL Framework: Trained with TRL (Transformer Reinforcement Learning), a library that also provides an SFT trainer. This model used SFT only, not reinforcement learning.
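For readers unfamiliar with SFT in TRL, a minimal sketch of such a run might look like the following. The hyperparameters (effective batch size 64, learning rate 2e-5, 3 epochs) are assumptions read from the "bs64-lr2e5-ep3" part of the model name, not values confirmed by this card. The device count, the `sft_hyperparameters` and `train` helpers, and the output directory are hypothetical, and the training dataset is unknown, so it is left as a parameter. The "zero3" part of the name suggests DeepSpeed ZeRO-3, which is not configured here.

```python
BASE_MODEL = "Hyeongwon/Qwen3-4B-Base"  # base model named in this card


def sft_hyperparameters(num_devices: int = 8, per_device_batch_size: int = 1) -> dict:
    """Build SFTConfig keyword arguments for an effective batch size of 64.

    Assumptions: batch size, learning rate, and epoch count are inferred from
    the model name; num_devices=8 is an arbitrary illustrative default.
    """
    effective_batch_size = 64
    per_step = num_devices * per_device_batch_size
    if per_step <= 0 or effective_batch_size % per_step != 0:
        raise ValueError("num_devices * per_device_batch_size must divide 64")
    return {
        "per_device_train_batch_size": per_device_batch_size,
        # Accumulate gradients so that devices * per-device * steps == 64.
        "gradient_accumulation_steps": effective_batch_size // per_step,
        "learning_rate": 2e-5,
        "num_train_epochs": 3,
    }


def train(train_dataset, output_dir: str = "sft-output"):
    """Run SFT on a caller-supplied dataset (the real dataset is not documented)."""
    # Imported lazily so the helper above can be used without TRL installed.
    from trl import SFTConfig, SFTTrainer

    config = SFTConfig(output_dir=output_dir, **sft_hyperparameters())
    trainer = SFTTrainer(model=BASE_MODEL, args=config, train_dataset=train_dataset)
    trainer.train()
    return trainer
```

This is a sketch of the general technique, not the author's training script.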

Good For

  • General-purpose text generation: Suitable for various applications requiring text completion, question answering, or creative writing.
  • Developers familiar with the Hugging Face ecosystem: Integrates with the transformers library and its pipeline API for quick deployment and experimentation.
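The pipeline integration mentioned above could be sketched as follows. This assumes the repository loads through the standard `text-generation` pipeline and takes a plain-text prompt; whether the model ships a chat template is not stated here. Data type and device placement are left at the library defaults. The helper names `build_generation_kwargs` and `generate` are hypothetical.

```python
MODEL_ID = "Hyeongwon/P12-split5-one-sided-bs64-lr2e5-zero3-ep3"


def build_generation_kwargs(max_new_tokens: int = 128, temperature: float = 0.7) -> dict:
    """Assemble generation arguments; temperature 0 selects greedy decoding."""
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    kwargs = {"max_new_tokens": max_new_tokens, "do_sample": temperature > 0}
    if kwargs["do_sample"]:
        # Temperature only applies when sampling is enabled.
        kwargs["temperature"] = temperature
    return kwargs


def generate(prompt: str, **overrides) -> str:
    """Generate a continuation of the prompt (downloads weights on first use)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(prompt, **build_generation_kwargs(**overrides))
    # By default the returned text includes the prompt followed by the continuation.
    return outputs[0]["generated_text"]
```

A call such as `generate("Write a short note about autumn.", max_new_tokens=64)` would return the prompt plus the continuation. Loading a 4B-parameter model needs several gigabytes of memory, so a GPU is advisable.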

Training Details

The model was trained with Supervised Fine-Tuning. The following framework versions were used:

  • TRL: 0.25.1
  • Transformers: 4.57.3
  • PyTorch: 2.9.1
  • Datasets: 3.6.0
  • Tokenizers: 0.22.2
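To compare a local environment against the versions listed above, a small check along these lines may help. It uses only the standard library. The distribution names are assumed to be the usual PyPI names (for example `torch` for PyTorch), and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this model card, keyed by assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "trl": "0.25.1",
    "transformers": "4.57.3",
    "torch": "2.9.1",
    "datasets": "3.6.0",
    "tokenizers": "0.22.2",
}


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare_versions(expected=EXPECTED_VERSIONS, lookup=installed_version) -> dict:
    """Report, per package, whether the installed version matches the listed one."""
    report = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found is None:
            report[package] = "missing"
        elif found.split("+")[0] == wanted:
            # Ignore local build suffixes such as "+cu128" on PyTorch wheels.
            report[package] = "match"
        else:
            report[package] = f"differs (installed {found})"
    return report
```

Running `compare_versions()` returns a dictionary such as `{"trl": "match", ...}`. A mismatch does not necessarily prevent inference; it only indicates that the environment differs from the one used for training.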