Hyeongwon/P12-split2-one-sided-bs64-lr2e5-zero3-ep3

Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: May 22, 2026 | Architecture: Transformer | Warm

Hyeongwon/P12-split2-one-sided-bs64-lr2e5-zero3-ep3 is a 4 billion parameter language model fine-tuned from Hyeongwon/Qwen3-4B-Base. This model was trained using Supervised Fine-Tuning (SFT) with the TRL library. It is designed for general text generation tasks, building upon the capabilities of its Qwen3-4B-Base foundation. The model has a context length of 32768 tokens.

Model Overview

Hyeongwon/P12-split2-one-sided-bs64-lr2e5-zero3-ep3 is a 4 billion parameter language model fine-tuned from the Hyeongwon/Qwen3-4B-Base model. It was trained with Supervised Fine-Tuning (SFT) using the TRL library.

Key Characteristics

  • Base Model: Fine-tuned from Hyeongwon/Qwen3-4B-Base, inheriting its foundational capabilities.
  • Training Method: Supervised Fine-Tuning (SFT) with the TRL library.
  • Parameter Count: 4 billion parameters, listed with BF16 weights.
  • Context Length: Supports a context window of 32768 tokens, which allows longer inputs and extended responses.
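
The figures above translate into two quick planning checks: roughly how much memory the weights need, and whether a prompt plus the requested output fits in the context window. The following is a minimal sketch, assuming the rounded 4 billion parameter count and BF16 storage listed on this page; the helper names are illustrative and not part of any library.

```python
# Values taken from the model card; helper names are hypothetical.
CONTEXT_LENGTH = 32768         # tokens, as stated on the card
PARAM_COUNT = 4_000_000_000    # "4 billion parameters" (rounded figure)
BYTES_PER_BF16 = 2             # bfloat16 stores each value in 16 bits


def weight_bytes(param_count: int = PARAM_COUNT,
                 bytes_per_param: int = BYTES_PER_BF16) -> int:
    """Approximate memory for the weights alone (no activations, no KV cache)."""
    return param_count * bytes_per_param


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """True if the prompt plus the requested generation stays inside the window."""
    return prompt_tokens + max_new_tokens <= context_length


def max_new_tokens_for(prompt_tokens: int,
                       context_length: int = CONTEXT_LENGTH) -> int:
    """Largest generation length that still fits after the prompt."""
    return max(0, context_length - prompt_tokens)


if __name__ == "__main__":
    print(f"weights: about {weight_bytes() / 1e9:.0f} GB in BF16")
    print("30000-token prompt leaves", max_new_tokens_for(30000), "tokens")
```

The weight estimate is a lower bound on actual memory use: activations and the key-value cache grow with sequence length and are not counted here.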

Intended Use Cases

This model is suitable for a variety of text generation tasks. Developers can integrate it into applications requiring:

  • General-purpose text generation.
  • Question answering based on provided context.
  • Conversational AI where a large context window is advantageous.
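
As an illustration of the context-based question answering use case, here is a minimal sketch using the Hugging Face `transformers` library. The plain-text prompt layout in `build_qa_prompt` is an assumption, since this page does not document a prompt or chat format for the model, and both function names are hypothetical. Loading in BF16 follows the quantization listed on this page.

```python
MODEL_ID = "Hyeongwon/P12-split2-one-sided-bs64-lr2e5-zero3-ep3"


def build_qa_prompt(context: str, question: str) -> str:
    """Assemble a plain-text QA prompt (assumed layout, not from the card)."""
    return (
        f"Context:\n{context.strip()}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )


def answer(context: str, question: str, max_new_tokens: int = 128) -> str:
    """Generate an answer; downloads the model weights on first use."""
    # Imported lazily so the prompt helper works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )

    inputs = tokenizer(build_qa_prompt(context, question), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(answer("The card lists a context length of 32768 tokens.",
                 "What context length does the card list?"))
```

The model is loaded inside `answer` for brevity; an application that makes repeated calls would load the tokenizer and model once and reuse them.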

Its SFT training with TRL and its 32768-token context window make it a candidate for applications that need context-aware generation over long inputs.