Hyeongwon/P2-split3_prob_Qwen3-1.7B-Base_0325-01

Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Published: May 20, 2026 | Architecture: Transformer | Warm

Hyeongwon/P2-split3_prob_Qwen3-1.7B-Base_0325-01 is a causal language model listed at 2 billion parameters (1.7B per the base model name), fine-tuned from Hyeongwon/Qwen3-1.7B-Base with supervised fine-tuning (SFT). It is intended for text generation and has a context length of 32,768 tokens, which suits applications that process long inputs.


Model Overview

Hyeongwon/P2-split3_prob_Qwen3-1.7B-Base_0325-01 was developed by Hyeongwon by fine-tuning the Hyeongwon/Qwen3-1.7B-Base model. Training used Supervised Fine-Tuning (SFT) with the TRL library. The model is listed at 2 billion parameters and supports a context length of 32,768 tokens for text generation tasks.
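As an illustration of how a causal language model like this one is typically loaded, here is a minimal sketch using the Transformers library. It assumes the checkpoint is available on the Hugging Face Hub under the model ID above and loads with the standard AutoModelForCausalLM and AutoTokenizer classes; the helper name generate and its defaults are hypothetical, not part of the model card.

```python
MODEL_ID = "Hyeongwon/P2-split3_prob_Qwen3-1.7B-Base_0325-01"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Hypothetical helper: load the model and continue `prompt`.

    Assumes the checkpoint can be downloaded from the Hub and that
    `torch` and `transformers` are installed.
    """
    # Imported lazily so the module can be inspected without the heavy deps.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the quantization listed for this model.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the continuation is returned.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling generate("Once upon a time") would download the weights on first use and return the model's continuation of the prompt. Because the model is fine-tuned from a base (non-chat) checkpoint, the sketch passes a plain text prompt rather than applying a chat template; that choice is an assumption.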

Key Capabilities

  • Text Generation: Generates text continuations from a provided prompt.
  • Fine-tuned Performance: Trained with SFT on top of Hyeongwon/Qwen3-1.7B-Base to adapt the base model to its target use.
  • Extended Context Window: Handles up to 32,768 tokens, counting the prompt and generated tokens together, which suits tasks with long-range dependencies or detailed context.
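Because the prompt and the generated tokens share the 32,768-token window, a long prompt has to leave room for the output. The sketch below shows one way to budget for that. The helper name fit_to_context is hypothetical, and keeping the most recent tokens (rather than the earliest) is an assumption about what the application wants.

```python
CONTEXT_LENGTH = 32768  # context length listed for this model


def fit_to_context(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Trim a prompt so prompt + generated tokens fit in the context window.

    Keeps the most recent tokens and drops the oldest ones (an assumption;
    some applications may prefer to keep the beginning instead).
    """
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    if len(token_ids) <= budget:
        return list(token_ids)
    return list(token_ids[-budget:])
```

For example, a 40,000-token prompt with max_new_tokens=768 is trimmed to its last 32,000 tokens, while a prompt that already fits is returned unchanged.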

Training Details

The model was trained with Supervised Fine-Tuning (SFT) using the TRL framework. Framework versions were TRL 0.25.1, Transformers 4.57.3, PyTorch 2.6.0, Datasets 3.6.0, and Tokenizers 0.22.2. Further details on the training run can be found on Weights & Biases.
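To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch, not part of the original training setup: the helper name find_mismatches is hypothetical, and the PyPI distribution names (for example torch for PyTorch) are assumptions about how the packages were installed.

```python
from importlib import metadata

# Versions listed in the training details; keys are assumed PyPI names.
EXPECTED_VERSIONS = {
    "trl": "0.25.1",
    "transformers": "4.57.3",
    "torch": "2.6.0",
    "datasets": "3.6.0",
    "tokenizers": "0.22.2",
}


def find_mismatches(expected=EXPECTED_VERSIONS, get_version=metadata.version):
    """Return {package: installed_version_or_None} for packages that differ."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            # Strip local build tags such as "+cu124" on PyTorch wheels.
            installed = get_version(package).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed
        if installed != wanted:
            mismatches[package] = installed
    return mismatches
```

An empty result from find_mismatches() means the installed packages match the listed versions. A mismatch does not necessarily prevent the model from loading; it only indicates that the environment differs from the one reported for training.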