Hyeongwon/P2-split2_prob_Qwen3-8B-Base_0325-03-bs128

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Mar 25, 2026Architecture:Transformer Warm

Hyeongwon/P2-split2_prob_Qwen3-8B-Base_0325-03-bs128 is an 8 billion parameter language model, fine-tuned by Hyeongwon from the ChuGyouk/Qwen3-8B-Base architecture. This model was trained using Supervised Fine-Tuning (SFT) with the TRL framework, building upon a 32768 token context length. It is designed for general text generation tasks, leveraging its base Qwen3 architecture for broad applicability.

Loading preview...

Model Overview

Hyeongwon/P2-split2_prob_Qwen3-8B-Base_0325-03-bs128 is an 8 billion parameter language model, fine-tuned from the ChuGyouk/Qwen3-8B-Base model. This iteration was developed by Hyeongwon and utilizes a 32768 token context length, making it suitable for processing moderately long inputs.

Training Details

The model underwent Supervised Fine-Tuning (SFT) using the TRL library. The training process leveraged specific versions of key frameworks:

  • TRL: 0.25.1
  • Transformers: 4.57.3
  • Pytorch: 2.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.22.2

Training progress and metrics were tracked using Weights & Biases, as indicated by the provided link to the run lgt2bbk0 within the aitrics-class-imbalanced-rl-P12 project.

Key Capabilities

  • Text Generation: Capable of generating coherent and contextually relevant text based on user prompts.
  • Fine-tuned Performance: Benefits from SFT, which typically enhances performance on specific tasks or improves adherence to instructions compared to base models.

Intended Use Cases

This model is suitable for a variety of text generation applications where a fine-tuned 8B parameter model with a substantial context window is beneficial. Developers can integrate it using the Hugging Face pipeline for quick deployment in tasks such as:

  • Answering open-ended questions
  • Creative writing prompts
  • General conversational AI
  • Content creation requiring moderate context understanding