Hyeongwon/PH_prob_mini_Qwen3-8B-Base_0305-01

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Mar 5, 2026 · Architecture: Transformer

PH_prob_mini_Qwen3-8B-Base_0305-01 is an 8-billion-parameter language model developed by Hyeongwon, fine-tuned from ChuGyouk/Qwen3-8B-Base. It was trained with Supervised Fine-Tuning (SFT) using the TRL framework and offers a 32,768-token context length. The model targets general text generation tasks and builds on the Qwen3 architecture.


Model Overview

PH_prob_mini_Qwen3-8B-Base_0305-01 is a fine-tuned version of ChuGyouk/Qwen3-8B-Base and inherits the Qwen3 transformer architecture. Training used Supervised Fine-Tuning (SFT) with the TRL framework, an approach typically aimed at instruction following or a specific downstream task.
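
The card does not include usage code. As a minimal sketch, a causal language model hosted on the Hugging Face Hub can usually be loaded and queried with Transformers as follows; the dtype and device settings are illustrative assumptions, not values from the card:

```python
# Minimal sketch: load the model from the Hub and generate text.
# dtype/device settings are assumptions; adjust for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Hyeongwon/PH_prob_mini_Qwen3-8B-Base_0305-01"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 on a recent GPU
    device_map="auto",
)

prompt = "Explain supervised fine-tuning in one paragraph:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```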

Key Capabilities

  • Text Generation: Capable of generating coherent and contextually relevant text based on provided prompts.
  • Instruction Following: Fine-tuned with SFT, suggesting improved performance on instruction-based tasks.
  • Large Context Window: Supports a context length of 32,768 tokens, allowing longer inputs and outputs to be processed in a single pass (see the sketch after this list).
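
To illustrate the long-context capability, the sketch below feeds a long local document through the high-level pipeline API and asks for a summary. The file path, prompt, and generation settings are hypothetical; only the repo id comes from this card:

```python
# Illustrative long-context use via the text-generation pipeline.
# "long_report.txt" is a hypothetical input of up to ~32k tokens.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Hyeongwon/PH_prob_mini_Qwen3-8B-Base_0305-01",
    torch_dtype="auto",
    device_map="auto",
)

with open("long_report.txt") as f:
    document = f.read()

result = generator(
    f"{document}\n\nSummarize the document above in three bullet points:",
    max_new_tokens=256,
    return_full_text=False,  # return only the newly generated text
)
print(result[0]["generated_text"])
```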

Training Details

The model was trained with Supervised Fine-Tuning (SFT) using the TRL library; a minimal training sketch follows the version list. Development used these framework versions:

  • TRL: 0.25.1
  • Transformers: 4.57.3
  • PyTorch: 2.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.22.2
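
The card does not disclose the training script, dataset, or hyperparameters. The following is a minimal sketch of SFT with TRL's SFTTrainer under stated assumptions: the dataset is a placeholder from the TRL documentation, and all hyperparameters are illustrative. Only the base model id comes from this card:

```python
# Minimal SFT sketch with TRL. Dataset and hyperparameters are placeholders;
# the actual training configuration is not published on this card.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")  # placeholder dataset

config = SFTConfig(
    output_dir="PH_prob_mini_Qwen3-8B-Base_0305-01",
    max_length=4096,                  # illustrative; the model supports up to 32k
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)

trainer = SFTTrainer(
    model="ChuGyouk/Qwen3-8B-Base",   # base model named on this card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```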

Good For

  • General-purpose text generation tasks.
  • Applications requiring a model with a substantial context window.
  • Further fine-tuning for specialized downstream applications.