Hyeongwon/P2-split2_prob_Qwen3-8B-Base_0325-02-lr1e-5

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Mar 25, 2026 · Architecture: Transformer

Hyeongwon/P2-split2_prob_Qwen3-8B-Base_0325-02-lr1e-5 is an 8-billion-parameter language model fine-tuned from ChuGyouk/Qwen3-8B-Base using Supervised Fine-Tuning (SFT) with the TRL library. The model targets general text generation and supports a 32,768-token context length for processing longer inputs. Its training adapts the base Qwen3-8B architecture for specific probabilistic text generation applications.


Model Overview

Hyeongwon/P2-split2_prob_Qwen3-8B-Base_0325-02-lr1e-5 is an 8-billion-parameter language model fine-tuned from the ChuGyouk/Qwen3-8B-Base checkpoint. It was developed with Supervised Fine-Tuning (SFT), implemented using the TRL (Transformer Reinforcement Learning) library.

Key Capabilities

  • Text Generation: Optimized for generating coherent and contextually relevant text from provided prompts; a usage sketch follows this list.
  • Base Model Enhancement: Builds upon the capabilities of the Qwen3-8B-Base model, adapting it through SFT for specific applications.
  • Context Handling: Features a substantial context window of 32768 tokens, allowing it to process and generate longer sequences of text.
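The snippet below is a minimal inference sketch using the Hugging Face transformers library. The prompt, dtype/device settings, and sampling hyperparameters are illustrative assumptions and are not specified by this model card.

```python
# Minimal text-generation sketch (assumed usage; not an official recipe from the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Hyeongwon/P2-split2_prob_Qwen3-8B-Base_0325-02-lr1e-5"

# Load tokenizer and model; dtype and device placement are example choices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Explain supervised fine-tuning in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a continuation; sampling parameters are example values.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```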

Training Details

The model underwent a Supervised Fine-Tuning (SFT) process. The training utilized specific versions of key frameworks:

  • TRL: 0.25.1
  • Transformers: 4.57.3
  • PyTorch: 2.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.22.2

This fine-tuning approach aims to specialize the base model for improved performance in its intended use cases.
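As an illustration of what an SFT run with the listed TRL version typically looks like, here is a hedged sketch using trl.SFTTrainer. The dataset, output path, and all hyperparameters (including the 1e-5 learning rate suggested by the model name) are assumptions; the actual training data and configuration are not disclosed in this card.

```python
# Hedged SFT sketch with TRL's SFTTrainer; dataset and hyperparameters are
# illustrative assumptions, not the recipe actually used for this checkpoint.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: any dataset in a format SFTTrainer understands works here.
train_dataset = load_dataset("trl-lib/Capybara", split="train")

config = SFTConfig(
    output_dir="P2-split2_prob_Qwen3-8B-Base-sft",  # hypothetical output path
    learning_rate=1e-5,              # matches the "lr1e-5" suffix in the model name
    num_train_epochs=1,              # example value
    per_device_train_batch_size=1,   # example value
    gradient_accumulation_steps=8,   # example value
)

trainer = SFTTrainer(
    model="ChuGyouk/Qwen3-8B-Base",  # the stated base model
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```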