Hyeongwon/P2-split2_prob_Qwen3-14B-Base_0405

Text Generation · Model Size: 14B · Quantization: FP8 · Context Length: 32k · Concurrency Cost: 1 · Published: Apr 5, 2026 · Architecture: Transformer

Hyeongwon/P2-split2_prob_Qwen3-14B-Base_0405 is a 14 billion parameter language model fine-tuned from Qwen/Qwen3-14B-Base. Developed by Hyeongwon, this model leverages the TRL framework for its training procedure. It is designed for general text generation tasks, building upon the foundational capabilities of the Qwen3-14B-Base architecture with a 32768 token context length.


Overview

This model, P2-split2_prob_Qwen3-14B-Base_0405, is a 14 billion parameter language model developed by Hyeongwon. It is a fine-tuned version of Qwen/Qwen3-14B-Base, trained with Supervised Fine-Tuning (SFT) using the TRL (Transformer Reinforcement Learning) framework.

Key Capabilities

  • Text Generation: Capable of generating coherent and contextually relevant text based on given prompts.
  • Foundation Model: Builds upon the robust architecture and pre-training of the Qwen3-14B-Base model.
  • Fine-tuned Performance: Benefits from targeted SFT, which may enhance performance on specific tasks or response styles.

Training Details

The model's training procedure involved Supervised Fine-Tuning (SFT) using the TRL library. The development environment included TRL 0.25.1, Transformers 4.57.3, PyTorch 2.6.0, Datasets 3.6.0, and Tokenizers 0.22.2. Further details on the training run can be visualized via Weights & Biases.
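The exact training configuration (dataset, hyperparameters, formatting) has not been published, so the following is only a minimal sketch of what an SFT run with TRL's SFTTrainer typically looks like. The dataset name, batch size, learning rate, and output directory below are illustrative assumptions, not the author's actual settings.

```python
# Minimal SFT sketch with TRL; all specifics below are assumed for illustration.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset from the TRL documentation examples, not the dataset
# actually used to train this model.
dataset = load_dataset("trl-lib/Capybara", split="train")

training_args = SFTConfig(
    output_dir="P2-split2_prob_Qwen3-14B-Base_0405",  # assumed output name
    per_device_train_batch_size=1,                     # assumed hyperparameters
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    num_train_epochs=1,
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-14B-Base",  # the stated base checkpoint
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```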

Use Cases

This model is suitable for text generation applications where a 14 billion parameter model with a 32768 token context window is appropriate; the SFT stage is intended to improve output quality over the raw base model.
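As a rough illustration of how the model could be loaded and queried, here is a minimal sketch using the Hugging Face transformers library. The prompt and generation settings (such as max_new_tokens) are assumptions for demonstration, not recommendations from the model author.

```python
# Minimal inference sketch with transformers; requires accelerate for device_map="auto".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Hyeongwon/P2-split2_prob_Qwen3-14B-Base_0405"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load weights in the checkpoint's native precision
    device_map="auto",    # spread the 14B parameters across available devices
)

# Example prompt; choose any text-generation prompt appropriate to your task.
prompt = "Explain the difference between pre-training and supervised fine-tuning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```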