# Model Overview
Hyeongwon/P2-split2_prob_Qwen3-8B-Base_0325-05-bs128-epoch6 is an 8-billion-parameter language model based on the Qwen3-8B-Base architecture. It has undergone supervised fine-tuning (SFT) with the TRL framework (version 0.25.1). As its name indicates, training used a batch size of 128 over 6 epochs.
## Key Capabilities
- Text Generation: Capable of generating coherent and contextually relevant text based on provided prompts.
- Fine-tuned Performance: Benefits from SFT, which typically enhances performance on specific tasks or improves instruction following compared to base models.
- Extended Context Window: Supports a 32,768-token context window, allowing it to process and generate long sequences of text.
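As a rough illustration of working within the 32,768-token window, the sketch below truncates an over-long prompt before generation. The whitespace-based token count is a stand-in assumption; in practice you would count tokens with the model's own tokenizer (e.g. via `AutoTokenizer`).

```python
# Sketch: guard a prompt against the model's 32768-token context window.
# NOTE: len(text.split()) is a crude stand-in for real tokenization;
# use the model's tokenizer in practice.

MAX_CONTEXT = 32768

def truncate_prompt(text: str, reserve_for_output: int = 1024) -> str:
    """Keep the most recent tokens so prompt + generated output fit the window."""
    budget = MAX_CONTEXT - reserve_for_output
    words = text.split()
    if len(words) <= budget:
        return text
    # Keep the tail of the prompt, i.e. the most recent context.
    return " ".join(words[-budget:])

short = truncate_prompt("hello world")            # fits, returned unchanged
clipped = truncate_prompt(" ".join(str(i) for i in range(40000)))
```

Keeping the tail rather than the head is a design choice that suits chat-style inputs, where the most recent turns matter most.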
## Training Details
The model was fine-tuned from ChuGyouk/Qwen3-8B-Base using the TRL library; the training run can be visualized on Weights & Biases. Framework versions: Transformers 4.57.3, PyTorch 2.6.0, Datasets 3.6.0, and Tokenizers 0.22.2.
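The reported setup could be reproduced with a TRL `SFTTrainer` configuration along the lines below. Only the base checkpoint, batch size (128), and epoch count (6) come from the card; the output directory, dataset, GPU count, and every other hyperparameter are illustrative assumptions.

```python
# Configuration sketch for an SFT run matching the reported setup (TRL 0.25.1).
# Dataset and output paths are placeholders; the per-device batch size assumes
# 8 GPUs (16 x 8 = effective batch size 128).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

config = SFTConfig(
    output_dir="qwen3-8b-sft",          # placeholder
    per_device_train_batch_size=16,     # assumption: 8 GPUs -> effective 128
    gradient_accumulation_steps=1,
    num_train_epochs=6,
    report_to="wandb",                  # training run logged to Weights & Biases
)

trainer = SFTTrainer(
    model="ChuGyouk/Qwen3-8B-Base",
    args=config,
    train_dataset=load_dataset("trl-lib/Capybara", split="train"),  # placeholder dataset
)
trainer.train()
```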
## Good For
- General-purpose text generation tasks.
- Applications that need an 8-billion-parameter model with a large context window.
- Further experimentation or fine-tuning for specific downstream applications.