Hyeongwon/PH_prob_sft_FC_swap_labewise_data_oversampling_bf16_lr0.00002_context_12k-Qwen3-8B-Base
Hyeongwon/PH_prob_sft_FC_swap_labewise_data_oversampling_bf16_lr0.00002_context_12k-Qwen3-8B-Base is an 8-billion-parameter language model, fine-tuned from ChuGyouk/Qwen3-8B-Base using Supervised Fine-Tuning (SFT) with TRL. It targets general text generation tasks and supports a 32K-token context window. Training used bf16 precision, a learning rate of 0.00002, and data oversampling, aimed at stable performance in conversational AI and similar applications.
Model Overview
This model, PH_prob_sft_FC_swap_labewise_data_oversampling_bf16_lr0.00002_context_12k-Qwen3-8B-Base, is an 8 billion parameter language model developed by Hyeongwon. It is a fine-tuned version of the ChuGyouk/Qwen3-8B-Base model, utilizing Supervised Fine-Tuning (SFT) techniques implemented with the TRL library.
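A minimal inference sketch using the transformers library is shown below; the repository ID is taken from this card, while the prompt and generation settings are illustrative rather than values from training.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Hyeongwon/PH_prob_sft_FC_swap_labewise_data_oversampling_bf16_lr0.00002_context_12k-Qwen3-8B-Base"

# Load in bf16, matching the precision the model was trained in.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "Explain the difference between supervised fine-tuning and pretraining."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# max_new_tokens and sampling settings are illustrative defaults, not tuned values from the card.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```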
Key Characteristics
- Base Model: Fine-tuned from ChuGyouk/Qwen3-8B-Base.
- Training Method: Employs Supervised Fine-Tuning (SFT) for specialized performance.
- Context Length: Supports a substantial context window of 32,768 tokens, enabling processing of longer inputs and generating more coherent, extended responses.
- Training Details: The training procedure used bf16 precision, a learning rate of 0.00002, and data oversampling, suggesting an effort to enhance model stability and performance on particular data distributions (a configuration sketch follows this list).
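The exact training data and oversampling logic are not published in this card; the following is a minimal sketch of how the named hyperparameters (bf16 precision, learning rate 0.00002, and a 12K training sequence length implied by "context_12k" in the model name) would be expressed with TRL's SFTTrainer. The dataset, batch size, and output directory are placeholders, not the actual recipe.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset; the actual (oversampled) SFT data is not specified in this card.
train_dataset = load_dataset("trl-lib/Capybara", split="train")

config = SFTConfig(
    output_dir="qwen3-8b-sft",
    bf16=True,                        # bf16 precision, as in the model name
    learning_rate=2e-5,               # 0.00002, as in the model name
    max_length=12288,                 # assumes "context_12k" refers to the training sequence length
    per_device_train_batch_size=1,    # illustrative
    gradient_accumulation_steps=8,    # illustrative
)

trainer = SFTTrainer(
    model="ChuGyouk/Qwen3-8B-Base",   # base model named in this card
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```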
Use Cases
This model is suitable for a variety of text generation tasks where a robust, fine-tuned language model with a large context window is beneficial. Potential applications include the following (a brief usage sketch appears after the list):
- Conversational AI: Generating detailed and contextually relevant responses in chatbots or virtual assistants.
- Content Creation: Assisting with writing longer-form content, articles, or creative narratives.
- Question Answering: Providing comprehensive answers by processing extensive background information.
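For the question-answering use case above, one common pattern is to place a long reference document in the prompt and ask a question after it. The sketch below assumes the SFT stage installed a chat template on the tokenizer; if it did not, a plain text prompt as in the earlier example works instead, and the input file path is purely hypothetical.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Hyeongwon/PH_prob_sft_FC_swap_labewise_data_oversampling_bf16_lr0.00002_context_12k-Qwen3-8B-Base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Hypothetical long input; total prompt length can go up to the 32,768-token window.
long_document = open("report.txt").read()
messages = [
    {"role": "user",
     "content": f"{long_document}\n\nBased on the document above, what are the key findings?"}
]

# apply_chat_template is only meaningful if the tokenizer ships a chat template (an assumption here).
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```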
Technical Details
The model was trained using the following framework versions:
- TRL: 0.25.1
- Transformers: 4.57.3
- Pytorch: 2.6.0
- Datasets: 3.6.0
- Tokenizers: 0.22.2