Hyeongwon/joint_mimic3_p12_p19_split1_bs192_lr2e5_ep3
The Hyeongwon/joint_mimic3_p12_p19_split1_bs192_lr2e5_ep3 model is a 4-billion-parameter language model developed by Hyeongwon, fine-tuned from Hyeongwon/Qwen3-4B-Base using supervised fine-tuning (SFT) with the TRL framework. It is designed for general text generation tasks, building on the Qwen3-4B architecture of its base model.
Model Overview
joint_mimic3_p12_p19_split1_bs192_lr2e5_ep3 is a fine-tuned version of Hyeongwon/Qwen3-4B-Base, a 4-billion-parameter model built on the Qwen3 architecture. Training used supervised fine-tuning (SFT) with the TRL (Transformer Reinforcement Learning) framework.
Key Capabilities
- Text Generation: Generates coherent, contextually relevant text from a prompt (see the inference sketch after this list).
- Fine-tuned Performance: SFT adapts the base model's general language understanding toward the target tasks or domains.
- Qwen3 Architecture: Inherits the robust capabilities of the Qwen3 model family, known for its efficiency and performance in various language tasks.
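A minimal inference sketch using the standard Transformers text-generation pipeline; the prompt and generation settings below are illustrative placeholders, not taken from the model card:

```python
# Hedged sketch: load the fine-tuned checkpoint with the Transformers
# text-generation pipeline. The prompt is a placeholder.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Hyeongwon/joint_mimic3_p12_p19_split1_bs192_lr2e5_ep3",
    device_map="auto",  # requires accelerate; drop for CPU-only use
)

output = generator("Once upon a time,", max_new_tokens=64)
print(output[0]["generated_text"])
```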
Training Details
The model was trained with TRL 0.25.1, alongside Transformers 4.57.3, PyTorch 2.9.1, Datasets 3.6.0, and Tokenizers 0.22.2. Training run metrics are available on Weights & Biases.
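As a hedged sketch of how a comparable SFT run could be set up with TRL's SFTTrainer: the dataset below is a hypothetical placeholder, and the hyperparameters (effective batch size 192, learning rate 2e-5, 3 epochs) are assumptions read off the model name, not confirmed training settings.

```python
# Hedged sketch, not the authors' training script. Hyperparameters are
# assumptions inferred from the model name (bs192, lr2e5, ep3); the
# dataset is a hypothetical placeholder with the "text" column that
# SFTTrainer consumes by default.
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

train_dataset = Dataset.from_dict({
    "text": [
        "### Instruction:\nDescribe the task.\n### Response:\nExample output.",
    ]
})

config = SFTConfig(
    output_dir="joint_mimic3_p12_p19_split1_bs192_lr2e5_ep3",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=24,  # 8 x 24 = 192 effective batch (assumed split)
    learning_rate=2e-5,              # assumption: "lr2e5" in the name
    num_train_epochs=3,              # assumption: "ep3" in the name
    report_to="wandb",               # the card notes Weights & Biases logging
)

trainer = SFTTrainer(
    model="Hyeongwon/Qwen3-4B-Base",
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```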
Good For
- General-purpose text generation applications.
- Further fine-tuning for specialized downstream tasks (see the LoRA sketch after this list).
- Research and development exploring SFT techniques on Qwen3-based models.
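For the further fine-tuning use case, here is a minimal sketch of parameter-efficient adaptation on top of this checkpoint, assuming TRL plus PEFT; the LoRA rank, alpha, and target modules are illustrative assumptions, not recommendations from the authors:

```python
# Hedged sketch: LoRA fine-tuning on top of this checkpoint via TRL + PEFT.
# The LoRA settings and target modules are illustrative assumptions.
from datasets import Dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Hypothetical downstream dataset with a "text" column.
train_dataset = Dataset.from_dict({"text": ["Domain-specific example text."]})

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="Hyeongwon/joint_mimic3_p12_p19_split1_bs192_lr2e5_ep3",
    args=SFTConfig(output_dir="my_downstream_model", num_train_epochs=1),
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()
```

Training only the LoRA adapters keeps the memory footprint well below that of full fine-tuning of a 4B model, which is why it is a common starting point for downstream adaptation.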