Name: SeongryongJung/powerplantbench-qwen3-4b-full-sft-cot API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: SeongryongJung

Model Overview

SeongryongJung/powerplantbench-qwen3-4b-full-sft-cot is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B. It was trained with supervised fine-tuning (SFT) on Chain-of-Thought (CoT) examples from the powerplantbench_jy_sft_cot dataset.

Key Characteristics

Base Model: Qwen/Qwen3-4B, which supplies the general language understanding and generation capability.
Fine-tuning: SFT with CoT on a domain-specific dataset (powerplantbench_jy_sft_cot), which suggests better performance on tasks within that domain.
Parameter Count: 4 billion parameters, a size that trades some capability for lower compute and memory cost.
Context Length: Supports a context length of 32,768 tokens, so long documents or multi-turn histories can fit in a single prompt.
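
The context length above can be turned into a simple generation budget. The following is a minimal sketch, assuming that prompt tokens and generated tokens share the same 32,768-token window (the usual behavior for decoder-only models); the helper name max_new_tokens_for and the reserve parameter are hypothetical and not part of any library.

```python
CONTEXT_LENGTH = 32768  # context length stated in the model card


def max_new_tokens_for(prompt_tokens: int, reserve: int = 0) -> int:
    """Return how many tokens can still be generated after the prompt.

    `reserve` is a hypothetical safety margin (for example, tokens added
    by a chat template) and is not something the model card specifies.
    """
    if prompt_tokens < 0 or reserve < 0:
        raise ValueError("token counts must be non-negative")
    # Clamp at zero: a prompt longer than the window leaves no room at all.
    return max(CONTEXT_LENGTH - prompt_tokens - reserve, 0)
```

For a 30,000-token prompt this leaves 2,768 tokens for the response, which matters for a CoT model because the reasoning trace consumes part of that budget before the final answer.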

Training Details

The model was trained with a learning rate of 1e-05, using the adamw_torch optimizer (PyTorch AdamW). Training ran for 3 epochs with a total batch size of 16 across 2 GPUs, under a cosine learning rate scheduler with a warmup ratio of 0.1 (the first 10% of training steps). The configuration targets coherent, step-by-step reasoning within the model's specialized domain.
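
To make the schedule concrete, here is a minimal sketch of a cosine learning rate schedule with linear warmup. It reads the "0.1" warmup value as a fraction of total steps, which is an assumption, and it follows the common linear-warmup-then-cosine-decay shape rather than the author's exact training code. The total step count is hypothetical, since the card does not state the dataset size.

```python
import math

BASE_LR = 1e-5          # learning rate stated in the card
WARMUP_RATIO = 0.1      # assumption: the card's "0.1" is a fraction of total steps
TOTAL_BATCH_SIZE = 16   # stated in the card
NUM_GPUS = 2            # stated in the card

# Samples each GPU contributes per optimizer step
# (per-device batch size times any gradient accumulation).
SAMPLES_PER_GPU_PER_STEP = TOTAL_BATCH_SIZE // NUM_GPUS


def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at `step` under linear warmup followed by cosine decay."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp from 0 up to BASE_LR.
        return BASE_LR * step / max(1, warmup_steps)
    # Cosine decay from BASE_LR down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With a hypothetical run of 1,000 optimizer steps, the rate climbs linearly for the first 100 steps, peaks at 1e-05, and then decays toward zero by the final step. Each of the 2 GPUs contributes 8 samples per step to reach the total batch size of 16.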

Potential Use Cases

Domain-Specific Applications: Suited to question answering and text generation about power plant operations or similar industrial contexts, given its specialized training data.
Reasoning Tasks: The Chain-of-Thought fine-tuning suggests stronger multi-step reasoning and problem-solving within that domain.
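
When using a CoT model for the tasks above, it often helps to separate the reasoning trace from the final answer. The sketch below assumes the model keeps the Qwen3 convention of wrapping its reasoning in <think>...</think> tags; the card does not confirm this for the fine-tuned model, so treat the tag format as an assumption. The function name split_reasoning is hypothetical.

```python
import re

# Assumption: reasoning is wrapped in Qwen3-style <think>...</think> tags.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Split model output into (reasoning, answer).

    If no <think> block is found, the reasoning is returned as an empty
    string and the whole output is treated as the answer.
    """
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first <think> block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

A caller could log the reasoning for review while showing only the answer to an operator; if the model emits no tags, the function falls back to returning the full text as the answer.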
