Name: FlyPig23/Qwen3-4B_Paper_Impact_SFT_1ep API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: FlyPig23

Model Overview

FlyPig23/Qwen3-4B_Paper_Impact_SFT_1ep is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B-Instruct-2507. It was trained for a single epoch on the paper_impact_sft_train dataset and reached a validation loss of 0.0623.

Key Characteristics

Base Model: Qwen3-4B-Instruct-2507, a robust foundation for instruction-following tasks.
Parameter Count: 4 billion parameters, offering a balance between performance and computational efficiency.
Context Length: Supports a context window of 32,768 tokens.
Training Focus: Fine-tuned specifically on a dataset related to "paper impact," suggesting an optimization for tasks within this domain.
Training Hyperparameters: Utilized a learning rate of 2e-05, a total training batch size of 64, and a cosine learning rate scheduler with a 0.1 warmup ratio.
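
The learning rate schedule described above can be sketched in a few lines. This is a minimal illustration, assuming the common form of a cosine schedule (linear warmup from zero, then cosine decay to zero); the exact trainer and its schedule implementation are not stated in this card. The function name `lr_at_step` and the total of 1000 optimizer steps used in the usage note are hypothetical, since the actual number of training steps is not given.

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given optimizer step (illustrative sketch).

    base_lr and warmup_ratio come from the hyperparameters listed above;
    total_steps is an assumption supplied by the caller.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With a hypothetical 1000 total steps, `lr_at_step(100, 1000)` gives the peak rate of 2e-05 at the end of warmup, and the rate then falls smoothly toward zero by step 1000. Each of those steps would process 64 examples, the total training batch size; how that total is split across devices and gradient accumulation is not specified here.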

Potential Use Cases

Academic Research Analysis: Potentially useful for tasks involving the analysis or summarization of research paper impact.
Specialized SFT Tasks: May suit applications whose inputs resemble paper_impact_sft_train, the supervised fine-tuning (SFT) dataset used to train this model.
