Name: FlyPig23/Qwen3-4B_Paper_Impact_media_SFT_1ep API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: FlyPig23

Overview

FlyPig23/Qwen3-4B_Paper_Impact_media_SFT_1ep is a 4 billion parameter language model built upon the Qwen3 architecture. It is a fine-tuned variant of the base model Qwen/Qwen3-4B-Instruct-2507, specifically adapted through supervised fine-tuning (SFT) for one epoch.

Key Characteristics

Base Model: Qwen3-4B-Instruct-2507
Parameter Count: 4 billion parameters
Context Length: 32768 tokens
Fine-tuning Dataset: paper_impact_media_train
Training Performance: Achieved a loss of 0.0574 on the evaluation set during training.
Training Hyperparameters: Utilized a learning rate of 2e-05, a total batch size of 64 (with gradient accumulation), and the AdamW optimizer with a cosine learning rate scheduler.

Intended Use Cases

This model is specifically fine-tuned on a dataset related to 'paper impact media'. While specific details on its intended uses and limitations are not extensively provided in the README, its training data suggests potential applications in:

Analyzing the impact or reception of academic papers or media content.
Generating summaries or insights related to research dissemination.
Tasks requiring understanding or creation of content within the domain of academic or media influence.

Limitations

The README explicitly states that more information is needed regarding the model's intended uses and limitations. Users should exercise caution and conduct thorough evaluations for specific applications, as the full scope of its capabilities and potential biases is not yet detailed.

Overview

Overview

Key Characteristics

Intended Use Cases

Limitations

Full Model Card (README)