FlyPig23/Qwen3-4B_Paper_Impact_dataset_SFT_1ep
FlyPig23/Qwen3-4B_Paper_Impact_dataset_SFT_1ep is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B-Instruct-2507. It was trained for one epoch on the paper_impact_dataset_train dataset, reaching a loss of 0.0880 on the evaluation set. What distinguishes it is this specialized fine-tuning on a dataset apparently concerned with paper impact, which suggests applications in academic or research-oriented text analysis and in tasks that benefit from focused instruction following within that domain.
Overview
This model, FlyPig23/Qwen3-4B_Paper_Impact_dataset_SFT_1ep, is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B-Instruct-2507. It underwent a single epoch of supervised fine-tuning (SFT) on the paper_impact_dataset_train dataset.
Training Details
The fine-tuning run used a learning rate of 2e-05 with a total training batch size of 64 across 4 GPUs, a cosine learning-rate scheduler with a warmup ratio of 0.1, and the AdamW optimizer. Training ran for one epoch and reached an evaluation loss of 0.0880.
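The model card does not publish the full training script, but the stated schedule (linear warmup over 10% of steps, then cosine decay from 2e-05) can be sketched in plain Python. The total step count here is an illustrative placeholder, not a value from the card:

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Cosine learning-rate schedule with linear warmup, mirroring the
    stated hyperparameters (base LR 2e-05, warmup ratio 0.1).
    `total_steps` is an assumption for illustration."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with 1000 total steps the rate climbs to 2e-05 at step 100 (end of warmup) and decays back toward zero by the final step; in practice a framework such as Hugging Face `transformers` would compute this schedule internally from the same warmup-ratio and scheduler-type settings.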
Potential Use Cases
Given its fine-tuning on a 'paper_impact_dataset', this model is likely specialized for tasks related to analyzing or generating content concerning the impact of academic papers. While specific intended uses and limitations are not detailed in the original model card, its training data suggests applications in:
- Academic Research Analysis: Potentially understanding or summarizing the influence of research papers.
- Information Extraction: Extracting key insights related to paper impact from textual data.
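The card does not document a prompt format for these use cases. Assuming the model inherits the ChatML-style template common to Qwen instruct models (normally produced by `tokenizer.apply_chat_template`), a paper-impact query might be composed like this; the system and user strings are hypothetical:

```python
def build_chatml_prompt(system, user):
    """Compose a ChatML-style prompt of the kind used by Qwen instruct
    models. In real use, prefer tokenizer.apply_chat_template, which
    applies the model's own template."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You analyze the impact of academic papers.",   # hypothetical system message
    "Summarize the likely influence of this abstract: ...",
)
```

The trailing `<|im_start|>assistant\n` leaves the prompt open for the model to generate its answer.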
Limitations
The model card explicitly states that more information is needed regarding its intended uses, limitations, and detailed training/evaluation data. Users should exercise caution and conduct further evaluation to determine its suitability for specific applications.