Name: graf/Qwen3-1.7B-SFT-science-2e-5 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: graf

Model Overview

graf/Qwen3-1.7B-SFT-science-2e-5 is a fine-tuned variant of Qwen3-1.7B, the base model developed by Qwen. It has approximately 1.7 billion parameters and supports a context length of 32,768 tokens, so it can process long documents in a single pass.
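The context length bounds the prompt and the generated text together, so it helps to budget tokens before sending a long document. The following is a minimal sketch of that arithmetic. The function name `max_new_tokens` and the `reserve` parameter are hypothetical, and the token counts are assumed to come from whatever tokenizer the serving stack uses.

```python
CONTEXT_LENGTH = 32_768  # context length stated in the overview


def max_new_tokens(prompt_tokens: int, reserve: int = 0) -> int:
    """Return how many tokens can still be generated for a given prompt size.

    `reserve` is a hypothetical safety margin (for example, for chat-template
    tokens); it is not part of the model card.
    """
    if prompt_tokens < 0 or reserve < 0:
        raise ValueError("token counts must be non-negative")
    # The prompt, the reserve, and the completion share one window.
    return max(CONTEXT_LENGTH - prompt_tokens - reserve, 0)
```

A result of 0 means the prompt already fills the window and has to be shortened or split before generation.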

Key Specialization

The model was trained with supervised fine-tuning (SFT) on the dolci_science_train dataset. This targeted training suggests improved performance on tasks in scientific domains: the goal was to adapt the general-purpose Qwen3-1.7B model to better understand and generate content relevant to scientific inquiry.

Training Details

Training used a learning rate of 2e-5 and a batch size of 2, accumulated to an effective batch size of 128, and ran for 3 epochs. The optimizer was AdamW with a cosine learning rate scheduler. The reported final validation loss is 0.7464, which suggests the model fit the scientific dataset, though no baseline is given for comparison.
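To make these hyperparameters concrete, here is a minimal sketch of the accumulation arithmetic and a cosine decay curve. It rests on assumptions the model card does not confirm: that the batch size of 2 is per device, that the device count is supplied by the caller, and that the schedule has no warmup and decays to zero. The names `accumulation_steps` and `cosine_lr` are hypothetical and are not taken from the training code.

```python
import math

PEAK_LR = 2e-5         # learning rate from the training details
MICRO_BATCH = 2        # batch size from the training details (assumed per device)
EFFECTIVE_BATCH = 128  # accumulated batch size from the training details


def accumulation_steps(num_devices: int) -> int:
    """Micro-batches accumulated per optimizer step for an assumed device count."""
    per_pass = MICRO_BATCH * num_devices
    if num_devices < 1 or EFFECTIVE_BATCH % per_pass != 0:
        raise ValueError("device count must evenly divide the effective batch")
    return EFFECTIVE_BATCH // per_pass


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr to min_lr, with no warmup (an assumption)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions, a single device would accumulate 64 micro-batches per optimizer step, and the learning rate would fall to half its peak at the midpoint of training.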

Potential Use Cases

Given its fine-tuning on a scientific dataset, this model is likely well-suited for applications requiring:

Scientific text analysis: Understanding and summarizing research papers, articles, or technical documents.
Information extraction: Identifying key concepts, entities, or relationships within scientific literature.
Scientific content generation: Assisting in drafting scientific explanations, hypotheses, or reports.

Limitations

The original model card does not yet document intended uses, limitations, or the full training and evaluation data, so the model's scope and performance boundaries remain unclear.
