guangyangnlp/Qwen3-1.7B-SFT-science-2e-5
The guangyangnlp/Qwen3-1.7B-SFT-science-2e-5 model is a 1.7 billion parameter Qwen3-based language model, fine-tuned on the dolci_science_train dataset. This specialization tailors it to scientific text generation and understanding, and its 32K-token context window lets it process long scientific documents and complex queries.
Model Overview
The guangyangnlp/Qwen3-1.7B-SFT-science-2e-5 is a specialized language model derived from the Qwen3-1.7B architecture. It has been fine-tuned with a learning rate of 2e-05 over 3 epochs on the dolci_science_train dataset, aiming to enhance its capabilities in scientific contexts. The model maintains a 1.7 billion parameter count and supports a 32,768 token context length.
Key Characteristics
- Base Model: Qwen/Qwen3-1.7B, a robust foundation for language understanding.
- Fine-tuning Focus: Specifically trained on the dolci_science_train dataset, indicating an optimization for scientific text processing.
- Training Parameters: Utilized the adamw_torch_fused optimizer, a cosine learning rate scheduler, and a total batch size of 128.
- Performance: Reached a final validation loss of 0.7490 during training, indicating a good fit to its target domain.
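The reported hyperparameters can be gathered in one place for reference. This is a sketch: the key names mirror Hugging Face `TrainingArguments` fields, and the per-device batch size and gradient accumulation split are assumptions — only their product (a total batch size of 128) is reported on the card.

```python
# Reported fine-tuning hyperparameters from the model card, as a plain dict.
# Key names follow Hugging Face TrainingArguments conventions.
hyperparams = {
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
    "optim": "adamw_torch_fused",
    "lr_scheduler_type": "cosine",
    # Only the total batch size of 128 is reported; this particular
    # per-device / accumulation split is an assumed example.
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 16,
}

# Effective batch size = per-device batch size * gradient accumulation steps.
total_batch_size = (
    hyperparams["per_device_train_batch_size"]
    * hyperparams["gradient_accumulation_steps"]
)  # matches the reported total of 128
```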
Potential Use Cases
This model is particularly suited for applications requiring a strong understanding and generation of scientific content. While specific intended uses and limitations are not detailed in the original README, its fine-tuning on a science-specific dataset suggests utility in areas such as:
- Scientific Text Analysis: Summarizing research papers and extracting key information from scientific articles.
- Question Answering: Answering queries related to scientific topics.
- Content Generation: Drafting scientific explanations, reports, or educational materials.
Further evaluation would be needed to determine its precise strengths and limitations within various scientific sub-domains.
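For such evaluation or everyday use, the checkpoint can presumably be loaded through the standard Hugging Face Transformers interface, as with other Qwen3 chat models. The function name and prompt below are illustrative, not part of the original card; this is a minimal sketch under that assumption.

```python
MODEL_ID = "guangyangnlp/Qwen3-1.7B-SFT-science-2e-5"

def answer_science_question(question: str, max_new_tokens: int = 256) -> str:
    """Illustrative helper: generate an answer to a scientific question."""
    # Imported lazily so the sketch can be defined without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # Format the question with the model's chat template.
    messages = [{"role": "user", "content": question}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```

A call such as `answer_science_question("Why does ice float on water?")` would then return the model's generated explanation.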