graf/Qwen3-4B-SFT-science-1e-5
graf/Qwen3-4B-SFT-science-1e-5 is a 4-billion-parameter language model fine-tuned by graf from the Qwen3-4B base model. It supports a context length of 32768 tokens and was fine-tuned on the dolci_science_train dataset, optimizing it for science-related tasks that require specialized domain knowledge and understanding.
Overview
This model, graf/Qwen3-4B-SFT-science-1e-5, is a specialized version of the 4-billion-parameter Qwen3-4B base model. It was fine-tuned by graf on the dolci_science_train dataset, targeting scientific-domain understanding and generation. Training used a learning rate of 1e-05 over 3 epochs, with an effective batch size of 128 across 4 GPUs.
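The exact training script is not published on this card. As a rough illustration only, the sketch below shows one plausible way the stated hyperparameters could map onto TRL's SFTConfig/SFTTrainer. Only the learning rate (1e-05), epoch count (3), effective batch size (128 on 4 GPUs), and base model name come from this card; the per-device/accumulation split, precision, and dataset loading path are assumptions.

```python
# Hypothetical reconstruction of the stated training setup with TRL's SFTTrainer.
# Only learning_rate, num_train_epochs, and the effective batch size (128) are
# taken from this card; everything else is an illustrative assumption.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumption: dolci_science_train is available as a local JSONL file;
# the actual dataset location is not given on this card.
train_dataset = load_dataset("json", data_files="dolci_science_train.jsonl", split="train")

config = SFTConfig(
    output_dir="Qwen3-4B-SFT-science-1e-5",
    learning_rate=1e-5,               # stated learning rate
    num_train_epochs=3,               # stated number of epochs
    per_device_train_batch_size=8,    # assumption: 8 per GPU x 4 GPUs x 4 accum steps = 128
    gradient_accumulation_steps=4,
    bf16=True,                        # assumption: a common choice at this scale
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B",            # base model named on this card
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```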
Key Characteristics
- Base Model: Qwen3-4B architecture.
- Parameter Count: 4 billion parameters.
- Context Length: Supports a context window of 32768 tokens.
- Fine-tuning Focus: Specialized for science-related tasks through training on the dolci_science_train dataset.
- Training Performance: Achieved a final validation loss of 0.6816 during fine-tuning.
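For inference, the standard Hugging Face transformers chat workflow should apply. Below is a minimal sketch, assuming the repo id shown on this card and the tokenizer's built-in chat template; the example question is only illustrative.

```python
# Minimal inference sketch with Hugging Face transformers.
# Assumes the repo id from this card and the tokenizer's built-in chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "graf/Qwen3-4B-SFT-science-1e-5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Why do noble gases rarely form compounds?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```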
Intended Use Cases
This model is best suited for applications that process or generate content in scientific fields. Its fine-tuning on a science-specific dataset suggests improved performance on tasks such as the following (a brief prompting sketch follows the list):
- Answering scientific questions.
- Summarizing scientific texts.
- Generating scientific explanations or reports.
- Assisting with scientific research-related queries.
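As an illustration of the summarization use case, here is a hedged sketch using the transformers text-generation pipeline; the abstract string is a placeholder to be replaced with real input text.

```python
# Illustrative summarization prompt via the text-generation pipeline.
# The abstract below is a placeholder; substitute real input text.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="graf/Qwen3-4B-SFT-science-1e-5",
    device_map="auto",
)

abstract = "(paste a scientific abstract here)"
messages = [
    {"role": "user", "content": f"Summarize this abstract in two sentences:\n\n{abstract}"}
]
result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])
```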