Name: graf/Qwen3-1.7B-SFT-medical-2e-5 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: graf

Model Overview

The graf/Qwen3-1.7B-SFT-medical-2e-5 is a specialized language model built upon the Qwen3-1.7B architecture. This model has undergone supervised fine-tuning (SFT) specifically for medical applications, utilizing the medical_o1_train dataset.

Key Characteristics

Base Model: Qwen/Qwen3-1.7B, a 1.7 billion parameter model.
Domain Specialization: Fine-tuned on a medical dataset (medical_o1_train) to enhance performance in healthcare-related tasks.
Performance: Achieved a validation loss of 1.4089 during training, indicating its focused optimization for the medical domain.

Training Details

The model was trained with a learning rate of 2e-05, a batch size of 16, and a gradient accumulation of 8, resulting in an effective total batch size of 128. It utilized the ADAMW_TORCH_FUSED optimizer and a cosine learning rate scheduler over 3 epochs.

Intended Use Cases

This model is designed for applications requiring a deep understanding and generation of medical-related text. It is particularly suitable for tasks such as:

Medical text analysis
Information extraction from clinical notes
Supporting medical question-answering systems

Limitations

As with any specialized model, its performance outside the medical domain may be limited. Further information regarding specific intended uses and limitations is needed for a comprehensive understanding.

Overview

Model Overview

Key Characteristics

Training Details

Intended Use Cases

Limitations

Full Model Card (README)