guangyangnlp/Qwen3-1.7B-SFT-medical-2e-5

Text Generation · Model Size: 2B · Quantization: BF16 · Context Length: 32k · Published: Feb 22, 2026 · License: other · Architecture: Transformer

The guangyangnlp/Qwen3-1.7B-SFT-medical-2e-5 model is a fine-tuned version of the Qwen3-1.7B architecture, optimized for medical applications. This 1.7-billion-parameter model was trained on the medical_o1_train dataset and reached a validation loss of 1.4089. The domain-specific fine-tuning is intended to improve performance on medical natural language processing tasks.


Model Overview

The guangyangnlp/Qwen3-1.7B-SFT-medical-2e-5 is a specialized language model derived from the Qwen3-1.7B base architecture. This model has undergone supervised fine-tuning (SFT) using the medical_o1_train dataset, indicating a focus on medical domain applications.
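The snippet below is a minimal loading sketch. It assumes the checkpoint is published on the Hugging Face Hub under this repository id and that a recent transformers release with Qwen3 support is installed.

```python
# Minimal loading sketch; assumes the checkpoint is hosted on the
# Hugging Face Hub under this repository id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "guangyangnlp/Qwen3-1.7B-SFT-medical-2e-5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above; "dtype" in newer releases
    device_map="auto",           # requires the accelerate package
)
```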

Key Characteristics

  • Base Model: Qwen3-1.7B, a 1.7 billion parameter model.
  • Domain Specialization: Fine-tuned on a medical dataset (medical_o1_train) to improve performance in healthcare-related NLP tasks.
  • Training Performance: Achieved a validation loss of 1.4089, with a learning rate of 2e-05 and a total batch size of 128 (a hypothetical reproduction sketch follows this list).
  • Frameworks: Trained with Transformers 5.0.0, PyTorch 2.10.0+cu128, Datasets 4.0.0, and Tokenizers 0.22.2.
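
The card reports only the headline hyperparameters (learning rate 2e-05, total batch size 128) and does not say how the training was run. The sketch below is a hypothetical reproduction using TRL's SFTTrainer, not the author's actual code: the dataset id, column format, epoch count, and batch-size split across per-device batch and gradient accumulation are all placeholders.

```python
# Hypothetical reproduction sketch, NOT the author's actual training setup.
# Dataset id, epoch count, and batch-size split are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

train_ds = load_dataset("your-org/medical_o1_train", split="train")  # placeholder id

config = SFTConfig(
    output_dir="qwen3-1.7b-sft-medical",
    learning_rate=2e-5,             # from the model card
    per_device_train_batch_size=8,  # 8 x 16 accumulation steps = 128 total (assumed split)
    gradient_accumulation_steps=16,
    num_train_epochs=1,             # not reported; placeholder
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B",  # base model on the Hub
    args=config,
    train_dataset=train_ds,
)
trainer.train()
```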

Intended Use Cases

This model is particularly suited for applications that require an understanding of medical text. Specific intended uses and limitations are not detailed in the original model card, but the medical fine-tuning suggests utility in areas such as the following (see the generation sketch after the list):

  • Medical text analysis
  • Information extraction from clinical notes
  • Medical question answering (with further adaptation)
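
As an illustration of the question-answering use case, the snippet below applies the tokenizer's chat template and generates a reply. It reuses the `model` and `tokenizer` from the loading sketch above; the prompt and sampling settings are assumptions, not recommendations from the model card.

```python
# Illustrative medical QA prompt; reuses `model` and `tokenizer` from the
# loading sketch above. Prompt and generation settings are assumptions.
messages = [
    {"role": "user", "content": "What are common first-line treatments for hypertension?"}
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```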

Note that the model card itself flags missing information about the model's full capabilities, limitations, and specific training data details, so users should evaluate it against their own requirements before deployment.