secmlr/DS-Noisy-N_DS-Clean-N_DS-OSS-N_QWQ-OSS-N_QWQ-Clean-N_QWQ-Noisy-N_Qwen2.5-7B-Instruct_sft
The secmlr/DS-Noisy-N_DS-Clean-N_DS-OSS-N_QWQ-OSS-N_QWQ-Clean-N_QWQ-Noisy-N_Qwen2.5-7B-Instruct_sft model is a 7.6-billion-parameter instruction-tuned language model fine-tuned from Qwen/Qwen2.5-7B-Instruct on a combination of the DS-Noisy-N, DS-Clean-N, DS-OSS-N, QWQ-OSS-N, QWQ-Clean-N, and QWQ-Noisy-N datasets. It targets general instruction-following tasks, combining the base model's capabilities with diverse fine-tuning data.
Model Overview
This model, developed by secmlr, is a fine-tuned variant of the Qwen2.5-7B-Instruct base model, with 7.6 billion parameters and a context length of 131,072 tokens. It was produced via supervised fine-tuning (SFT) on six datasets: DS-Noisy-N, DS-Clean-N, DS-OSS-N, QWQ-OSS-N, QWQ-Clean-N, and QWQ-Noisy-N.
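Since the card does not include usage code, here is a minimal loading sketch using the standard transformers chat API; it assumes the model ships a Qwen2.5-style chat template, and the prompt and generation settings are illustrative only.

```python
# Minimal sketch: loading and querying the model with Hugging Face transformers.
# Assumes a Qwen2.5-style chat template; adjust dtype/device placement as needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "secmlr/DS-Noisy-N_DS-Clean-N_DS-OSS-N_QWQ-OSS-N_QWQ-Clean-N_QWQ-Noisy-N_Qwen2.5-7B-Instruct_sft"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Build a chat-formatted prompt and generate a response.
messages = [{"role": "user", "content": "Explain what supervised fine-tuning does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```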
Training Details
The fine-tuning run used the following hyperparameters (a configuration sketch follows the list):
- Learning Rate: 1e-05
- Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- Batch Size: A total training batch size of 24 (1 per device with 12 gradient accumulation steps, implying 2 devices) and a total evaluation batch size of 16 (8 per device across the same 2 devices).
- Epochs: 3.0
- Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
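For reference, the settings above map onto Hugging Face TrainingArguments roughly as follows. This is an illustrative sketch, not the authors' training script: the output path is hypothetical, and dataset loading, model wiring, and any SFT-framework specifics are omitted.

```python
# Illustrative mapping of the reported hyperparameters onto TrainingArguments;
# not the authors' actual training configuration.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-7b-instruct-sft",  # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=12,        # 1 x 12 x 2 devices = 24 total
    per_device_eval_batch_size=8,          # 8 x 2 devices = 16 total
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",                   # AdamW optimizer
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```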
Intended Use
The authors do not document specific intended uses or limitations. Given its instruction-tuned base and diverse training data, the model is plausibly suited to a wide range of general-purpose natural language understanding and generation tasks; developers should weigh the base model's capabilities and the characteristics of the fine-tuning datasets when evaluating fit for a specific application.