TheHassanSaud/ramzan_sft_gemma3_with_updated_templat

Vision · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Ctx Length: 32k · Published: Jan 11, 2026 · License: other · Architecture: Transformer

TheHassanSaud/ramzan_sft_gemma3_with_updated_templat is a 12-billion-parameter language model fine-tuned by TheHassanSaud from ramzanniaz331/gemma3-12b-2048-v3. It supports a 32,768-token context length and was specialized through supervised fine-tuning on five datasets: ramzan_5k_batch_1, ramzan_5k_batch_2, ramzan_openhermes, ramzan_metamath, and ramzan_aya_urdu. It is intended for general language generation tasks that draw on this combined training data.


Model Overview

The model builds on the ramzanniaz331/gemma3-12b-2048-v3 base and was adapted through supervised fine-tuning (SFT) on a combination of datasets: ramzan_5k_batch_1, ramzan_5k_batch_2, ramzan_openhermes, ramzan_metamath, and ramzan_aya_urdu.
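
For context, here is a minimal loading sketch using Hugging Face transformers; it assumes the repository is hosted on the Hub under the name above and exposes a standard Gemma 3 causal-LM interface (the class choice and dtype are assumptions, not confirmed by this model card):

```python
# Minimal loading sketch (assumes a standard causal-LM layout on the Hub).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheHassanSaud/ramzan_sft_gemma3_with_updated_templat"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # adjust for your hardware; FP8 serving needs dedicated support
    device_map="auto",           # requires the accelerate package
)
```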

Key Training Details

The model was trained with the following hyperparameters (sketched as a TrainingArguments config after this list):

  • Learning Rate: 5e-06
  • Batch Size: 1 per device (train), 8 per device (eval), with 8 gradient accumulation steps, for a total effective batch size of 64 (implying 8 training devices: 1 × 8 accumulation × 8 devices).
  • Optimizer: ADAMW_TORCH_FUSED with default betas and epsilon.
  • Scheduler: Cosine learning rate scheduler with a 0.03 warmup ratio.
  • Epochs: 2.0
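
As a reference point, the sketch below maps these settings onto Hugging Face TrainingArguments. This is a hedged reconstruction, not the author's actual training script; the output path is a placeholder and dataset wiring is omitted:

```python
# Hedged reconstruction of the reported hyperparameters as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ramzan_sft_gemma3",  # hypothetical output path
    learning_rate=5e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,   # 1 x 8 x 8 devices (implied) = effective batch of 64
    num_train_epochs=2.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    optim="adamw_torch_fused",       # default betas and epsilon
)
```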

Intended Use Cases

Given its fine-tuning on a variety of datasets, this model is suitable for:

  • General text generation and understanding tasks.
  • Applications requiring knowledge from the specific datasets it was trained on, such as mathematical reasoning (from ramzan_metamath) and potentially Urdu-language understanding (from ramzan_aya_urdu); a short inference sketch follows this list.
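
Building on the loading sketch above, here is a short inference example. It assumes the tokenizer ships the updated chat template that the model's name alludes to; the prompt is an arbitrary illustration:

```python
# Inference sketch; assumes `model` and `tokenizer` from the loading example
# and that the tokenizer defines a chat template.
messages = [{"role": "user", "content": "What is 17 * 24? Show your reasoning."}]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```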

Further details on specific intended uses, limitations, and comprehensive evaluation data are not provided in the current model card.