Name: tzchen07/Gemma2-2B-SFT-X8c-2ep API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: tzchen07

Model Overview

tzchen07/Gemma2-2B-SFT-X8c-2ep is a 2.6-billion-parameter language model based on the Gemma 2 architecture. It was fine-tuned from unsloth/gemma-2-2b-it with supervised fine-tuning (SFT) on the v1_6_plus_v1_8_plus_v1_6c dataset, with the aim of improving its performance on general language tasks.

Key Training Details

Fine-tuning used the following hyperparameters:

Learning Rate: 5e-06
Batch Size: train_batch_size of 4 and eval_batch_size of 8. With gradient_accumulation_steps of 16, the total_train_batch_size is 64 (4 × 16).
Optimizer: ADAMW_TORCH with default betas and epsilon.
Scheduler: Cosine learning rate schedule with a warmup ratio of 0.1.
Epochs: 2
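
The hyperparameters above imply two small calculations: the effective batch size and the learning rate at a given optimizer step. The following Python sketch illustrates both. It assumes a single training device (consistent with 4 × 16 = 64) and the common schedule shape of linear warmup followed by cosine decay to zero. The model card does not state the total number of optimizer steps, so `total_steps` is a hypothetical parameter, and `effective_batch_size` and `lr_at_step` are illustrative helper names, not part of any library.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 5e-6
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 16
WARMUP_RATIO = 0.1


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples consumed per optimizer step.

    num_devices=1 is an assumption; it is the value that makes
    4 * 16 match the reported total_train_batch_size of 64.
    """
    return per_device * accum_steps * num_devices


def lr_at_step(step, total_steps, base_lr=LEARNING_RATE,
               warmup_ratio=WARMUP_RATIO):
    """Learning rate under linear warmup followed by cosine decay to zero.

    total_steps is hypothetical: the model card does not report it.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    # Hypothetical run length of 1000 optimizer steps.
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at_step(step, total_steps=1000))
```

With a warmup ratio of 0.1, the first tenth of the optimizer steps ramps the learning rate up to 5e-06, and the remaining steps decay it along a half cosine.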

Intended Use Cases

The model card does not yet document specific intended uses or limitations. As a fine-tuned Gemma 2 2B model, it is generally suited to natural language processing applications that benefit from a compact yet capable model. It is likely to perform best on tasks that resemble its fine-tuning data.
