Overview
ale-bay/zephyr-2b-gemma-sft is a 2.6 billion parameter language model derived from Google's Gemma-2B architecture. It has been instruction-tuned on the HuggingFaceH4/deita-10k-v0-sft dataset to improve performance on instruction-following tasks. Training ran for 3 epochs and reached a final validation loss of 1.0529.
Key Capabilities
- Instruction Following: Fine-tuned on a supervised instruction dataset to improve response generation based on given prompts.
- General Language Generation: Suitable for a range of text generation tasks due to its foundational Gemma architecture and instruction tuning.
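Since the model is instruction-tuned on a Gemma base, prompts are typically wrapped in Gemma-style chat-turn markers before generation. The helper below is a minimal sketch of that formatting; it assumes the standard Gemma turn tokens, which may differ from this checkpoint's actual chat template (consult the tokenizer's `apply_chat_template` for the authoritative format).

```python
def format_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma-style chat-turn markers (assumed format)."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# Hypothetical usage: the resulting string is what would be passed
# to the tokenizer for generation.
prompt = format_prompt("Summarize the Gemma architecture in one sentence.")
print(prompt)
```

In practice, prefer `tokenizer.apply_chat_template(...)` from the transformers library, which reads the template stored with the model rather than hard-coding it.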
Training Details
The model was trained with a learning rate of 2e-05, a total effective batch size of 128, and the Adam optimizer. Training ran across 8 GPUs with a cosine learning-rate scheduler and a warmup ratio of 0.1.
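The cosine schedule with warmup described above can be sketched as follows. This is a generic illustration, not the exact trainer implementation; the total step count is a placeholder, and only the peak learning rate (2e-05) and warmup ratio (0.1) come from the reported configuration.

```python
import math

PEAK_LR = 2e-05       # reported learning rate
WARMUP_RATIO = 0.1    # reported warmup ratio

def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay toward zero."""
    warmup_steps = int(WARMUP_RATIO * total_steps)
    if step < warmup_steps:
        # Linear ramp over the first 10% of training.
        return PEAK_LR * step / max(1, warmup_steps)
    # Cosine decay over the remaining 90% of steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * PEAK_LR * (1 + math.cos(math.pi * progress))
```

For example, with a hypothetical 1000 total steps, the rate ramps up over the first 100 steps, peaks at 2e-05, and decays to zero by the end of training.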
Intended Use Cases
- Prototyping: Can serve as a base for further fine-tuning on more specific datasets.
- Research: Useful for exploring the effects of instruction tuning on Gemma-2B with the deita-10k-v0-sft dataset.