Columbia-NLP/gemma-2b-zephyr-sft

Task: Text Generation · Model Size: 2.5B · Quant: BF16 · Context Length: 8k · Published: Apr 11, 2024 · License: gemma-terms-of-use · Architecture: Transformer

Columbia-NLP/gemma-2b-zephyr-sft is a 2.5 billion parameter GPT-like model, fine-tuned by Columbia-NLP from Google's Gemma-2b on the deita-10k-v0-sft dataset. It is primarily an English-language model, optimized for supervised fine-tuning (SFT) performance through careful hyperparameter selection and masking of user tokens during training. The model shows improved performance across benchmarks, including a 48.75 average on the OpenLLM Leaderboard and an overall 4.34 on MT-Bench, making it suitable for general conversational AI tasks where a smaller, efficient model with strong SFT performance is desired.


Columbia-NLP/gemma-2b-zephyr-sft Overview

This model is a 2.5 billion parameter, English-centric, GPT-like language model developed by Columbia-NLP. It is a supervised fine-tuned (SFT) version of the original google/gemma-2b base model, trained using the deita-10k-v0-sft dataset. Key to its development was the careful selection of hyperparameters and masking of user tokens during training to enhance its SFT performance.
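For reference, below is a minimal inference sketch using the Hugging Face transformers library. It assumes the tokenizer ships a chat template (standard for zephyr-style SFT releases, but worth verifying on the model card) and that torch and transformers are installed; the prompt text is illustrative.

```python
# Minimal chat-style inference sketch for this model (not an official recipe).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Columbia-NLP/gemma-2b-zephyr-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the published BF16 weights
    device_map="auto",
)

# Format the conversation with the tokenizer's chat template
# (assumed to be present for this zephyr-style release).
messages = [
    {"role": "user", "content": "Explain supervised fine-tuning in two sentences."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```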

Key Capabilities

  • Enhanced Supervised Fine-Tuning: Optimized for tasks requiring strong performance from supervised fine-tuning.
  • Competitive Benchmarking: Achieves an average score of 48.75 on the OpenLLM Leaderboard, outperforming its base model and other Gemma-2b variants in several categories, including ARC (51.80), HellaSwag (72.63), MMLU (42.20), TruthfulQA (41.96), and GSM8k (20.09).
  • Solid MT-Bench Performance: Scores a total of 4.34 on MT-Bench, with notable performance in Humanities (6.25) and Roleplay (5.55).

Good For

  • Applications requiring a compact yet capable language model for general English text generation and understanding.
  • Use cases where a model with strong SFT performance and competitive benchmark results in the 2B parameter class is beneficial.
  • Researchers and developers looking for an efficient model derived from the Gemma family with specific fine-tuning optimizations.
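For developers who just want a quick start, a sketch using the higher-level pipeline API is below; it assumes a recent transformers version that accepts chat-message input for text-generation pipelines, and a GPU picked up via device_map="auto".

```python
# Quick-start sketch via the transformers pipeline API (assumptions above).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Columbia-NLP/gemma-2b-zephyr-sft",
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Suggest three names for a coffee shop."}]
result = generator(messages, max_new_tokens=128)
# For chat input, generated_text is the full message list; the last entry
# is the assistant's reply.
print(result[0]["generated_text"][-1]["content"])
```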