Columbia-NLP/LION-Gemma-2b-dpo-v1.0
LION-Gemma-2b-dpo-v1.0 is a 2.5 billion parameter Gemma-based causal language model developed by Columbia-NLP. It is fine-tuned with the empirically optimized SFT and DPO stages of the LION pipeline and outperforms the official Gemma-2b-it instruct model on several benchmarks. The model handles general language understanding and generation tasks, achieving competitive scores for its size.
Model Overview
Columbia-NLP's LION-Gemma-2b-dpo-v1.0 is a 2.5 billion parameter language model built on the Gemma-2b architecture. It is a product of the LION-series training pipeline, which emphasizes an empirically optimized three-stage process: supervised fine-tuning (SFT), direct preference optimization (DPO), and online preference learning (online DPO). This version applies the first two stages, SFT and DPO, building on the Columbia-NLP/LION-Gemma-2b-sft-v1.0 model.
Key Capabilities & Performance
The LION pipeline incorporates techniques such as sequence packing and loss masking during SFT, and an enlarged preference dataset during DPO, which collectively improve model quality. Benchmarks indicate that LION-Gemma-2b-dpo-v1.0 achieves strong results for its size, outperforming the official Gemma-2b-it model and other 2B-parameter models on Arena-Hard (4.6), AlpacaEval-2 (8.75), MT-Bench (6.58), and OpenLLM (55.35). These results point to robust instruction following and general conversational ability.
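For context on what the DPO stage optimizes: the standard DPO objective scores a chosen response above a rejected one relative to a frozen reference model. Below is a minimal per-example sketch of that loss (the model card does not publish the training code, so the function name and the `beta=0.1` default are illustrative assumptions, not the pipeline's actual hyperparameters):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * (delta_policy - delta_ref)).

    Each argument is the summed log-probability of a full response
    under the policy or the frozen reference model. beta is an
    illustrative default; the LION pipeline's value may differ.
    """
    logits = beta * ((policy_chosen_logp - policy_rejected_logp)
                     - (ref_chosen_logp - ref_rejected_logp))
    # -log(sigmoid(logits)); small when the policy prefers the
    # chosen response more strongly than the reference does.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When policy and reference agree exactly (all log-probs equal), the loss is log 2; widening the policy's chosen-vs-rejected margin beyond the reference's drives it toward zero.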
Intended Uses
This model is suitable for a variety of natural language processing applications that require a compact yet capable instruction-tuned model. Its optimized training process makes it a strong candidate for tasks where efficient, high-quality responses are needed. Developers can integrate it through the standard Hugging Face transformers pipeline; for best results, apply the model's chat template to prompts.
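A minimal sketch of that integration, assuming a recent transformers version (passing chat-style `messages` to a text-generation pipeline applies the tokenizer's chat template automatically; the prompt text and generation settings are illustrative):

```python
import torch
from transformers import pipeline

model_id = "Columbia-NLP/LION-Gemma-2b-dpo-v1.0"

# device_map="auto" places weights on GPU when one is available;
# bfloat16 halves memory versus float32 on supported hardware.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style input; the model's chat template is applied for us.
messages = [
    {"role": "user", "content": "Explain direct preference optimization in one sentence."},
]
out = pipe(messages, max_new_tokens=128, do_sample=False)
print(out[0]["generated_text"][-1]["content"])
```

Greedy decoding (`do_sample=False`) keeps the output reproducible; switch to sampling with a temperature for more varied responses.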