Columbia-NLP/LION-LLaMA-3-8b-dpo-v1.0

Text generation · Model size: 8B · Quant: FP8 · Context length: 8k · Published: Jun 28, 2024 · Architecture: Transformer

Columbia-NLP/LION-LLaMA-3-8b-dpo-v1.0 is an 8-billion-parameter LLaMA-3-based language model developed by Columbia-NLP. It is fine-tuned with an empirically optimized three-stage pipeline of SFT, DPO, and online preference learning; this version corresponds to the DPO stage. The model outperforms the official instruct model on benchmarks such as MT-Bench and OpenLLM, making it suitable for general conversational AI and instruction-following tasks.


Overview

Columbia-NLP/LION-LLaMA-3-8b-dpo-v1.0 is an 8-billion-parameter language model, part of the LION series developed by Columbia-NLP. It is fine-tuned from Columbia-NLP/LION-LLaMA-3-8b-sft-v1.0 using Direct Preference Optimization (DPO), the second stage of an empirically optimized three-stage pipeline (SFT, DPO, and online DPO). The pipeline improves performance through techniques such as sequence packing and loss masking during SFT and an enlarged preference dataset during DPO.
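To make the DPO stage concrete, the sketch below computes the standard DPO loss for a single preference pair. The function name, argument names, and the `beta=0.1` default are illustrative assumptions (the paper's exact hyperparameters are not stated here); the formula itself is the standard DPO objective from Rafailov et al.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Inputs are summed token log-probabilities of each response under
    the trainable policy and the frozen reference (here, SFT) model.
    beta=0.1 is a common default, not necessarily the value used for
    this model.
    """
    # Implicit rewards: how much more each response is preferred by
    # the policy relative to the reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # Negative log-sigmoid of the margin: the loss shrinks as the
    # policy widens the preference gap over the reference.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At initialization the policy equals the reference, the margin is zero, and the loss is -log(0.5) ≈ 0.693; training pushes the margin positive and the loss toward zero.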

Key Capabilities & Performance

  • Enhanced Alignment: Achieves strong alignment through its DPO fine-tuning stage, building upon an SFT base.
  • Competitive Benchmarks: Scores 8.12 on MT-Bench and 71.28 on OpenLLM, surpassing the official LLaMA-3-8b-it model on both.
  • Optimized Training: Benefits from an empirically optimized training pipeline designed to significantly improve language model performance.

Intended Use Cases

  • General Instruction Following: Designed for various instruction-following tasks, as indicated by its strong benchmark results.
  • Conversational AI: Suitable for generating human-like text in response to prompts, as shown in the provided chat template example.
  • Research and Development: Can be used by researchers interested in advanced alignment techniques and empirically optimized training pipelines. Further details on training datasets, code, and evaluation scripts are available in the associated paper and codebase.
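For conversational use, prompts should follow the LLaMA-3 chat format. The sketch below assembles such a prompt by hand to show the structure; in practice you would call `tokenizer.apply_chat_template` from the Hugging Face `transformers` library, and the helper name here is a hypothetical illustration.

```python
def build_llama3_prompt(messages):
    """Assemble a LLaMA-3-style chat prompt from a list of
    {"role": ..., "content": ...} dicts.

    The special tokens follow the standard LLaMA-3 chat format;
    normally tokenizer.apply_chat_template produces this for you.
    """
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "user", "content": "Summarize DPO in one sentence."},
])
```

The resulting string is what the tokenizer would feed to the model before generation begins at the assistant turn.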