Overview
SakanaAI DiscoPOP-zephyr-7b-gemma: A Novel Preference Optimized LLM
This model, developed by SakanaAI, is an 8.5-billion-parameter language model based on the Gemma architecture, fine-tuned from HuggingFaceH4/zephyr-7b-gemma-sft-v0.1. Its core differentiator is its use of DiscoPOP (Discovered Preference Optimization), an objective function discovered via LLM-driven search, as an alternative to traditional Direct Preference Optimization (DPO).
Key Capabilities & Features
- Novel Optimization Algorithm: Employs DiscoPOP, a unique preference optimization method, for fine-tuning, as detailed in the paper "Discovering Preference Optimization Algorithms with and for Large Language Models".
- Base Model: Built upon the robust zephyr-7b-gemma-sft-v0.1 foundation.
- Context Length: Supports an 8192-token context window.
- Training Details: Fine-tuned for 2 epochs with a learning rate of 5e-07 and a total batch size of 128, using the Adam optimizer with a cosine learning-rate schedule.
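The objective reported in the DiscoPOP paper is a Log-Ratio Modulated Loss (LRML), which blends DPO's logistic loss with an exponential loss, gated by a sigmoid of the scaled policy/reference log-ratio difference. The sketch below is a minimal, unofficial rendering of that idea, not the authors' reference implementation; the gating direction and the `beta` and `tau` defaults are assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lrml_loss(log_ratio_diff, beta=0.05, tau=0.05):
    """Sketch of a Log-Ratio Modulated Loss in the spirit of DiscoPOP.

    log_ratio_diff is the difference of policy-vs-reference log-ratios
    between the chosen and rejected responses:
        (log pi(y_w|x) - log pi_ref(y_w|x)) - (log pi(y_l|x) - log pi_ref(y_l|x))
    """
    rho = beta * log_ratio_diff
    logistic = math.log(1.0 + math.exp(-rho))  # DPO's sigmoid/logistic loss
    exponential = math.exp(-rho)               # exponential-style loss term
    gate = sigmoid(rho / tau)                  # modulates between the two
    return (1.0 - gate) * logistic + gate * exponential
```

As with DPO, the loss decreases as the policy's margin for the chosen response over the rejected one grows; the gate simply shifts how aggressively large margins are rewarded.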
When to Consider This Model
- Exploring Advanced Preference Optimization: Ideal for researchers and developers interested in evaluating or utilizing novel preference optimization techniques beyond DPO.
- General Language Tasks: Suitable for a wide range of applications where a 7B-class model with strong instruction-following capabilities is required.
- Reproducibility and Research: The associated paper and codebase provide transparency for research and experimentation.
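If the model fits your use case, it can be loaded like any other Hugging Face chat model. A minimal sketch, assuming the repository id SakanaAI/DiscoPOP-zephyr-7b-gemma, a recent transformers release, and a bfloat16-capable GPU (the repository id and generation settings are assumptions, not taken from this card):

```python
import torch
from transformers import pipeline

# Repository id assumed from the model name on this card.
pipe = pipeline(
    "text-generation",
    model="SakanaAI/DiscoPOP-zephyr-7b-gemma",
    torch_dtype=torch.bfloat16,  # halves memory vs. fp32
    device_map="auto",           # place weights across available devices
)

messages = [
    {"role": "user", "content": "Summarize preference optimization in two sentences."},
]
out = pipe(messages, max_new_tokens=128, do_sample=False)
print(out[0]["generated_text"][-1]["content"])
```

The pipeline applies the model's own chat template to the `messages` list, so no manual prompt formatting is needed.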