Overview
UCLA-AGI/Gemma-2-9B-It-SPPO-Iter2 is a 9-billion-parameter instruction-tuned language model developed by UCLA-AGI. It is based on google/gemma-2-9b-it and has been further fine-tuned for alignment with Self-Play Preference Optimization (SPPO); this checkpoint is the model at SPPO's second iteration. Training used synthetic responses generated for prompts from the openbmb/UltraFeedback dataset, with the prompts split across the SPPO iterations.
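At each iteration, SPPO fits the policy's log-probability ratio against a reference model to a target derived from an estimated win probability. A minimal sketch of the per-response squared loss, assuming the form reported in the SPPO paper (the names `log_ratio`, `win_prob`, and `eta` are illustrative, and the default `eta` here is arbitrary):

```python
def sppo_loss(log_ratio: float, win_prob: float, eta: float = 1.0) -> float:
    """Illustrative SPPO-style squared loss for one response y to prompt x.

    log_ratio: log(pi_theta(y|x) / pi_ref(y|x)), the policy-vs-reference
               log-probability ratio for the response.
    win_prob:  estimated probability that y beats a sample from the
               current policy, P(y wins | x).
    eta:       scaling hyperparameter (value here is arbitrary).

    The loss pushes the log ratio toward eta * (win_prob - 1/2), so
    responses that win more than half the time are up-weighted and
    losing responses are down-weighted.
    """
    target = eta * (win_prob - 0.5)
    return (log_ratio - target) ** 2
```

A response with `win_prob == 0.5` (a coin flip against the current policy) has a target log ratio of zero, so the policy is not moved for it.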
Key Characteristics
- Architecture: 9-billion-parameter decoder-only transformer (Gemma-2).
- Fine-tuning Method: Self-Play Preference Optimization (SPPO) for improved alignment.
- Training Data: Leverages synthetic datasets derived from UltraFeedback prompts.
- Language: Primarily English.
- Context Length: Supports a context length of 16384 tokens.
- License: Apache-2.0.
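As a Gemma-2 derivative, the model expects prompts in Gemma's chat format. A minimal single-turn formatter is sketched below for illustration; in practice you would load the model's tokenizer with the transformers library and call `tokenizer.apply_chat_template`, which produces this format for you:

```python
def format_gemma_chat(user_message: str) -> str:
    """Build a single-turn prompt in Gemma's chat format.

    Illustrative sketch only: the canonical way to do this is
    tokenizer.apply_chat_template from the transformers library,
    which also handles multi-turn conversations.
    """
    return (
        "<bos><start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )
```

The trailing `<start_of_turn>model\n` cues the model to begin its reply.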
Performance Insights
Benchmark results specific to this Gemma-2 Iteration 2 checkpoint are not provided for direct comparison with other models. However, the companion SPPO runs on Llama-3-8B show progressive improvements in AlpacaEval win rates across iterations, suggesting the SPPO method enhances instruction-following performance. For instance, Llama-3-8B-SPPO Iter2 achieved a 50.93% length-controlled (LC) win rate and a 44.64% raw win rate on AlpacaEval.
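For reference, the raw win rate cited above is simply the fraction of head-to-head comparisons a model wins against a baseline, with ties conventionally counted as half a win; the length-controlled variant additionally applies a regression adjustment for response length, which is not shown in this sketch:

```python
def win_rate(outcomes: list[str]) -> float:
    """Raw AlpacaEval-style win rate as a percentage.

    outcomes: one of "win", "tie", or "lose" per head-to-head
    comparison against the baseline model. Ties count as half
    a win. (The length-controlled win rate further adjusts for
    response length via a regression model; not shown here.)
    """
    score = sum(
        1.0 if o == "win" else 0.5 if o == "tie" else 0.0
        for o in outcomes
    )
    return 100.0 * score / len(outcomes)
```

So a model that wins half of its comparisons outright and loses the rest scores 50%.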
Use Cases
This model is suitable for general instruction-following tasks where a well-aligned model is beneficial. The SPPO fine-tuning aims to produce more helpful and harmless outputs, making the model a strong candidate for conversational AI and prompt-driven content generation.