UCLA-AGI/zephyr-7b-sft-full-SPIN-iter2

Text generation · Model size: 7B · Quantization: FP8 · Context length: 8K · Published: Jan 5, 2024 · License: MIT · Architecture: Transformer · Open weights

UCLA-AGI/zephyr-7b-sft-full-SPIN-iter2 is a 7 billion parameter GPT-like language model developed by UCLA-AGI, fine-tuned using a self-play approach. This model is the second iteration of fine-tuning from alignment-handbook/zephyr-7b-sft-full, leveraging synthetic data derived from the HuggingFaceH4/ultrachat_200k dataset. It is primarily English-language and demonstrates competitive performance on various benchmarks, including an average score of 63.54 on the Open LLM Leaderboard, making it suitable for general conversational AI and instruction-following tasks.


UCLA-AGI/zephyr-7b-sft-full-SPIN-iter2 Overview

This model is a 7 billion parameter language model developed by UCLA-AGI, representing the second iteration of a self-play fine-tuning (SPIN) process. It builds upon the alignment-handbook/zephyr-7b-sft-full base model, which itself is derived from mistralai/Mistral-7B-v0.1. The fine-tuning process utilizes synthetic data generated from the HuggingFaceH4/ultrachat_200k dataset, aiming to convert a weaker language model into a stronger one through iterative self-improvement.
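At the core of SPIN is a DPO-style pairwise objective: at each iteration, the model being trained is pushed to assign higher likelihood to real SFT responses than to responses generated by the previous iterate. The sketch below is a simplified per-example version of that objective, assuming the logistic loss form described in the SPIN paper; the `lam` value is a placeholder, not a hyperparameter reported for this model.

```python
import math

def logistic_loss(t: float) -> float:
    # l(t) = log(1 + exp(-t)): the logistic loss used as the
    # pairwise objective in the SPIN paper.
    return math.log(1.0 + math.exp(-t))

def spin_loss(logp_real_new: float, logp_real_old: float,
              logp_syn_new: float, logp_syn_old: float,
              lam: float = 0.1) -> float:
    """Per-example SPIN objective (illustrative sketch).

    logp_*_new: log-likelihood under the model being trained (iterate t+1).
    logp_*_old: log-likelihood under the frozen previous iterate (iterate t).
    'real' is a response from the SFT data (ultrachat_200k here);
    'syn'  is a response generated by the previous iterate.
    lam is a regularization weight (assumed value for illustration).
    """
    margin = lam * ((logp_real_new - logp_real_old)
                    - (logp_syn_new - logp_syn_old))
    return logistic_loss(margin)
```

The loss shrinks as the new model raises the likelihood of real responses relative to the synthetic ones, which is what drives the "weak to strong" improvement across iterations.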

Key Capabilities & Performance

  • Self-Play Fine-Tuning (SPIN): Employs an advanced training methodology detailed in the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335).
  • Benchmark Performance: Achieves an average score of 63.54 on the Open LLM Leaderboard. Notable scores include:
    • ARC (25-shot): 66.47
    • HellaSwag (10-shot): 85.82
    • MMLU (5-shot): 61.48
    • TruthfulQA (0-shot): 57.75
    • Winogrande (5-shot): 76.95
    • GSM8K (5-shot): 32.75
  • Language Support: Primarily English.
  • Context Length: Supports a context length of 8192 tokens.
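For inference, the model can be loaded with the Hugging Face transformers library like any other causal LM. The snippet below is a minimal sketch: the hand-written Zephyr-style chat markup in `build_prompt` is an assumption carried over from the zephyr-7b-sft-full base model, and in practice `tokenizer.apply_chat_template` should be preferred since it reads the template shipped with the checkpoint.

```python
MODEL_ID = "UCLA-AGI/zephyr-7b-sft-full-SPIN-iter2"

def build_prompt(user_message: str) -> str:
    # Zephyr-style chat format (assumed from the base model);
    # prefer tokenizer.apply_chat_template in real use.
    return f"<|user|>\n{user_message}</s>\n<|assistant|>\n"

if __name__ == "__main__":
    # Imported lazily so the prompt helper stays dependency-free.
    # Note: this downloads a 7B checkpoint and needs sufficient GPU memory.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(build_prompt("Explain self-play fine-tuning briefly."),
                       return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True))
```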

Training Details

The model was trained with a learning rate of 1e-07, a per-device batch size of 8 across 8 GPUs (total batch size 64), using the RMSProp optimizer and a linear learning rate scheduler over 2 epochs. This iterative fine-tuning approach, leveraging synthetic data, is a key differentiator in its development.
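The reported hyperparameters can be made concrete with a little arithmetic. The sketch below computes the effective batch size and a simple linear learning-rate decay matching the reported scheduler; the dataset size is a hypothetical value for illustration only (warmup is also omitted for simplicity).

```python
# Reported setup: per-device batch 8 on 8 GPUs -> effective batch 64.
PER_DEVICE_BATCH = 8
NUM_GPUS = 8
TOTAL_BATCH = PER_DEVICE_BATCH * NUM_GPUS

def linear_lr(step: int, total_steps: int, peak_lr: float = 1e-07) -> float:
    # Linear decay from the reported peak LR (1e-07) down to 0,
    # matching a linear scheduler without warmup.
    return peak_lr * (1.0 - step / total_steps)

# Hypothetical dataset size, just to make the step count concrete.
num_examples = 49_600
epochs = 2
steps_per_epoch = num_examples // TOTAL_BATCH
total_steps = steps_per_epoch * epochs
```

With these assumed numbers, training runs for `total_steps` optimizer updates while the learning rate falls linearly from 1e-07 toward zero.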