UCLA-AGI/zephyr-7b-sft-full-SPIN-iter3

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Jan 7, 2024 · License: MIT · Architecture: Transformer · Open Weights

UCLA-AGI/zephyr-7b-sft-full-SPIN-iter3 is a 7-billion-parameter, GPT-style language model developed by UCLA-AGI and fine-tuned with the self-play fine-tuning (SPIN) method at its third iteration. Based on Mistral-7B-v0.1, it is trained on synthetic data derived from the HuggingFaceH4/ultrachat_200k dataset. The model is primarily English-language, is designed for general-purpose natural language understanding and generation, and achieves an average score of 63.70 on the Open LLM Leaderboard.


Model Overview

UCLA-AGI/zephyr-7b-sft-full-SPIN-iter3 is a 7-billion-parameter language model developed by UCLA-AGI. It builds on alignment-handbook/zephyr-7b-sft-full, which is itself based on mistralai/Mistral-7B-v0.1. The model's distinguishing feature is self-play fine-tuning (SPIN): this checkpoint is the third SPIN iteration, trained on synthetic data derived from the HuggingFaceH4/ultrachat_200k dataset. SPIN aims to convert weaker language models into stronger ones through iterative self-improvement, with each round training against the previous round's outputs.
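Because the model is published on the Hugging Face Hub, it can be loaded through the standard transformers API. The snippet below is a minimal sketch: the dtype, sampling settings, and prompt are illustrative choices rather than settings published by UCLA-AGI.

```python
# Minimal sketch: load the checkpoint and generate a reply.
# dtype and sampling parameters are illustrative, not official.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "UCLA-AGI/zephyr-7b-sft-full-SPIN-iter3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 7B model fits on a single ~24 GB GPU in bf16
    device_map="auto",
)

# The base Zephyr SFT tokenizer ships a chat template; if this
# checkpoint's tokenizer lacks one, format the prompt manually instead.
messages = [{"role": "user", "content": "Summarize self-play fine-tuning in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```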

Key Capabilities & Performance

  • Self-Play Fine-Tuning (SPIN): Leverages a training method in which the model improves by playing against itself, generating its own training data (see the sketch after this list).
  • Synthetic Data Training: Fine-tuned on synthetic responses the model generates itself, paired against the human responses in ultrachat_200k, sharpening its ability to produce human-like text.
  • General Language Tasks: Primarily English-language, suitable for a wide range of natural language processing applications.
  • Open LLM Leaderboard Performance: Achieves an average score of 63.70, with notable scores including 85.85 on HellaSwag (10-shot) and 66.13 on ARC (25-shot).
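
To make the self-play mechanism concrete, the sketch below expresses the SPIN objective in PyTorch, following the logistic, DPO-style form described in the SPIN paper (arXiv:2401.01335): at each iteration the current model is rewarded for assigning a higher likelihood ratio, relative to the frozen previous-iteration "opponent", to the ground-truth response than to the opponent's own generation. The function name and the toy numbers are illustrative; this is not UCLA-AGI's training code.

```python
import torch
import torch.nn.functional as F

def spin_loss(policy_logp_real, policy_logp_synth,
              opponent_logp_real, opponent_logp_synth, beta=0.1):
    """Logistic SPIN-style loss. The current model (policy) should
    prefer the human-written response over the synthetic response
    sampled from the previous iteration's frozen model (opponent)."""
    real_margin = policy_logp_real - opponent_logp_real     # log-ratio on real data
    synth_margin = policy_logp_synth - opponent_logp_synth  # log-ratio on self-generated data
    return -F.logsigmoid(beta * (real_margin - synth_margin)).mean()

# Toy usage: per-sequence log-probabilities for a batch of four pairs.
policy_real = torch.tensor([-12.0, -10.5, -11.2, -9.8])
policy_synth = torch.tensor([-13.1, -12.0, -12.8, -11.5])
opponent_real = torch.tensor([-12.5, -11.0, -11.0, -10.2])
opponent_synth = torch.tensor([-12.2, -11.4, -12.0, -11.1])
print(spin_loss(policy_real, policy_synth, opponent_real, opponent_synth))
```

Each SPIN round regenerates the synthetic responses with the newly trained model and reapplies this objective; iter3 is the checkpoint after three such rounds.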

Ideal Use Cases

  • Research and Development: Excellent for exploring the impact of self-play fine-tuning on model capabilities.
  • General-Purpose NLP: Suitable for applications requiring robust English language understanding and generation.
  • Benchmarking: Can serve as a strong baseline for comparing new models or fine-tuning techniques, especially within the 7B parameter class.
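
For reproducing leaderboard-style numbers, EleutherAI's lm-evaluation-harness is the usual tool, and the Open LLM Leaderboard scores cited above are computed with it. The sketch below assumes the harness's 0.4+ Python API (`lm_eval.simple_evaluate`); task names and API details can vary across versions.

```python
# Hedged sketch: evaluate the model on two leaderboard tasks with
# EleutherAI's lm-evaluation-harness (pip install lm-eval).
import lm_eval

# Each task uses its leaderboard few-shot count, so run them separately.
for task, shots in [("hellaswag", 10), ("arc_challenge", 25)]:
    results = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=UCLA-AGI/zephyr-7b-sft-full-SPIN-iter3",
        tasks=[task],
        num_fewshot=shots,
    )
    print(task, results["results"][task])
```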