Nexusflow/Starling-LM-7B-beta

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Mar 19, 2024 | License: apache-2.0 | Architecture: Transformer | 0.3K | Open Weights | Warm

Starling-LM-7B-beta is a 7-billion-parameter language model developed by the Nexusflow team, fine-tuned from Openchat-3.5-0106 (itself based on Mistral-7B-v0.1) using Reinforcement Learning from AI Feedback (RLAIF). Training relies on a new reward model, Starling-RM-34B, and the Nectar ranking dataset, and yields an improved MT Bench score of 8.12 with GPT-4 as the judge. The model is optimized for helpful and harmless responses, which makes it suitable for general conversational AI applications.


Starling-LM-7B-beta: RLAIF-Tuned for Enhanced Performance

Starling-LM-7B-beta is a 7-billion-parameter language model developed by the Nexusflow team. It is fine-tuned from Openchat-3.5-0106, which is itself based on Mistral-7B-v0.1.

Key Capabilities & Training:

  • RLAIF Training: The model is trained with Reinforcement Learning from AI Feedback (RLAIF), in which a reward model scores candidate responses and the policy is optimized for helpfulness and harmlessness.
  • Advanced Reward Model: Training uses a new reward model, Nexusflow/Starling-RM-34B, which was itself trained on the berkeley-nest/Nectar ranking dataset.
  • Performance: Starling-LM-7B-beta achieves an improved score of 8.12 on MT Bench (judged by GPT-4), indicating strong multi-turn conversational ability.
  • Chat Template: The model expects the same chat template as Openchat-3.5-0106 and performs best when prompts follow it exactly. The template covers single-turn, multi-turn, and coding conversations.
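
To make the chat template requirement concrete, the following is a minimal sketch of a prompt builder. It assumes the OpenChat-3.5 convention of "GPT4 Correct User" / "GPT4 Correct Assistant" role tags (or "Code User" / "Code Assistant" for coding) separated by the `<|end_of_turn|>` token; this page does not spell those strings out, so check them against the chat template shipped with the model's tokenizer before relying on them. The names `build_prompt` and `END_OF_TURN` are hypothetical helpers, not part of any library.

```python
# Assumed OpenChat-3.5-style template; verify against the tokenizer's own
# chat template before use. Names here are illustrative, not a library API.
END_OF_TURN = "<|end_of_turn|>"


def build_prompt(messages, coding=False):
    """Format (role, text) pairs into a single prompt string.

    messages: list of ("user" | "assistant", text) tuples, oldest first.
    coding:   if True, use the coding role tags instead of the chat ones.
    """
    if coding:
        tags = {"user": "Code User", "assistant": "Code Assistant"}
    else:
        tags = {"user": "GPT4 Correct User", "assistant": "GPT4 Correct Assistant"}

    parts = []
    for role, text in messages:
        if role not in tags:
            raise ValueError(f"unknown role: {role!r}")
        # Each completed turn is closed with the end-of-turn token.
        parts.append(f"{tags[role]}: {text}{END_OF_TURN}")

    # Leave the assistant tag open so the model generates the next reply.
    parts.append(f"{tags['assistant']}:")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("user", "Hello")]))
    print(build_prompt(
        [("user", "Hello"), ("assistant", "Hi"), ("user", "How are you today?")]
    ))
    print(build_prompt([("user", "Implement quicksort using C++")], coding=True))
```

Under these assumptions, a multi-turn conversation is the single-turn format repeated, with the final assistant tag left open. If the Hugging Face `transformers` tokenizer for this model ships a chat template, using that template directly is likely safer than hand-building strings.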

When to Use This Model:

  • General Conversational AI: Ideal for applications requiring robust and helpful dialogue generation.
  • Research in RLAIF: Useful for researchers exploring advanced reinforcement learning techniques for language models.
  • Benchmarking: Can serve as a strong baseline for evaluating conversational AI systems, particularly given its MT Bench score.

Note: Users must adhere to the specified chat template for best results. The model is available for testing on LMSYS Chatbot Arena.