Nexusflow/Starling-LM-7B-beta
Starling-LM-7B-beta is a 7 billion parameter language model developed by the Nexusflow Team, fine-tuned from Openchat-3.5-0106 (based on Mistral-7B-v0.1) using Reinforcement Learning from AI Feedback (RLAIF). This model leverages a new reward model, Starling-RM-34B, and the Nectar ranking dataset, reaching an MT-Bench score of 8.12 with GPT-4 as judge, an improvement over its alpha predecessor. It is optimized for generating helpful and harmless responses, making it suitable for general conversational AI applications.
Starling-LM-7B-beta: RLAIF-Tuned for Enhanced Performance
Starling-LM-7B-beta is a 7 billion parameter language model developed by the Nexusflow Team. It is fine-tuned from Openchat-3.5-0106, which itself is based on Mistral-7B-v0.1.
Key Capabilities & Training:
- RLAIF Training: The model is trained using Reinforcement Learning from AI Feedback (RLAIF), a method that utilizes a reward model to optimize for helpfulness and harmlessness.
- Advanced Reward Model: It incorporates a new reward model, Nexusflow/Starling-RM-34B, trained on the berkeley-nest/Nectar ranking dataset.
- Performance: Starling-LM-7B-beta achieves an MT-Bench score of 8.12 (evaluated by GPT-4), improving on Starling-LM-7B-alpha and indicating strong conversational abilities.
- Chat Template Adherence: Requires a specific chat template for optimal performance, identical to Openchat-3.5-0106, supporting single-turn, multi-turn, and coding conversations (see the sketch after this list).
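Below is a minimal sketch of prompting the model with the Openchat-3.5-0106 template via Hugging Face transformers. The "GPT4 Correct User" / "GPT4 Correct Assistant" role markers follow the published OpenChat format; the prompt content, generation settings, and dtype/device choices are illustrative assumptions, so verify the exact template strings against the model card before relying on them.

```python
# Sketch: single-turn prompting of Starling-LM-7B-beta with the
# OpenChat-3.5-0106 chat template (assumed format; check the model card).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Nexusflow/Starling-LM-7B-beta"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# "GPT4 Correct User:" / "GPT4 Correct Assistant:" are literal role markers
# in the template, not placeholders; <|end_of_turn|> closes each turn.
prompt = "GPT4 Correct User: Hello, how are you?<|end_of_turn|>GPT4 Correct Assistant:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

For coding conversations the same template family reportedly swaps the role markers (e.g. "Code User:" / "Code Assistant:"); multi-turn prompts simply concatenate alternating user and assistant turns, each ended with <|end_of_turn|>.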
When to Use This Model:
- General Conversational AI: Ideal for applications requiring robust and helpful dialogue generation.
- Research in RLAIF: Useful for researchers exploring advanced reinforcement learning techniques for language models.
- Benchmarking: Can serve as a strong baseline for evaluating conversational AI systems, particularly given its MT Bench score.
Note: Users must adhere to the specified chat template for best results. The model is available for testing on LMSYS Chatbot Arena.