openchat/openchat
OpenChat is a series of 13 billion parameter open-source language models, including OpenChat and OpenChat-8192, fine-tuned on a small, high-quality dataset of multi-round conversations. Based on LLaMA, these models achieve high performance on benchmarks like Vicuna GPT-4 evaluation and AlpacaEval, demonstrating strong conversational abilities. OpenChat-8192 extends the context length to 8192 tokens, making it suitable for longer interactions.
OpenChat: High-Performance Conversational Models
OpenChat is a family of open-source language models, primarily based on the LLaMA-13B architecture, designed for multi-round conversational tasks. Its key differentiator is data-efficient training: the models are fine-tuned on a curated dataset of only ~6K GPT-4 conversations, far less data than many comparable models use.
Key Capabilities & Performance
- Efficient Fine-tuning: Achieves strong performance with a limited, high-quality dataset of multi-round conversations.
- Strong Conversational AI: The base OpenChat model (13B parameters, 2048-token context) scores 105.7% of ChatGPT's score on the Vicuna GPT-4 evaluation and achieves an 80.9% win rate on AlpacaEval.
- Extended Context: OpenChat-8192 (13B parameters) extends the context length to 8192 tokens while maintaining performance, scoring 106.6% of ChatGPT's score on the Vicuna GPT-4 evaluation and achieving a 79.5% win rate on AlpacaEval.
- Code Models: The series also includes OpenCoderPlus, based on StarCoderPlus, optimized for code generation with native 8192 context and competitive performance against ChatGPT.
Use Cases
- General-purpose Chatbots: Well suited to building multi-round conversational agents.
- Applications requiring longer context: OpenChat-8192 is suitable for use cases demanding extended conversational memory.
- Code Generation: OpenCoderPlus is specifically designed for programming-related tasks.
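To make the difference between the 2048-token and 8192-token context lengths concrete, here is a minimal sketch of trimming a multi-round conversation so that it fits a model's context window. It is not part of OpenChat: `fit_history`, `approx_token_count`, the `CONTEXT_TOKENS` keys, and the `reserve_for_reply` default are hypothetical names and values chosen for illustration, and the whitespace-based counter is only a rough stand-in for the model's real tokenizer.

```python
from typing import Callable, List

# Context lengths as stated in the text above; the dictionary keys are illustrative labels.
CONTEXT_TOKENS = {"openchat": 2048, "openchat_8192": 8192}


def approx_token_count(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def fit_history(
    turns: List[str],
    max_tokens: int,
    reserve_for_reply: int = 256,  # assumed head-room for the generated answer
    count: Callable[[str], int] = approx_token_count,
) -> List[str]:
    """Keep the most recent turns whose combined size fits the context budget."""
    budget = max_tokens - reserve_for_reply
    kept: List[str] = []
    used = 0
    # Walk backwards so the newest turns are kept first.
    for turn in reversed(turns):
        cost = count(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = ["user: hello there"] * 3000  # 3 approximate tokens per turn
    for name, limit in CONTEXT_TOKENS.items():
        print(name, len(fit_history(history, limit)))
```

Under these assumptions, the larger window simply lets more of the earlier conversation survive trimming. In practice the counter should be replaced with the tokenizer that ships with the chosen model, since word counts can differ substantially from token counts.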