xDAN-L1-Chat-RL-v1: A 7B Chat Model with Top MT-bench Performance
xDAN-L1-Chat-RL-v1, developed by xDAN-AI, is a 7 billion parameter chat-optimized language model that has achieved a top-ranking position on the MT-bench leaderboard. This model is distinguished by its strong performance in humanistic tasks, coding, and writing, making it a versatile option for various applications.
Key Capabilities & Performance
- MT-bench Leader: Achieves an average score of 8.35 on MT-bench, outperforming models like GPT-3.5 Turbo (20B) and Claude-v1 in overall performance.
- Competitive with GPT-4: Demonstrates performance close to GPT-4 on the first turn of MT-bench, scoring 8.875 compared to GPT-4's 8.95625.
- Broad Task Proficiency: Excels across humanistic tasks, coding, and writing, as highlighted by its MT-bench results.
- LM-Evaluation-Harness Scores: Achieves an average score of 68.38, with notable results in HellaSwag (85.81), Winogrande (78.85), and MMLU (63.21).
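Since MT-bench reports its overall score as the mean of the first-turn and second-turn averages, the figures above also pin down the implied second-turn score. A quick check (using only the two numbers quoted here):

```python
# MT-bench's overall score is the mean of the two per-turn averages,
# so the second-turn score follows from the overall and first-turn
# figures quoted above.
overall, first_turn = 8.35, 8.875
second_turn = 2 * overall - first_turn
print(round(second_turn, 3))  # 7.825
```

This kind of sanity check is useful when comparing leaderboard entries that report per-turn scores against ones that report only the overall average.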
Training Details
- Dataset: Trained using a mixed dataset including selections from OpenOrca, Intel Orca-DPO-Pairs, and a privately crafted dataset.
- Methodology: Supervised Fine-Tuning (SFT) on the mixed dataset, followed by Direct Preference Optimization (DPO) using a DPO-v2 preference dataset and trainer for further alignment.
Recommended Use Cases
- Chatbots and Conversational AI: Its strong chat optimization and MT-bench performance make it suitable for building responsive and capable conversational agents.
- Content Generation: Effective for tasks requiring creative writing or general text generation.
- Coding Assistance: Demonstrates proficiency in coding-related tasks, useful for developers seeking AI support.
- Resource-Constrained Environments: As a 7B model, it offers competitive performance with a smaller footprint compared to larger alternatives.
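For the chatbot use case, conversations must be flattened into the prompt format the model was trained on. This document does not specify the chat template, so the format below is a placeholder assumption; in practice, load the model's tokenizer with Hugging Face `transformers` and call `tokenizer.apply_chat_template`, which applies the template shipped with the model:

```python
def build_prompt(messages):
    """Flatten a chat history into a single prompt string.

    The "### Role:" template here is a generic stand-in for
    illustration only; the model's actual template should come from
    its tokenizer via apply_chat_template.
    """
    parts = []
    for msg in messages:
        parts.append(f"### {msg['role'].capitalize()}:\n{msg['content']}")
    # Leave an open assistant turn for the model to complete.
    parts.append("### Assistant:\n")
    return "\n\n".join(parts)

prompt = build_prompt([
    {"role": "user", "content": "Summarize DPO in one sentence."},
])
```

Using the tokenizer's own template matters: a mismatch between the training-time and inference-time format is a common cause of degraded chat quality.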