FuseAI/OpenChat-3.5-7B-Mixtral: A Fused Chat Model
This model is a 7-billion-parameter chat LLM developed by FuseAI as part of the larger FuseChat initiative. It is the result of pairwise knowledge fusion between two prominent chat models: OpenChat-3.5-7B and Nous-Hermes-2-Mixtral-8x7B-DPO. The fusion process aims to combine the collective knowledge and individual strengths of diverse LLMs into a single, more capable model without the increased memory requirements of Mixture-of-Experts (MoE) models.
Key Capabilities & Features
- Knowledge Fusion: Utilizes a "fuse-then-merge" strategy to integrate knowledge from multiple source LLMs.
- Memory Efficiency: Unlike MoE models, it consolidates knowledge into a single LLM, avoiding additional memory overhead during inference.
- Strong MT-Bench Performance: As an intermediate model in the FuseChat pipeline, it contributes to the final FuseChat-7B-VaRM model's MT-Bench score of 8.22.
- Flexible Integration: Designed to support a "plug-and-play" fusion of new source LLMs.
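The "fuse-then-merge" strategy above ends with a merge step: knowledge from each source LLM is first distilled into target models of identical architecture, and the resulting checkpoints are then combined weight-by-weight. The sketch below illustrates only the general idea of such a merge as a plain linear combination over toy state dicts; the `merge_state_dicts` helper and the uniform weights are illustrative assumptions, not FuseAI's actual VaRM weighting scheme.

```python
def merge_state_dicts(state_dicts, weights):
    """Linearly combine several same-architecture checkpoints.

    Each state dict maps parameter names to lists of floats; in a real
    pipeline these would be tensors. Weights must sum to 1. This is a
    toy stand-in for FuseChat's merge step, not the VaRM algorithm.
    """
    assert abs(sum(weights) - 1.0) < 1e-9, "merge weights must sum to 1"
    merged = {}
    for name in state_dicts[0]:
        params = [sd[name] for sd in state_dicts]
        merged[name] = [
            sum(w * p[i] for w, p in zip(weights, params))
            for i in range(len(params[0]))
        ]
    return merged

# Two toy "checkpoints" with identical parameter names.
a = {"layer.weight": [1.0, 2.0]}
b = {"layer.weight": [3.0, 6.0]}
print(merge_state_dicts([a, b], [0.5, 0.5]))  # {'layer.weight': [2.0, 4.0]}
```

Because the merge produces a single dense checkpoint, inference cost stays that of one 7B model, which is the memory advantage over MoE noted above.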
Evaluation & Performance
While FuseAI/OpenChat-3.5-7B-Mixtral is an intermediate target LLM in the fusion pipeline, the final FuseChat-7B-VaRM model, which merges this and the other fused target models, demonstrates competitive performance: it scores 8.22 on MT-Bench, outperforming models like Starling-7B and Yi-34B-Chat and even surpassing GPT-3.5 (March) and Claude-2.1.
Good For
- Developers interested in memory-efficient chat models that leverage the strengths of multiple LLMs.
- Applications requiring a robust 7B parameter model for general conversational AI.
- Research into model merging and knowledge fusion techniques.
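Since the fused checkpoint keeps OpenChat-3.5's single-model architecture, it can plausibly be prompted with OpenChat's conversation template. The helper below is a minimal sketch of building such a prompt string; the "GPT4 Correct ..." role markers and `<|end_of_turn|>` separator are taken from the OpenChat-3.5 source model, and whether this fused checkpoint uses the identical template is an assumption.

```python
def build_openchat_prompt(turns):
    """Format a conversation in the OpenChat-3.5 style.

    `turns` is a list of (role, message) pairs with role "user" or
    "assistant". The template here follows OpenChat-3.5's convention;
    using it for this fused model is an assumption, not documented fact.
    """
    role_tag = {
        "user": "GPT4 Correct User",
        "assistant": "GPT4 Correct Assistant",
    }
    prompt = ""
    for role, message in turns:
        prompt += f"{role_tag[role]}: {message}<|end_of_turn|>"
    # Leave the assistant tag open so the model generates the reply.
    prompt += "GPT4 Correct Assistant:"
    return prompt

print(build_openchat_prompt([("user", "Hello!")]))
# GPT4 Correct User: Hello!<|end_of_turn|>GPT4 Correct Assistant:
```

The resulting string can be passed to any standard causal-LM inference stack; in practice, prefer the tokenizer's own chat template if the published checkpoint ships one.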