Merlinite-7b: An Aligned Mistral-7B Derivative
Merlinite-7b is a 7-billion-parameter language model developed by IBM Research, based on the Mistral-7B-v0.1 architecture. It is notable for its alignment-tuning method, LAB (Large-scale Alignment for chatBots), which uses Mixtral-8x7B-Instruct as a teacher model to generate synthetic training data.
Key Capabilities & Features
- LAB Methodology: Employs a three-component approach to alignment:
  - Taxonomy-driven data curation: Uses a tree of seed examples to prompt the teacher model, ensuring diverse and targeted synthetic data generation across knowledge domains and skills.
  - Large-scale synthetic data generation: Efficiently creates high-quality training data by sampling examples locally within each taxonomy branch, achieving competitive results even though the teacher, Mixtral-8x7B, is smaller than GPT-4.
  - Two-phased training with replay buffers: Mitigates catastrophic forgetting and enables incremental learning of new knowledge and skills.
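The taxonomy-driven curation step can be illustrated with a minimal sketch. The taxonomy structure, leaf names, and prompt wording below are hypothetical stand-ins; the real LAB taxonomy and prompting pipeline are considerably more elaborate. The key idea shown is sampling seed examples *locally* within each leaf so that teacher prompts stay on-topic for that branch:

```python
import random

# Hypothetical miniature taxonomy: branch -> leaf -> human-written seed examples.
# In LAB, leaves like these anchor the teacher model's synthetic generation.
TAXONOMY = {
    "knowledge": {
        "geography": ["Q: What is the capital of France? A: Paris."],
        "arithmetic": ["Q: What is 7 * 8? A: 56."],
    },
    "skills": {
        "summarization": ["Summarize: 'LLMs are large neural networks.'"],
    },
}

def build_teacher_prompts(taxonomy, n_per_leaf=2, seed=0):
    """Sample seeds within each leaf and wrap them in a generation prompt."""
    rng = random.Random(seed)
    prompts = []
    for branch, leaves in taxonomy.items():
        for leaf, seeds in leaves.items():
            for _ in range(n_per_leaf):
                # Local sampling: seeds come only from this leaf,
                # keeping the generated data targeted to it.
                example = rng.choice(seeds)
                prompts.append(
                    f"[{branch}/{leaf}] Generate a new example "
                    f"in the style of: {example}"
                )
    return prompts

prompts = build_teacher_prompts(TAXONOMY)
print(len(prompts))  # 2 prompts per leaf across 3 leaves -> 6
```

Each prompt would then be sent to the teacher model (Mixtral-8x7B-Instruct in LAB) and the responses collected as synthetic training data.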
- Performance: Achieves an MTBench (Avg) score of 7.66 and an MMLU (5-shot) score of 64.88, outperforming several 7B and 13B models, including Mistral-7B-Instruct-v0.2 and Orca-2-13b, on certain benchmarks.
- Incremental Learning: Designed to add new knowledge and skills without degrading previously learned capabilities.
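The replay-buffer idea behind the two-phased training can be sketched as follows. This is an illustrative simplification under assumed parameters (batch size, replay fraction), not the exact LAB recipe: a fraction of earlier-phase examples is mixed into each batch of the new phase so previously learned material keeps appearing during later training:

```python
import random

def make_phase_batches(new_data, replay_buffer, batch_size=4,
                       replay_fraction=0.25, seed=0):
    """Mix replayed earlier-phase examples into each new-phase batch.

    Replaying old data alongside new data is a standard way to limit
    catastrophic forgetting during sequential (phased) training.
    """
    rng = random.Random(seed)
    n_replay = int(batch_size * replay_fraction)
    n_fresh = batch_size - n_replay
    batches = []
    for i in range(0, len(new_data), n_fresh):
        fresh = new_data[i:i + n_fresh]
        replayed = rng.sample(replay_buffer,
                              min(n_replay, len(replay_buffer)))
        batches.append(fresh + replayed)
    return batches

# Phase 1 trains on knowledge data; phase 2 on skills data,
# with phase-1 examples replayed into every phase-2 batch.
phase1_data = [f"knowledge_{i}" for i in range(12)]
phase2_data = [f"skill_{i}" for i in range(9)]
batches = make_phase_batches(phase2_data, replay_buffer=phase1_data)
```

With these settings, every phase-2 batch contains one replayed phase-1 example alongside three new ones.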
Good For
- Conversational AI: Its alignment method and performance make it suitable for chatbot applications.
- Research in Alignment: Demonstrates an effective synthetic data-based alignment strategy for LLMs.
- Applications requiring continuous learning: The LAB approach supports adding new domain-specific knowledge and skills incrementally.
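For chatbot use, inference requires formatting conversations into the model's expected chat layout. The tag format below (`<|system|>` / `<|user|>` / `<|assistant|>`) is an assumption for illustration; in practice one would rely on the chat template shipped with the model (e.g. via `tokenizer.apply_chat_template` in Hugging Face transformers) rather than hand-building prompts:

```python
# Hypothetical sketch of building a chat prompt for an aligned chat model.
# The tag format here is an assumption, not confirmed for merlinite-7b;
# check the model card's chat template before relying on it.

def format_chat(system: str, user: str) -> str:
    """Assemble a single-turn prompt ending where the model should respond."""
    return (
        f"<|system|>\n{system}\n"
        f"<|user|>\n{user}\n"
        f"<|assistant|>\n"
    )

prompt = format_chat(
    "You are a helpful assistant.",
    "Explain the LAB alignment method in one sentence.",
)
```

Ending the prompt at the assistant tag signals the model to generate the reply; a generation call would then decode from this string.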