FuseChat-Llama-3.2-3B-Instruct: Implicit Model Fusion for Enhanced Performance
FuseChat-Llama-3.2-3B-Instruct is a 3 billion parameter model from the FuseChat-3.0 series, developed by FuseAI. It represents a novel approach to enhancing smaller LLMs by implicitly fusing the strengths of multiple larger, more robust source models. This is achieved through a two-stage training pipeline: Supervised Fine-Tuning (SFT) to align the target model with high-quality responses, and Direct Preference Optimization (DPO) to learn preferences from diverse source LLMs.
Key Capabilities
- Implicit Model Fusion (IMF): Transfers capabilities from powerful source LLMs (Gemma-2-27B-It, Mistral-Large-Instruct-2407, Qwen-2.5-72B-Instruct, Llama-3.1-70B-Instruct) to a smaller target model (Llama-3.2-3B-Instruct), sidestepping the challenges of explicit knowledge transfer, such as aligning token distributions across heterogeneous tokenizers.
- Two-Stage Training: Utilizes SFT to reduce distributional discrepancies between the target and source models, followed by DPO with preference pairs built from each source model's best and worst responses to the same prompt.
- Broad Task Improvement: Demonstrates enhanced performance in general conversation, instruction following, mathematics, and coding.
- Specialized Dataset Construction: Trained on a diverse dataset including UltraFeedback, Magpie-Pro-DPO, HelpSteer2 for instruction following; OpenMathInstruct-2 for mathematics; Leetcode and self-oss-instruct-sc2 for coding; and Chinese-specific datasets.
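The DPO stage described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the FuseAI team's actual code: the function name, field names, and reward-model scores are assumptions. The key idea it demonstrates is that the best and worst responses are paired *within* each source model, so the preference signal reflects response quality rather than stylistic differences between source models.

```python
def build_dpo_pairs(samples):
    """Build DPO preference pairs from reward-scored source-model responses.

    samples: list of dicts with keys 'prompt', 'source_model',
    'response', and 'reward' (a reward-model score; hypothetical schema).
    Returns one (chosen, rejected) pair per (prompt, source_model) group.
    """
    # Group candidate responses by prompt AND source model, so that
    # each preference pair compares responses from the same source LLM.
    by_key = {}
    for s in samples:
        by_key.setdefault((s["prompt"], s["source_model"]), []).append(s)

    pairs = []
    for (prompt, source), group in by_key.items():
        if len(group) < 2:
            continue  # need at least two candidates to form a pair
        best = max(group, key=lambda s: s["reward"])
        worst = min(group, key=lambda s: s["reward"])
        if best["reward"] > worst["reward"]:
            pairs.append({
                "prompt": prompt,
                "chosen": best["response"],
                "rejected": worst["response"],
                "source_model": source,
            })
    return pairs
```

Pairs in this format can then be fed to a standard DPO trainer (e.g. TRL's `DPOTrainer`) to optimize the SFT-tuned target model toward the source models' preferred responses.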
Good for
- Resource-constrained environments: Offers improved capabilities in a compact 3B parameter size.
- Applications requiring strong instruction following: Achieves a 54.0% win rate on AlpacaEval-2 and 30.2% on Arena-Hard, significantly outperforming the base Llama-3.2-3B-Instruct.
- Mathematical and coding tasks: Shows notable improvements in MATH (53.1%), AMC23 (35.0%), and LiveCodeBench (9.0%).
- Developers seeking a versatile small model: Delivers balanced gains across a wide range of tasks, with an average score of 40.2% across 14 benchmarks.