FuseAI/FuseChat-Llama-3.2-3B-Instruct

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Dec 6, 2024 · Architecture: Transformer

FuseAI/FuseChat-Llama-3.2-3B-Instruct is a 3 billion parameter instruction-tuned language model developed by FuseAI, part of the FuseChat-3.0 series. This model is created through an implicit model fusion process, leveraging Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to integrate capabilities from larger source LLMs like Gemma-2-27B-It, Mistral-Large-Instruct-2407, Qwen-2.5-72B-Instruct, and Llama-3.1-70B-Instruct into a more compact Llama-3.2-3B-Instruct base. It excels in general conversation, instruction following, mathematics, and coding tasks, demonstrating an average performance improvement of 5.0 points over the base Llama-3.2-3B-Instruct across 14 benchmarks.


FuseChat-Llama-3.2-3B-Instruct: Implicit Model Fusion for Enhanced Performance

FuseChat-Llama-3.2-3B-Instruct is a 3 billion parameter model from the FuseChat-3.0 series, developed by FuseAI. It represents a novel approach to enhancing smaller LLMs by implicitly fusing the strengths of multiple larger, more robust source models. This is achieved through a two-stage training pipeline: Supervised Fine-Tuning (SFT) to align the target model with high-quality responses, and Direct Preference Optimization (DPO) to learn preferences from diverse source LLMs.

Key Capabilities

  • Implicit Model Fusion (IMF): Transfers capabilities from powerful source LLMs (Gemma-2-27B-It, Mistral-Large-Instruct-2407, Qwen-2.5-72B-Instruct, Llama-3.1-70B-Instruct) to a smaller target model (Llama-3.2-3B-Instruct), avoiding the alignment difficulties of explicit knowledge transfer.
  • Two-Stage Training: Uses SFT to reduce distributional discrepancies between the target and source models, then DPO with preference pairs built from the source models' best and worst responses to each prompt.
  • Broad Task Improvement: Demonstrates enhanced performance in general conversation, instruction following, mathematics, and coding.
  • Specialized Dataset Construction: Trained on a diverse dataset including UltraFeedback, Magpie-Pro-DPO, HelpSteer2 for instruction following; OpenMathInstruct-2 for mathematics; Leetcode and self-oss-instruct-sc2 for coding; and Chinese-specific datasets.
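The preference-pair construction described above (best and worst source-model responses per prompt) can be sketched as follows. This is a hypothetical illustration, not FuseAI's released code; the `build_preference_pair` helper, the reward scores, and the response texts are all made up for the example.

```python
# Hypothetical sketch of DPO preference-pair construction: for one prompt,
# collect responses from several source LLMs, rank them by a reward score,
# and keep the best as "chosen" and the worst as "rejected".

def build_preference_pair(responses):
    """responses: list of (source_model, response_text, reward) tuples
    for a single prompt. Returns one DPO preference pair."""
    ranked = sorted(responses, key=lambda r: r[2], reverse=True)
    best, worst = ranked[0], ranked[-1]
    return {
        "chosen": best[1],
        "rejected": worst[1],
        "chosen_model": best[0],
        "rejected_model": worst[0],
    }

# Illustrative reward-model scores (placeholders, not real evaluations):
responses = [
    ("Gemma-2-27B-It", "Response A", 0.82),
    ("Qwen-2.5-72B-Instruct", "Response B", 0.91),
    ("Llama-3.1-70B-Instruct", "Response C", 0.47),
]
pair = build_preference_pair(responses)
```

The resulting `chosen`/`rejected` pairs are the inputs the DPO stage trains on.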

Good for

  • Resource-constrained environments: Offers improved capabilities in a compact 3B parameter size.
  • Applications requiring strong instruction following: Achieves 54.0% on AlpacaEval-2 and 30.2% on Arena-Hard, significantly outperforming the base Llama-3.2-3B-Instruct.
  • Mathematical and coding tasks: Shows notable improvements on MATH (53.1%), AMC23 (35.0%), and LiveCodeBench (9.0%).
  • Developers seeking a versatile small model: Provides a balanced improvement across a wide range of benchmarks, with an average score of 40.2% across 14 benchmarks.
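The DPO stage of the pipeline optimizes the standard Direct Preference Optimization objective over preference pairs. A minimal numeric sketch is below; the β value and log-probabilities are illustrative placeholders, not values from the FuseChat training run.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for a single preference pair:
    -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l))),
    where pi_* are policy log-probs and ref_* are frozen reference log-probs."""
    margin = beta * ((policy_logp_chosen - ref_logp_chosen)
                     - (policy_logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Illustrative log-probs: the policy favors the chosen response more
# strongly than the reference model does, so the margin is positive
# and the loss drops below log(2) (the value at zero margin).
loss = dpo_loss(-12.0, -15.0, -13.0, -14.0, beta=0.1)
```

Minimizing this loss pushes the target model toward the source models' preferred responses while the reference term keeps it anchored near its SFT initialization.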