FuseChat-Qwen-2.5-7B-Instruct: Implicit Model Fusion
FuseChat-Qwen-2.5-7B-Instruct is a 7.6-billion-parameter model from the FuseChat-3.0 series, developed by FuseAI. It applies an implicit model fusion (IMF) approach to transfer capabilities from powerful source LLMs (Gemma-2-27B-It, Mistral-Large-Instruct-2407, Qwen-2.5-72B-Instruct, and Llama-3.1-70B-Instruct) into the smaller Qwen-2.5-7B-Instruct target model.
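For quick experimentation, the model can be loaded with Hugging Face `transformers`. The snippet below is a minimal sketch assuming the repository id `FuseAI/FuseChat-Qwen-2.5-7B-Instruct` and the standard Qwen-2.5 chat template; check the official model card for the exact recommended generation settings.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repository id; verify against the model card.
model_id = "FuseAI/FuseChat-Qwen-2.5-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Explain model fusion in one sentence."}]
# Qwen-2.5 models ship a chat template; apply it and append the generation prompt.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```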
Key Capabilities & Training
The IMF process involves a two-stage training pipeline:
- Supervised Fine-Tuning (SFT): Mitigates distribution discrepancies between the source and target models by fine-tuning on high-quality responses from the source models.
- Direct Preference Optimization (DPO): Learns preferences from multiple source LLMs by training on best/worst response pairs, further enhancing performance (a minimal sketch of the DPO objective follows this list).
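As a rough illustration of stage two, here is a minimal sketch of the standard DPO loss (Rafailov et al.) applied to such best/worst pairs. The tensor names and the `beta` value are illustrative assumptions, not FuseChat's actual hyperparameters.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss over summed log-probs of best/worst responses.

    Each argument is a (batch,) tensor holding the log-probability of a
    whole response under the policy or the frozen reference model.
    beta=0.1 is an illustrative value, not FuseChat's setting.
    """
    # Log-ratio of policy to reference for the preferred ("best") responses
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    # Log-ratio for the dispreferred ("worst") responses
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between best and worst via a logistic loss
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```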
The model was trained on a diverse dataset of 158,667 entries covering instruction following, general conversation, mathematics, coding, and Chinese-language tasks. This includes data from UltraFeedback, OpenMathInstruct-2, and LeetCode, with responses sampled from the larger source models and scored by an external reward model such as ArmoRM.
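The best/worst pairs for DPO can be built by scoring every sampled response with the reward model and keeping the extremes. The snippet below is a hypothetical sketch of that selection step; `score_fn` stands in for an ArmoRM-style scorer and is not the actual FuseChat pipeline code.

```python
from typing import Callable

def build_preference_pair(prompt: str, responses: list[str],
                          score_fn: Callable[[str, str], float]) -> dict:
    """Pick the best- and worst-scoring responses for one prompt.

    score_fn(prompt, response) -> float stands in for an external reward
    model such as ArmoRM; this selection logic is illustrative only.
    """
    # Sort the sampled responses by reward score, ascending
    ranked = sorted(responses, key=lambda r: score_fn(prompt, r))
    worst, best = ranked[0], ranked[-1]
    # Return the pair in the (prompt, chosen, rejected) layout DPO expects
    return {"prompt": prompt, "chosen": best, "rejected": worst}
```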
Performance Highlights
FuseChat-Qwen-2.5-7B-Instruct demonstrates significant improvements across various benchmarks, particularly in instruction following: it achieved 63.6% on AlpacaEval-2 and 61.4% on Arena-Hard, substantial gains over the base Qwen-2.5-7B-Instruct model. It also performs strongly on MT-Bench and AMC 23 while maintaining competitive scores in mathematics and coding, yielding an overall average improvement across 14 benchmarks.