FuseAI/FuseChat-Llama-3.1-8B-SFT
FuseAI/FuseChat-Llama-3.1-8B-SFT is an 8-billion-parameter instruction-tuned language model developed by FuseAI and built from Llama-3.1-8B-Instruct. It uses an implicit model fusion (IMF) approach, a pipeline of Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO), to integrate capabilities from larger source LLMs such as Gemma-2-27B-It, Mistral-Large-Instruct-2407, Qwen-2.5-72B-Instruct, and Llama-3.1-70B-Instruct. The model targets general conversation, instruction following, mathematics, and coding, with a reported average improvement of 6.8 points across 14 benchmarks and notable gains on AlpacaEval-2 and Arena-Hard.
FuseChat-Llama-3.1-8B-SFT: Implicit Model Fusion for Enhanced Performance
FuseChat-Llama-3.1-8B-SFT is an 8-billion-parameter instruction-tuned model developed by FuseAI, designed to enhance the capabilities of smaller LLMs by implicitly learning from multiple strong open-source LLMs. It is trained with a novel implicit model fusion (IMF) method, a two-stage pipeline consisting of Supervised Fine-Tuning (SFT) followed by Direct Preference Optimization (DPO); this checkpoint corresponds to the SFT stage of that pipeline.
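As an illustration of what such a two-stage pipeline can look like in practice, the sketch below uses the trl library's SFTTrainer and DPOTrainer. This is not FuseAI's released training code: the dataset files, hyperparameters, and output paths are placeholders, and it assumes a recent trl version that accepts model names as strings and preference data in prompt/chosen/rejected format.

```python
# Illustrative two-stage pipeline (SFT followed by DPO) using the trl library.
# NOT FuseAI's training code: dataset files, hyperparameters, and paths are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer, DPOConfig, DPOTrainer

base_model = "meta-llama/Llama-3.1-8B-Instruct"

# Stage 1: supervised fine-tuning on responses gathered from the larger source LLMs.
sft_data = load_dataset("json", data_files="fused_sft_data.jsonl", split="train")  # hypothetical file
SFTTrainer(
    model=base_model,
    train_dataset=sft_data,
    args=SFTConfig(output_dir="fusechat-llama-3.1-8b-sft", num_train_epochs=1),
).train()

# Stage 2: direct preference optimization on preference pairs built from the source LLMs' outputs.
dpo_data = load_dataset("json", data_files="fused_dpo_pairs.jsonl", split="train")  # hypothetical file
DPOTrainer(
    model="fusechat-llama-3.1-8b-sft",
    args=DPOConfig(output_dir="fusechat-llama-3.1-8b-dpo", beta=0.1),
    train_dataset=dpo_data,
).train()
```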
Key Capabilities
- Enhanced Instruction Following: Improves by 37.1 points on AlpacaEval-2 and 30.1 points on Arena-Hard over the base Llama-3.1-8B-Instruct.
- Broad Task Proficiency: Demonstrates substantial gains across general conversation, mathematics, and coding tasks.
- Knowledge Integration: Effectively transfers capabilities from powerful source LLMs (Gemma-2-27B-It, Mistral-Large-Instruct-2407, Qwen-2.5-72B-Instruct, Llama-3.1-70B-Instruct) into a more compact 8B parameter model.
- Competitive Performance: Outperforms AllenAI's Llama-3.1-Tulu-3-8B on most benchmarks, and improves on the base Llama-3.1-8B-Instruct by an average of 6.8 points across 14 diverse benchmarks.
Good for
- General-purpose conversational AI: Excels in instruction following and general conversation scenarios.
- Mathematical problem-solving: Shows strong performance in mathematics benchmarks.
- Code generation and understanding: Improved capabilities in coding tasks.
- Resource-efficient deployment: Offers enhanced performance in a more compact 8B parameter size, making it suitable for applications where larger models are impractical.
- Developers seeking robust, instruction-tuned models: Provides a strong foundation for various AI applications requiring high-quality responses and adherence to instructions.
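For reference, a minimal sketch of running the model with Hugging Face transformers is shown below. The dtype, device placement, and sampling parameters are illustrative assumptions, not official recommendations.

```python
# Minimal sketch: chat-style inference with FuseChat-Llama-3.1-8B-SFT via transformers.
# Dtype, device placement, and sampling settings below are illustrative choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FuseAI/FuseChat-Llama-3.1-8B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Write a Python function that checks whether a number is prime."},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```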