FuseO1-DeepSeekR1-Qwen2.5-Instruct-32B-Preview Overview
This model, developed by FuseAI, is a 32.8-billion-parameter language model that aims to strengthen System-II reasoning through model fusion. It applies a 'Long-Short Reasoning Merging' approach, combining the strengths of long-CoT (chain-of-thought) and short-CoT LLMs: the source models are DeepSeek-R1-Distill-Qwen-32B (long-CoT) and Qwen2.5-32B-Instruct (short-CoT).
Key Capabilities & Performance
- Enhanced Reasoning: Designed to combine the thorough long-CoT reasoning of its DeepSeek-R1 parent with the concise short-CoT style of its Qwen2.5-Instruct parent.
- Mathematics: Achieves 68.6 Pass@1 and 83.3 Cons@32 on AIME24, and 94.6 on MATH500, a clear improvement over both source models.
- Scientific Reasoning: Scores 55.1 on GPQA-Diamond and 68.6 on MMLU-Pro.
- Code Reasoning: While not the coding-focused variant of the family, it contributes to the FuseO1 family's overall code-reasoning capabilities.
Unique Approach
This model is part of FuseAI's first attempt to fuse multiple open-source LLMs using its SCE merging method. The goal is to consolidate the distinct knowledge and strengths of several reasoning LLMs into a single model with robust System-II reasoning across the mathematics, coding, and science domains.
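The idea behind SCE-style merging can be sketched as a per-position merge of parameter deltas. The toy below is a minimal pure-Python illustration, not FuseAI's actual pipeline (which operates on full model checkpoints, e.g. via mergekit): the function name, the squared-magnitude weighting, and the list-based "tensors" are all illustrative assumptions.

```python
def sce_merge(base, models, density=0.5):
    """Toy SCE-style merge of flat parameter lists into a base model.

    Select:    keep the `density` fraction of positions whose deltas vary
               most across the source models.
    Calculate: weight each model's delta by its squared magnitude.
    Erase:     drop contributions whose sign opposes the weighted majority.
    """
    n = len(base)
    deltas = [[m[i] - base[i] for i in range(n)] for m in models]

    def variance(vals):
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals) / len(vals)

    # Select: rank positions by the cross-model variance of their deltas
    var = [variance([d[i] for d in deltas]) for i in range(n)]
    k = max(1, int(density * n))
    keep = sorted(range(n), key=lambda i: -var[i])[:k]

    merged = list(base)
    for i in keep:
        col = [d[i] for d in deltas]
        weights = [v * v for v in col]                       # Calculate
        direction = sum(w * v for w, v in zip(weights, col))
        kept = [(w, v) for w, v in zip(weights, col)
                if v * direction >= 0]                       # Erase
        total = sum(w for w, _ in kept) or 1.0
        merged[i] = base[i] + sum(w * v for w, v in kept) / total
    return merged
```

In this sketch, positions where the source models agree are averaged with magnitude-proportional weights, while minority-sign contributions are discarded so that conflicting updates do not cancel each other out.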
When to Use This Model
- Complex Reasoning Tasks: Ideal for applications requiring robust step-by-step reasoning, especially in mathematical problem-solving.
- Instruction Following: Benefits from the instruction-tuned component, making it suitable for general instruction-following tasks.
- Research & Development: Useful for exploring the benefits of model fusion for enhanced reasoning.
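Since the merge inherits chat formatting from its Qwen2.5-32B-Instruct parent, prompting it for step-by-step mathematical reasoning can be sketched as below. The ChatML markers follow the Qwen2.5 convention and the \boxed{} instruction follows common DeepSeek-R1 usage guidance; `build_prompt` and the default system message are illustrative assumptions, and in practice `tokenizer.apply_chat_template` from Hugging Face transformers is the safer route.

```python
# Hypothetical helper: format a question in the ChatML style used by
# Qwen2.5-family models, which this merge inherits from its instruct parent.
DEFAULT_SYSTEM = (
    "Please reason step by step, and put your final answer within \\boxed{}."
)

def build_prompt(question: str, system: str = DEFAULT_SYSTEM) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_prompt("What is the sum of the first 10 positive integers?")
```

The resulting string ends with the assistant turn opener, so the model's generation continues directly as its reasoning trace.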