Overview
prithivMLmods/Qwen2.5-32B-DeepSeek-R1-Instruct is a 32.8-billion-parameter language model created by prithivMLmods. It was produced by merging several Qwen2.5-based models with the TIES merge method via MergeKit.
Merge Details
This model uses Qwen/Qwen2.5-32B-Instruct as its foundational base. It strategically combines two distinct models:
- Qwen/QwQ-32B-Preview
- deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
Both merged models contribute with equal weight and density, with the aim of combining their respective strengths. The merge configuration also enables weight normalization, int8 masking, and bfloat16 precision, settings intended to preserve model quality after the merge.
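A MergeKit configuration consistent with this description might look like the following. This is a hedged reconstruction from the settings stated above, not the published config; the exact weight and density values are assumptions.

```yaml
# Hypothetical mergekit config reconstructed from the description above;
# the published configuration may use different weight/density values.
models:
  - model: Qwen/QwQ-32B-Preview
    parameters:
      weight: 1.0
      density: 1.0
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
    parameters:
      weight: 1.0
      density: 1.0
merge_method: ties
base_model: Qwen/Qwen2.5-32B-Instruct
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
```

Saved as e.g. `config.yml`, such a file would be run with `mergekit-yaml config.yml ./output-dir`.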
Key Characteristics
- Architecture: Based on the Qwen2.5-32B-Instruct family.
- Parameter Count: 32.8 billion parameters.
- Context Length: Supports a context length of 131,072 tokens.
- Merge Method: Utilizes the TIES (TrIm, Elect Sign & Merge) method for combining model weights.
- Precision: Optimized for bfloat16 operations.
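The TIES procedure named above can be sketched on toy weight vectors. This is a minimal illustration of the three steps (trim low-magnitude deltas, elect a per-parameter sign, merge only agreeing values), not MergeKit's actual implementation; all names here are illustrative.

```python
import numpy as np

def ties_merge(base, finetuned, density=0.5, lam=1.0):
    """Toy TIES merge: trim, elect sign, then a disjoint mean of agreeing deltas."""
    # Task vectors: each fine-tuned model's delta from the shared base.
    tvs = [ft - base for ft in finetuned]
    # Trim: keep only the top-`density` fraction of each delta by magnitude.
    trimmed = []
    for tv in tvs:
        k = max(1, int(round(density * tv.size)))
        thresh = np.sort(np.abs(tv))[::-1][k - 1]
        trimmed.append(np.where(np.abs(tv) >= thresh, tv, 0.0))
    # Elect sign: per-parameter sign of the magnitude-weighted sum.
    elected = np.sign(sum(trimmed))
    # Merge: average only the values whose sign agrees with the elected sign.
    agree = [np.where(np.sign(tv) == elected, tv, 0.0) for tv in trimmed]
    counts = np.maximum(sum(np.abs(np.sign(a)) for a in agree), 1.0)
    merged_tv = sum(agree) / counts
    return base + lam * merged_tv

base = np.zeros(4)
model_a = np.array([0.9, -0.1, 0.5, 0.0])
model_b = np.array([0.8, 0.7, -0.6, 0.0])
print(ties_merge(base, [model_a, model_b], density=0.5))
```

Note how the second and third parameters each survive from only one model: the conflicting, lower-magnitude delta is dropped rather than averaged in, which is the interference-resolution idea behind TIES.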
Intended Use
This model is suitable for developers and researchers who want a powerful instruction-following model that integrates the capabilities of multiple high-performing base models. Its merged nature suggests balanced performance across a range of tasks, drawing on the diverse training data and architectural nuances of its component models.