huihui-ai/DeepSeek-R1-Distill-Qwen-Coder-32B-Fusion-9010

Hugging Face
TEXT GENERATIONConcurrency Cost:2Model Size:32BQuant:FP8Ctx Length:32kArchitecture:Transformer0.0K Warm

huihui-ai/DeepSeek-R1-Distill-Qwen-Coder-32B-Fusion-9010 is a 32 billion parameter mixed model based on the Qwen2.5 architecture, created by huihui-ai. This experimental fusion combines DeepSeek-R1-Distill-Qwen-32B and Qwen2.5-Coder-32B-Instruct to enhance programming and code-related thinking abilities. It is designed for applications requiring robust code generation and understanding, leveraging the strengths of its constituent models.

Loading preview...

Model Overview

The DeepSeek-R1-Distill-Qwen-Coder-32B-Fusion-9010 is an experimental 32 billion parameter model developed by huihui-ai. It is a mixed model, combining two Qwen-based models: huihui-ai/DeepSeek-R1-Distill-Qwen-32B-abliterated (90%) and huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated (10%). This fusion aims to improve thinking capabilities specifically in programming and code-related tasks.

Key Characteristics

  • Architecture: Based on the Qwen2.5 architecture.
  • Parameter Count: 32 billion parameters.
  • Composition: A 9:1 blend of a general-purpose Qwen-based model and a code-focused Qwen2.5-Coder-Instruct model.
  • Stability: Despite being an experimental mix, the model is reported to be usable without generating gibberish.

Intended Use Cases

This model is particularly suited for:

  • Code Generation: Assisting with writing and generating programming code.
  • Code Understanding: Tasks requiring comprehension of code logic and structure.
  • Programming Assistance: General support for developers in various coding scenarios.
  • Experimental Applications: Users interested in exploring the performance of fused models for specialized tasks.