saidutta69/Qwen2.5-Coder-3B-Instruct-heretic

TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kPublished:Mar 25, 2026License:qwen-researchArchitecture:Transformer0.0K Cold

The saidutta69/Qwen2.5-Coder-3B-Instruct-heretic is a 3.09 billion parameter instruction-tuned causal language model based on the Qwen2.5-Coder architecture, developed by Qwen. This specific version is a decensored variant of the original Qwen2.5-Coder-3B-Instruct, created using the Heretic v1.2.0 tool. It excels in code generation, code reasoning, and code fixing, maintaining a 32,768-token context length. Its primary differentiation lies in its reduced refusal rate compared to the original model, making it suitable for coding tasks requiring less content moderation.

Loading preview...

Overview

This model, saidutta69/Qwen2.5-Coder-3B-Instruct-heretic, is a 3.09 billion parameter instruction-tuned causal language model derived from the Qwen2.5-Coder series by Qwen. It is specifically a decensored version of the original Qwen/Qwen2.5-Coder-3B-Instruct, created using the Heretic v1.2.0 tool. The base Qwen2.5-Coder models are designed for code-specific tasks, building upon the strong Qwen2.5 foundation with 5.5 trillion training tokens, including extensive source code and synthetic data.

Key Capabilities

  • Enhanced Code Generation, Reasoning, and Fixing: Significant improvements over previous CodeQwen versions.
  • Long Context Window: Supports a full 32,768 tokens, beneficial for complex coding projects.
  • Reduced Refusals: This 'heretic' variant shows a dramatically lower refusal rate (3/100) compared to the original model (100/100), indicating less content moderation.
  • Foundation for Code Agents: Designed to support real-world applications like Code Agents, while maintaining strengths in mathematics and general competencies.

Good For

  • Developers seeking a powerful, smaller-scale code-focused LLM with a long context window.
  • Applications requiring code generation, debugging, or reasoning without strict content filtering.
  • Experimentation with instruction-tuned models for programming tasks where the original model's refusal rate was a limitation.