Name: snowman0919/Qwopus3.6-27B-v2-heretic API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: snowman0919

Model Overview

snowman0919/Qwopus3.6-27B-v2-heretic is a 27 billion parameter dense language model, a decensored variant of Jackrong/Qwopus3.6-27B-v2, based on Alibaba Cloud's Qwen3.6-27B. It is specifically fine-tuned using a novel Trace Inversion methodology to reconstruct detailed, step-by-step reasoning pathways from commercial LLM outputs, addressing the "Reasoning Bubbles" dilemma where intermediate logical steps are often compressed or hidden.

Key Capabilities & Features

Reasoning Enhancement: Utilizes Trace Inversion datasets (e.g., claude-opus-4.6/4.7-traceInversion) to create Learnable Chain-of-Thought (CoT) traces, improving logical continuity and eliminating reasoning fractures.
Efficiency: Demonstrates significant reasoning efficiency, requiring 35.9% fewer tokens on average for correct answers compared to the base Qwen3.6-27B, and achieving a 16.6% increase in correct answers per 10,000 output tokens.
Performance: Achieves 87.43% accuracy on a selected MMLU-Pro subset (vs. 84.86% for base model) and 75.25% on SWE-bench Verified (controlled-202 slice).
Multimodal & Agentic Support: Natively supports vision and tool-use capabilities, making it suitable for agentic workflows, deep logic reasoning, and creative coding tasks like Web Design and WebGL Canvas generation.
Curriculum Learning: Trained with a three-stage curriculum to progressively scale reasoning quality and context length up to 32K tokens, ensuring format stability and handling complex multi-turn dialogues.

What Makes This Model Different?

This model's primary differentiator is its Trace Inversion technique, which reverse-engineers compressed "Reasoning Bubbles" from powerful commercial models into explicit, learnable Chain-of-Thought (CoT) traces. This approach allows the model to learn continuous, rigorous logical derivations, rather than just mimicking high-level summaries, leading to more robust and efficient reasoning. It also offers a decensored output compared to its original version.

Should I use this for my use case?

Use for: Applications requiring strong, explicit logical reasoning, complex problem-solving, agentic task execution, and code generation (especially for web design and creative coding). Its enhanced reasoning efficiency can lead to lower token costs for accurate outputs. The decensored nature might be suitable for research or specific creative applications where content filtering is not desired.
Consider alternatives if: Your primary need is raw inference throughput, as the dense 27B model has lower throughput than MoE architectures (e.g., Qwopus 3.6 35B-A3B MoE). Also, note that this is an experimental community release and has not undergone complete safety evaluations.

Overview

Model Overview

Key Capabilities & Features

What Makes This Model Different?

Should I use this for my use case?

Full Model Card (README)