reaperdoesntknow/DistilQwen3-1.7B-uncensored

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kPublished:Mar 25, 2026Architecture:Transformer Warm

reaperdoesntknow/DistilQwen3-1.7B-uncensored is a 1.7 billion parameter model from the DistilQwen3 series by Convergent Intelligence LLC: Research Division. This model is part of a proof-weighted distillation chain built on Discrepancy Calculus, a measure-theoretic framework designed to improve structural understanding. It is specifically fine-tuned for instruction following, structured output, and legal reasoning, leveraging a 30B-parameter Qwen3 teacher model.

Loading preview...

Model Overview

reaperdoesntknow/DistilQwen3-1.7B-uncensored is a 1.7 billion parameter model developed by Convergent Intelligence LLC: Research Division, forming part of their DistilQwen3 series. This model is a product of a sophisticated distillation process rooted in Discrepancy Calculus (DISC), a measure-theoretic framework. DISC aims to decompose a teacher's output distribution to quantify local structural mismatches, which standard KL divergence might overlook. The underlying theory, "On the Formal Analysis of Discrepancy Calculus" (Colca, 2026), emphasizes structural understanding over surface-level pattern matching.

Key Capabilities & Methodology

  • Proof-Weighted Distillation: The model utilizes a unique proof-weighted knowledge distillation method, combining 55% cross-entropy with decaying proof weights (2.5x to 1.5x) and 45% KL divergence at T=2.0. This approach amplifies loss on reasoning-critical tokens, compelling the student model to prioritize structural understanding.
  • Teacher Model: Distilled from a powerful 30B-parameter Qwen3-30B-A3B (Instruct) teacher, ensuring high-quality knowledge transfer.
  • Hardware & Precision: Unlike other models in the broader Convergent Intelligence catalog, the DistilQwen series was trained on H100 GPUs at BF16 precision, indicating a focus on leveraging premium compute for enhanced performance.

Use Cases

This model is particularly well-suited for tasks requiring:

  • Instruction Following: Excelling at adhering to complex instructions.
  • Structured Output: Generating responses in specific, predefined formats.
  • Legal Reasoning: Demonstrating capabilities in tasks involving legal analysis and inference.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p