reaperdoesntknow/DiStil-Qwen3-1.7B-uncensored

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Mar 28, 2026Architecture:Transformer0.0K Warm

DiStil-Qwen3-1.7B-uncensored is a 1.7 billion parameter Qwen3ForCausalLM model developed by Convergent Intelligence LLC: Research Division. It is a distillation of Qwen3, fine-tuned with uncensored SFT data to remove alignment-imposed refusal behaviors while retaining reasoning and generation capabilities. This model features a 40,960 token context length and is designed to respond directly to prompts without filtering through safety heuristics, making it suitable for technical, analytical, and research queries.

Loading preview...

Model Overview

DiStil-Qwen3-1.7B-uncensored is a 1.7 billion effective parameter model from Convergent Intelligence LLC: Research Division, built on the Qwen3ForCausalLM architecture. It is a distilled version of Qwen3, specifically fine-tuned using uncensored SFT data. The primary goal of this distillation is to eliminate alignment-imposed refusal behaviors, ensuring the model responds directly to prompts without filtering through safety heuristics, which can often misfire on legitimate technical or research queries.

Key Characteristics

  • Architecture: Qwen3ForCausalLM with approximately 2.03 billion parameters (1.7B effective).
  • Context Length: Features a substantial 40,960 token context window.
  • Training: Supervised fine-tuning (SFT) using TRL on uncensored instruction data, preserving the base Qwen3 architecture and tokenizer.
  • Alignment-Free: Designed to provide direct responses by shifting the model's response distribution away from refusal patterns, without architectural modifications.
  • Distillation Chain: This model serves as the base in a distillation chain, with further refinements like Disctil-Qwen3-1.7B built upon it.

Underlying Methodology

The model's development is rooted in Discrepancy Calculus (DISC), a measure-theoretic framework for analyzing and transferring output distributions. This advanced methodology, detailed in "On the Formal Analysis of Discrepancy Calculus" (Colca, 2026) and Structure Over Scale (DOI: 10.57967/hf/8165), quantifies local structural mismatches that standard divergence metrics might overlook.