reaperdoesntknow/DiStil-Qwen3-1.7B-uncensored

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kPublished:Mar 28, 2026Architecture:Transformer0.0K Warm

DiStil-Qwen3-1.7B-uncensored is a 1.7 billion parameter Qwen3ForCausalLM model developed by Convergent Intelligence LLC: Research Division. It is a distillation of Qwen3, fine-tuned with uncensored SFT data to remove alignment-imposed refusal behaviors while retaining reasoning and generation capabilities. This model features a 40,960 token context length and is designed to respond directly to prompts without filtering through safety heuristics, making it suitable for technical, analytical, and research queries.

Loading preview...

DiStil-Qwen3-1.7B-uncensored: Alignment-Free Distillation

This model, developed by Convergent Intelligence LLC: Research Division, is a 1.7 billion parameter Qwen3ForCausalLM variant. It's a distilled version of Qwen3, specifically fine-tuned using uncensored Supervised Fine-Tuning (SFT) data. The primary goal of this distillation is to eliminate refusal behaviors often imposed by alignment training, ensuring the model responds directly to prompts without filtering through safety heuristics.

Key Capabilities & Features

  • Uncensored Responses: Designed to provide direct answers to prompts, including technical, analytical, and research queries, without refusal patterns.
  • Preserved Capabilities: Maintains the base Qwen3 model's reasoning and generation abilities.
  • Architecture: Based on Qwen3ForCausalLM with approximately 2.03 billion parameters (1.7B effective) and a substantial context length of 40,960 tokens.
  • Training Method: Utilizes TRL for supervised fine-tuning on uncensored instruction data, preserving the original Qwen3 architecture and tokenizer.
  • Discrepancy Calculus (DISC): This model is part of a distillation chain leveraging Discrepancy Calculus, a measure-theoretic framework for analyzing and transferring output distributions, as detailed in "On the Formal Analysis of Discrepancy Calculus" (Colca, 2026).

When to Use This Model

  • Research & Technical Queries: Ideal for applications requiring unfiltered, direct responses to complex technical or analytical questions.
  • Development of Downstream Models: Serves as a base model in a distillation chain, with further refinements like Disctil-Qwen3-1.7B available.
  • Exploration of Alignment-Free LLMs: Suitable for users interested in models that prioritize raw capability transfer over imposed safety alignments.