DiStil-Qwen3-1.7B-uncensored: Alignment-Free Distillation

This model, developed by Convergent Intelligence LLC: Research Division, is a 1.7 billion parameter Qwen3ForCausalLM variant. It's a distilled version of Qwen3, specifically fine-tuned using uncensored Supervised Fine-Tuning (SFT) data. The primary goal of this distillation is to eliminate refusal behaviors often imposed by alignment training, ensuring the model responds directly to prompts without filtering through safety heuristics.

Key Capabilities & Features

Uncensored Responses: Designed to provide direct answers to prompts, including technical, analytical, and research queries, without refusal patterns.
Preserved Capabilities: Maintains the base Qwen3 model's reasoning and generation abilities.
Architecture: Based on Qwen3ForCausalLM with approximately 2.03 billion parameters (1.7B effective) and a substantial context length of 40,960 tokens.
Training Method: Utilizes TRL for supervised fine-tuning on uncensored instruction data, preserving the original Qwen3 architecture and tokenizer.
Discrepancy Calculus (DISC): This model is part of a distillation chain leveraging Discrepancy Calculus, a measure-theoretic framework for analyzing and transferring output distributions, as detailed in "On the Formal Analysis of Discrepancy Calculus" (Colca, 2026).

When to Use This Model

Research & Technical Queries: Ideal for applications requiring unfiltered, direct responses to complex technical or analytical questions.
Development of Downstream Models: Serves as a base model in a distillation chain, with further refinements like Disctil-Qwen3-1.7B available.
Exploration of Alignment-Free LLMs: Suitable for users interested in models that prioritize raw capability transfer over imposed safety alignments.

Overview

DiStil-Qwen3-1.7B-uncensored: Alignment-Free Distillation

Key Capabilities & Features

When to Use This Model

Full Model Card (README)