Overview
DiStil-Qwen3-1.7B-uncensored is a 1.7 billion effective-parameter model from Convergent Intelligence LLC: Research Division, built on the Qwen3ForCausalLM architecture. It offers a 40,960-token context length, 28 layers, and a hidden size of 2048. This model is one link in a distillation chain; a subsequent refinement is available as DiStil-Qwen3-1.7B.
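The dimensions quoted above can be expressed as a plain config dict. This is only a sketch: the key names follow the usual Hugging Face Qwen3 config conventions and are assumptions here, not an official config dump for this model.

```python
# Architecture facts from the card, written as a Hugging Face-style config
# dict. Key names (max_position_embeddings, num_hidden_layers, hidden_size)
# follow the common Qwen3ForCausalLM convention and are assumed, not quoted.
config = {
    "architectures": ["Qwen3ForCausalLM"],
    "max_position_embeddings": 40_960,  # 40,960-token context length
    "num_hidden_layers": 28,            # 28 transformer layers
    "hidden_size": 2048,                # model (embedding) dimension
}
```

Keeping these figures in one place makes it easy to sanity-check a downloaded checkpoint's `config.json` against the card.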
Key Capabilities
- Alignment-Free Responses: The model is specifically fine-tuned using uncensored instruction data to remove refusal behaviors often imposed by alignment training, aiming to respond directly to prompts without filtering through safety heuristics.
- Preserved Reasoning: Despite the uncensored distillation, it retains the core reasoning and generation capabilities of the base Qwen3 model.
- Pure SFT Intervention: Training involved Supervised Fine-Tuning (SFT) using TRL, without any architectural modifications, focusing solely on shifting the model's response distribution.
- Discrepancy Calculus Foundation: This model is part of a series developed using Discrepancy Calculus, a measure-theoretic framework for analyzing and transferring output distributions, as detailed in the "Structure Over Scale" methodology.
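The pure-SFT intervention described above amounts to supervised next-token training on instruction/response pairs, with the loss computed only over response tokens. As a minimal illustration (pure Python with a hypothetical 4-token vocabulary — not the TRL implementation), the masked SFT loss looks like this:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sft_loss(logits_per_step, targets, response_mask):
    """Masked cross-entropy: mean of -log p(target) over response tokens only.

    logits_per_step: one logit vector per position (toy stand-in for a model)
    targets: gold next-token ids
    response_mask: 1 for response tokens (trained on), 0 for prompt tokens
    """
    losses = []
    for logits, tgt, masked_in in zip(logits_per_step, targets, response_mask):
        if masked_in:
            probs = softmax(logits)
            losses.append(-math.log(probs[tgt]))
    return sum(losses) / len(losses)

# Toy sequence: 3 positions, 4-token vocab; only the last two positions are
# response tokens, so the prompt position contributes nothing to the loss.
logits = [[2.0, 0.1, 0.0, 0.0], [0.0, 3.0, 0.0, 0.0], [0.0, 0.0, 2.5, 0.0]]
targets = [0, 1, 2]
mask = [0, 1, 1]
loss = sft_loss(logits, targets, mask)
```

Because no architecture changes are involved, the intervention only moves where probability mass lands over continuations — exactly the "shifting the response distribution" framing used above.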
Good For
- Use cases requiring direct, unfiltered responses to technical, analytical, or research queries.
- Developers interested in exploring models with reduced alignment-imposed refusal behaviors.
- Applications where preserving the base model's raw generation capabilities is prioritized over strict safety alignments.