Model Overview
darthcrawl/Qwen2.5-14B-Instruct-heretic is a 14.8-billion-parameter instruction-tuned causal language model based on the Qwen2.5 architecture. It is a decensored variant of the original Qwen/Qwen2.5-14B-Instruct, produced with Heretic v1.2.0. Its primary differentiator is a drastic reduction in refusals, from 98/100 on the original model to just 3/100, achieved with a KL divergence of only 0.2743 from the original model's output distribution, indicating minimal collateral change to overall behavior.
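For readers unfamiliar with the drift metric above, KL divergence measures how far one probability distribution strays from another; a small value means the decensored model's token probabilities stay close to the original's. The sketch below computes it over two made-up next-token distributions purely to illustrate the formula (the numbers are not taken from Heretic's evaluation):

```python
import math

# D(P || Q) = sum_i p_i * log(p_i / q_i) -- the drift metric Heretic
# reports (0.2743 for this model). The distributions below are
# hypothetical next-token probabilities, for illustration only.
def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.70, 0.20, 0.10]  # hypothetical original-model distribution
q = [0.60, 0.25, 0.15]  # hypothetical decensored-model distribution
print(round(kl_divergence(p, q), 4))  # → 0.0227
```

A value of 0 would mean the two models behave identically; larger values indicate heavier modification of the model's behavior.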
Key Capabilities & Features
- Decensored Output: Significantly reduced refusal rates compared to the base model, enabling broader content generation.
- Enhanced Instruction Following: Improved ability to adhere to complex instructions and system prompts, beneficial for role-play and chatbot applications.
- Long Context Support: Features a native context window of 32,768 tokens, extendable to 131,072 tokens (128K) using YaRN for processing very long texts.
- Multilingual Proficiency: Supports over 29 languages, including major global languages like Chinese, English, French, Spanish, German, and Japanese.
- Structured Data Handling: Excels at understanding structured data (e.g., tables) and generating structured outputs, particularly JSON.
- Improved Core Abilities: Builds upon Qwen2 with enhanced capabilities in coding, mathematics, and general knowledge.
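The instruction-following and system-prompt support listed above rely on the ChatML turn format used by Qwen2.5-family models. The hand-rolled sketch below shows the layout of that format; in practice you would call `tokenizer.apply_chat_template(...)` from the transformers library rather than building the string yourself:

```python
# Sketch of the ChatML turn format used by Qwen2.5-family models.
# Prefer tokenizer.apply_chat_template(...) in real code; this
# hand-rolled version only illustrates the prompt layout.
def build_chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # the model generates from here
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Summarize the plot of Hamlet in one sentence.",
)
print(prompt)
```

The system turn is where role-play personas and behavioral instructions go; the trailing `<|im_start|>assistant` header cues the model to begin its reply.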
When to Use This Model
This model is particularly well-suited for use cases requiring less restrictive content generation and high instruction adherence, where the base Qwen2.5-14B-Instruct might exhibit excessive refusals. Its strong multilingual support and ability to handle long contexts make it versatile for applications ranging from advanced chatbots and content creation to code generation and data processing in diverse linguistic environments.
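To use contexts beyond the native 32,768 tokens, the upstream Qwen2.5 documentation describes enabling YaRN by adding a `rope_scaling` entry to the model's `config.json` (shown below as documented for the base model; this variant is assumed to behave the same). Because the scaling is static, Qwen recommends enabling it only when long contexts are actually needed, as it can slightly degrade performance on short texts:

```json
{
  ...,
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```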