MuXodious/Qwen3.5-4B-ARA-heresy-v2
MuXodious/Qwen3.5-4B-ARA-heresy-v2 is a 4.5-billion-parameter Qwen3.5 fine-tune developed by MuXodious and modified with P-E-W's Heretic ablation engine using ARA (Arbitrary-Rank Ablation). This multimodal model is listed with a 32,768-token context length and is tuned for fewer refusals, making it suitable for agentic applications and complex reasoning tasks.
Model Overview
MuXodious/Qwen3.5-4B-ARA-heresy-v2 is a 4.5-billion-parameter fine-tuned version of the Qwen3.5 model, developed by MuXodious. It uses P-E-W's Heretic ablation engine with the ARA (Arbitrary-Rank Ablation) method to modify model behavior. The card reports that refusals dropped from 104 for the original model to 6 after ablation, with a KL divergence of 0.0708 relative to the original model, which indicates that the output distribution stays close to the original.
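To put the two reported numbers in context, the following is a minimal sketch of how a refusal reduction and a KL divergence can be computed. It is not Heretic's evaluation code: the helper names `refusal_reduction` and `kl_divergence` are hypothetical, and the toy probability vectors are illustrative values, not outputs of either model.

```python
import math

def refusal_reduction(before: int, after: int) -> float:
    """Fraction of the original refusals that no longer occur."""
    return (before - after) / before

def kl_divergence(p: list[float], q: list[float]) -> float:
    """KL(P || Q) in nats for two discrete distributions over the same outcomes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Refusal counts as stated on the card: 104 before, 6 after.
reduction = refusal_reduction(before=104, after=6)
print(f"refusal reduction: {reduction:.1%}")

# Toy next-token distributions (assumed values for illustration only).
original = [0.70, 0.20, 0.10]
modified = [0.65, 0.23, 0.12]
print(f"toy KL divergence: {kl_divergence(original, modified):.4f}")
```

A KL divergence of zero would mean the two distributions are identical, so a small value alongside a large refusal reduction suggests that behavior changed mainly on the prompts that were previously refused.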
Key Capabilities
- Multimodal Learning: Features a unified vision-language foundation with early fusion training, excelling in reasoning, coding, agent tasks, and visual understanding.
- Efficient Hybrid Architecture: Incorporates Gated Delta Networks and sparse Mixture-of-Experts for high-throughput inference.
- Scalable RL Generalization: Achieves robust real-world adaptability through reinforcement learning scaled across million-agent environments.
- Global Linguistic Coverage: Supports 201 languages and dialects, enabling inclusive deployment.
- Extended Context Length: Natively handles up to 262,144 tokens, extensible to 1,010,000 tokens using YaRN scaling.
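The context-length figures above imply a particular YaRN scaling factor, which can be worked out directly. The sketch below only does that arithmetic. The configuration keys (`rope_type`, `factor`, `original_max_position_embeddings`) follow the common Hugging Face `rope_scaling` convention and are an assumption here, since the card does not show the actual configuration for this model.

```python
NATIVE_CONTEXT = 262_144      # native window stated on the card
EXTENDED_CONTEXT = 1_010_000  # extended window stated on the card

def yarn_factor(target: int, native: int) -> float:
    """Ratio by which the position range must be stretched to reach the target."""
    return target / native

def rope_scaling_config(target: int, native: int) -> dict:
    """Build a rope_scaling-style dict (key names are an assumption, see lead-in)."""
    return {
        "rope_type": "yarn",
        "factor": yarn_factor(target, native),
        "original_max_position_embeddings": native,
    }

config = rope_scaling_config(EXTENDED_CONTEXT, NATIVE_CONTEXT)
print(f"scaling factor: {config['factor']:.2f}")
```

The factor comes out slightly below 4, so a deployment that rounds up to a whole-number factor would cover the stated extended window with some headroom.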
Good for
- Agentic Applications: Optimized for tool calling; recommended for use with Qwen-Agent and, for terminal-based AI agents, Qwen Code.
- Complex Reasoning: Demonstrates strong performance in knowledge, STEM, and reasoning benchmarks, including MMLU-Pro (79.1%) and C-Eval (85.1%).
- Multimodal Tasks: Excels in vision-language tasks such as STEM and puzzle-solving (MMMU 77.6%), general VQA, text recognition, and spatial intelligence.
- Long Context Processing: Ideal for applications requiring analysis of ultra-long texts and videos, with support for high frame-rate video sampling.
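For the tool-calling use case listed above, the host application is responsible for executing whatever call the model emits. The following is a minimal sketch of that dispatch step, assuming the model's tool call has already been extracted as a JSON object with `name` and `arguments` fields. The actual wire format depends on the chat template and the agent framework, and `get_weather` is a hypothetical tool that returns canned data.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool: returns a canned string instead of calling a real API."""
    return f"Sunny in {city}"

# Registry mapping tool names the model may emit to local Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_call_json: str) -> str:
    """Parse one tool call and run the matching function with its arguments."""
    call = json.loads(tool_call_json)
    func = TOOLS[call["name"]]  # raises KeyError for an unknown tool
    return func(**call["arguments"])

# Assumed shape of an extracted tool call (illustrative only).
model_output = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
print(dispatch(model_output))
```

In a real agent loop the returned string would be appended to the conversation as a tool message so the model can use it in its next turn. Frameworks such as Qwen-Agent handle this parsing and routing for you.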