The recursal/RWKV6QwQ-32B-final-250307 is a 32 billion parameter RWKV-variant language model developed by recursal, based on the Qwen 2.5 QwQ 32B architecture. This model leverages linear attention to significantly reduce computational costs and improve inference efficiency, particularly for long context lengths. It demonstrates competitive performance across various benchmarks, including ARC Challenge and Winogrande, making it suitable for general language understanding and generation tasks where cost-effective inference is critical.
## Model Overview
The recursal/RWKV6QwQ-32B-final-250307 is a 32 billion parameter language model developed by recursal, built on an RWKV-variant architecture. It is derived from Qwen 2.5 QwQ 32B by converting the attention layers to a more efficient linear attention mechanism, without pre-training or retraining from scratch. This approach aims to drastically reduce computational costs and enable faster inference, especially for applications requiring large context lengths.
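The cost argument behind the conversion can be made concrete with a back-of-the-envelope comparison (an illustrative sketch, not the model's actual FLOP accounting): softmax attention makes each new token attend to all previous tokens, so total work over a sequence is quadratic, while a linear-attention step does a fixed amount of work per token.

```python
# Toy cost model (illustrative only): per-token attention work,
# summed over a generation of n_tokens.

def softmax_attention_ops(n_tokens: int) -> int:
    # Token i attends to all i previous tokens: 1 + 2 + ... + n = n(n+1)/2.
    return n_tokens * (n_tokens + 1) // 2

def linear_attention_ops(n_tokens: int) -> int:
    # Each token performs one constant-size state update: n total.
    return n_tokens

for n in (1_000, 32_000):
    ratio = softmax_attention_ops(n) / linear_attention_ops(n)
    # The advantage grows linearly with context length.
    print(f"n={n}: quadratic/linear work ratio = {ratio}")
```

Under this simplified model the relative advantage scales with context length, which is why the savings are most pronounced for long-context workloads.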
## Key Capabilities & Performance
This model inherits its core knowledge from its Qwen parent's training data, supporting approximately 30 languages. It demonstrates strong performance across several benchmarks, often matching or outperforming its base Qwen counterpart:
- ARC Challenge (acc_norm): Achieves 0.5640, slightly surpassing Qwen/QwQ-32B's 0.5563.
- Winogrande (acc): Scores 0.7324, outperforming Qwen/QwQ-32B's 0.7048.
- SCIQ (acc): Matches Qwen/QwQ-32B with 0.9630.
## Unique Differentiator
The primary innovation of this model lies in its linear attention mechanism, which offers a >1000x reduction in inference cost at large context lengths compared to traditional quadratic-attention transformers. This makes it well suited to scenarios demanding cost-effective and scalable AI, particularly long-context processing. The conversion method is detailed in the paper *RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale*, which describes how existing large transformer models can be converted to more efficient RWKV variants.
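To show why linear attention keeps per-token inference cost constant, here is a generic linear-attention recurrence in NumPy: the model maintains a fixed-size state matrix instead of a growing key/value cache. This is a minimal sketch of the general idea only; RWKV-6's actual kernel adds data-dependent decay, token shifting, and other structure not modeled here.

```python
import numpy as np

def linear_attention_step(state: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    # Fold the new key/value pair into a fixed-size (d x d) state.
    # Crucially, the state does not grow with sequence length.
    return state + np.outer(k, v)

def readout(state: np.ndarray, q: np.ndarray) -> np.ndarray:
    # Query the accumulated state; cost is O(d^2) per token, independent of history.
    return q @ state

rng = np.random.default_rng(0)
d = 8                              # toy head dimension
state = np.zeros((d, d))
for _ in range(5):                 # process 5 tokens, one at a time
    k, v, q = rng.normal(size=(3, d))
    state = linear_attention_step(state, k, v)
    out = readout(state, q)

print(state.shape, out.shape)
```

Because the state stays `(d, d)` no matter how many tokens have been processed, memory and per-token compute remain flat, whereas a softmax-attention KV cache grows linearly and per-token compute grows with it.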