Cogito v1 Preview - 32B: Hybrid Reasoning LLM
The deepcogito/cogito-v1-preview-qwen-32B model is a 32.8-billion-parameter instruction-tuned generative language model developed by DeepCogito. It is designed as a hybrid reasoning model, capable of operating in two distinct modes: a standard direct-answering mode and a self-reflection mode that enables deeper reasoning.
Key Capabilities & Features
- Hybrid Reasoning: Can answer directly or engage in self-reflection for complex tasks, enhancing performance on reasoning-intensive benchmarks.
- IDA Training: Utilizes Iterated Distillation and Amplification (IDA), an alignment strategy for iterative self-improvement.
- Optimized Performance: Specifically optimized for coding, STEM subjects, instruction following, and general helpfulness.
- Enhanced Multilingual & Coding: Offers significantly stronger multilingual support, coding capabilities, and tool-calling functionality compared to other models of similar size.
- Extensive Context Window: Supports a context length of 128,000 tokens.
- Tool Calling: Fully supports single, parallel, multiple, and parallel-multiple tool calls in both standard and extended thinking modes (see the sketch after this list).
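
As an illustration of the tool-calling interface, here is a minimal sketch using the standard `tools` argument of the `transformers` chat template; the `get_weather` function is a hypothetical example tool, not part of the model or library:

```python
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """
    Get the current weather for a given city.

    Args:
        city: The name of the city to look up.
    """
    return f"Sunny, 22°C in {city}"  # hypothetical stub implementation

tokenizer = AutoTokenizer.from_pretrained("deepcogito/cogito-v1-preview-qwen-32B")
messages = [{"role": "user", "content": "What is the weather in Paris right now?"}]

# transformers converts the function's signature and docstring into a JSON
# schema and renders it into the prompt via the model's chat template.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    tokenize=False,
)
# `prompt` can now be passed to the model; if the output contains a tool call,
# your own code executes it and appends the result as a "tool" role message.
```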
Differentiated Performance
Cogito v1-preview models consistently outperform size-equivalent counterparts on common industry benchmarks in both direct and reasoning modes. This includes comparisons against Llama, Qwen, DeepSeek's R1, and Qwen's QwQ models, as detailed in the Blog Post.
Enabling Extended Thinking
Users can activate the model's deep thinking subroutine either by including the system prompt `Enable deep thinking subroutine.` or by setting `enable_thinking=True` when applying the chat template via the Hugging Face tokenizer, as shown below.
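
A minimal sketch of both activation paths, assuming a standard `transformers` setup; the sample prompt, dtype/device options, and generation settings are illustrative rather than prescribed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepcogito/cogito-v1-preview-qwen-32B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Method 1: include the trigger phrase as a system prompt.
messages = [
    {"role": "system", "content": "Enable deep thinking subroutine."},
    {"role": "user", "content": "How many prime numbers are there below 100?"},  # illustrative prompt
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Method 2: omit the system prompt and pass enable_thinking=True instead
# (assumption: the chat template maps this flag to the same trigger phrase).
# prompt = tokenizer.apply_chat_template(
#     [{"role": "user", "content": "How many prime numbers are there below 100?"}],
#     tokenize=False,
#     add_generation_prompt=True,
#     enable_thinking=True,
# )

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)  # budget is illustrative
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Because the self-reflection mode emits its reasoning before the final answer, a larger `max_new_tokens` budget is typically needed than in direct mode.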