Cogito v2-preview-llama-70B: Hybrid Reasoning LLM
The deepcogito/cogito-v2-preview-llama-70B is a 70 billion parameter instruction-tuned generative language model from DeepCogito, designed for advanced reasoning and general-purpose applications. This model stands out as a hybrid reasoning model, capable of operating in a standard LLM mode or an enhanced self-reflection mode, where it 'thinks' before generating a response. This capability is enabled by its training using Iterated Distillation and Amplification (IDA), an alignment strategy focused on iterative self-improvement.
Key Capabilities & Optimizations
- Hybrid Reasoning: Seamlessly switches between direct response and a self-reflective 'thinking' mode for improved accuracy and coherence.
- Enhanced Performance: Outperforms size-equivalent models on common industry benchmarks in both standard and reasoning modes.
- Multilingual Support: Trained on data spanning over 30 languages, providing strong multilingual capabilities.
- Specialized Strengths: Optimized for:
  - Coding tasks
  - STEM (Science, Technology, Engineering, Mathematics) problems
  - Complex instruction following
  - General helpfulness
- Tool Calling: Supports single, parallel, and multiple tool calls in both standard and extended thinking modes, facilitating integration with external functions (see the sketch after this list).
- Extended Context: Features a substantial context length of 128k tokens.
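As a rough illustration of the tool-calling workflow, the sketch below uses the standard Hugging Face chat-template convention of passing Python functions via apply_chat_template(tools=...). The get_weather function and its schema are hypothetical examples for demonstration only, and the exact template behavior should be checked against the model card.

```python
from transformers import AutoTokenizer

model_id = "deepcogito/cogito-v2-preview-llama-70B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Hypothetical tool: a simple weather lookup. The chat template turns the
# function signature and docstring into a tool schema the model can see.
def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"Sunny in {city}"

messages = [
    {"role": "user", "content": "What is the weather in Paris right now?"}
]

# Render a prompt that advertises the tool; the model may then emit a
# structured tool call, which the application executes and feeds back
# as a "tool" role message before generating the final answer.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)
```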
Usage & Differentiation
Developers can enable the model's extended thinking mode by setting enable_thinking=True in the tokenizer's chat template or by including a specific system prompt and prefilling the response with <think>\n. This unique hybrid approach allows for more robust and reliable outputs, particularly for tasks requiring deeper analysis or problem-solving. The model's strong performance across coding, STEM, and multilingual benchmarks, combined with its advanced reasoning and tool-calling features, positions it as a versatile choice for a wide range of demanding applications.
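Below is a minimal sketch of enabling the extended thinking mode through the chat template, assuming the standard transformers generation API; the example prompt and generation settings are illustrative rather than prescribed by the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepcogito/cogito-v2-preview-llama-70B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "How many prime numbers are there below 50?"}
]

# enable_thinking=True switches the chat template into self-reflection mode,
# so the model reasons inside a <think> block before producing its answer.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(input_ids, max_new_tokens=1024)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Setting enable_thinking=False (or omitting it) keeps the model in standard direct-response mode, which is typically faster and sufficient for simpler requests.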