Llama-3.3-70B-o1 Thinker Model Overview
codelion/Llama-3.3-70B-o1 is a 70-billion-parameter language model developed by codelion, fine-tuned from unsloth/llama-3.3-70b-instruct-bnb-4bit. Its primary distinction is its specialization in Chain-of-Thought (CoT) reasoning: the model is trained to explicitly show its thought process before presenting a final answer.
Key Capabilities & Features
- CoT Reasoning: Generates detailed 'thinking' traces enclosed within `<|begin_of_thought|>` and `<|end_of_thought|>` tags, followed by the final answer in `<|begin_of_solution|>` and `<|end_of_solution|>` tags.
- Enhanced Problem Solving: This explicit reasoning process makes it particularly effective for tasks requiring step-by-step analysis and complex problem solving.
- Performance: Achieves a score of 46.7 on the AIME 2024 pass@1 benchmark, outperforming the base Llama-3.3-70B model (30.0) and Sky-T1-32B-Preview (43.3).
- Training Efficiency: Fine-tuned using QLoRA with Unsloth and Hugging Face's TRL library, enabling roughly 2x faster training.
- Context Length: Supports a substantial context length of 32768 tokens, though users should be prepared for potentially large token generation due to the detailed thought process.
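Because the model wraps its reasoning and answer in the tags listed above, downstream code typically needs to separate the two. A minimal sketch of such a parser (the function name `parse_cot_output` is a hypothetical helper, not part of the model's tooling):

```python
import re

def parse_cot_output(text: str) -> dict:
    """Split a model response into its thought trace and final solution.

    Returns a dict with 'thought' and 'solution' keys; a value is None
    when the corresponding tag pair is missing (e.g. truncated output).
    """
    def extract(begin: str, end: str):
        match = re.search(re.escape(begin) + r"(.*?)" + re.escape(end),
                          text, re.DOTALL)
        return match.group(1).strip() if match else None

    return {
        "thought": extract("<|begin_of_thought|>", "<|end_of_thought|>"),
        "solution": extract("<|begin_of_solution|>", "<|end_of_solution|>"),
    }
```

Returning `None` for a missing tag pair makes truncation (a real risk given the long thought traces) easy to detect in calling code.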
When to Use This Model
- Complex Analytical Tasks: Ideal for applications where understanding the reasoning steps is as crucial as the final answer.
- Debugging & Transparency: Useful for scenarios where model transparency and explainability are important, allowing developers to trace how the model arrived at a solution.
- Benchmarking: When evaluating, ensure `max_tokens` is set sufficiently high (e.g., 8192) to capture the full output, including the thought trace and solution.
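One practical consequence of the long thought traces: a response can hit the token limit mid-reasoning, leaving no solution block at all. A simple guard is to check for the closing solution tag before trusting the output (the function name `is_complete_response` is a hypothetical helper for illustration):

```python
def is_complete_response(text: str) -> bool:
    """Heuristic check that generation was not cut off by the token limit:
    a complete response should close its solution block."""
    return "<|end_of_solution|>" in text

# If this returns False, retry with a larger max_tokens budget
# rather than parsing a partial answer.
```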