codelion/Llama-3.3-70B-o1

Public · 70B params · FP8 · 32768-token context · License: apache-2.0

Llama-3.3-70B-o1 Thinker Model Overview

codelion/Llama-3.3-70B-o1 is a 70-billion-parameter language model developed by codelion and fine-tuned from unsloth/llama-3.3-70b-instruct-bnb-4bit. It is specialized for Chain-of-Thought (CoT) reasoning and is designed to make its thought process explicit.

Key Capabilities & Features

  • CoT Reasoning: Generates detailed 'thinking' traces enclosed within <|begin_of_thought|> and <|end_of_thought|> tags, followed by the final answer in <|begin_of_solution|> and <|end_of_solution|> tags.
  • Enhanced Problem Solving: This explicit reasoning process makes it particularly effective for tasks requiring step-by-step analysis and complex problem-solving.
  • Performance: Achieves a score of 46.7 on the AIME 2024 pass@1 benchmark, outperforming the base Llama-3.3-70B model (30.0) and Sky-T1-32B-Preview (43.3).
  • Training Efficiency: Fine-tuned using QLoRA with Unsloth and Hugging Face's TRL library, enabling 2x faster training.
  • Context Length: Supports a 32768-token context window; note that the detailed thought trace can make outputs long, so budget for large generations.
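Because the model wraps its reasoning and answer in the tags described above, downstream code typically needs to separate the two. A minimal parsing sketch (the tag names come from the model card; the helper itself is illustrative, not part of the model's tooling):

```python
import re

def split_reasoning(output: str) -> tuple[str, str]:
    """Split a model response into its thought trace and final solution.

    The tag names are those documented for this model; this helper is
    an illustrative sketch, not an official utility.
    """
    thought_match = re.search(
        r"<\|begin_of_thought\|>(.*?)<\|end_of_thought\|>", output, re.DOTALL
    )
    solution_match = re.search(
        r"<\|begin_of_solution\|>(.*?)<\|end_of_solution\|>", output, re.DOTALL
    )
    thought = thought_match.group(1).strip() if thought_match else ""
    solution = solution_match.group(1).strip() if solution_match else ""
    return thought, solution

# Hypothetical model output used only to exercise the parser.
sample = (
    "<|begin_of_thought|>2 + 2 equals 4 by basic addition.<|end_of_thought|>"
    "<|begin_of_solution|>4<|end_of_solution|>"
)
thought, solution = split_reasoning(sample)
```

Falling back to empty strings when a tag pair is missing keeps the helper safe for truncated generations, where the closing solution tag may never appear.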

When to Use This Model

  • Complex Analytical Tasks: Ideal for applications where understanding the reasoning steps is as crucial as the final answer.
  • Debugging & Transparency: Useful for scenarios where model transparency and explainability are important, allowing developers to trace how the model arrived at a solution.
  • Benchmarking: When evaluating, ensure max_tokens is set sufficiently high (e.g., 8192) to capture the full output, including the thought trace and solution.
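For example, a request against an OpenAI-compatible server hosting this model (e.g. via vLLM) might reserve a generous completion budget while staying well inside the 32768-token context; the payload below is a sketch, and the endpoint and prompt are placeholders:

```python
import json

# Hypothetical chat-completions payload for a server hosting
# codelion/Llama-3.3-70B-o1; this would be POSTed to an
# OpenAI-compatible endpoint such as /v1/chat/completions.
payload = {
    "model": "codelion/Llama-3.3-70B-o1",
    "messages": [
        {"role": "user", "content": "How many primes are there below 100?"}
    ],
    # Reserve room for the full thought trace plus the solution,
    # while staying well under the 32768-token context window.
    "max_tokens": 8192,
}
body = json.dumps(payload)
```

If the budget is too small, generation can stop mid-thought and the closing solution tags never appear, which silently breaks pass@1 scoring.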