codelion/Llama-3.3-70B-o1
TEXT GENERATION · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Context Length: 32k · License: apache-2.0 · Architecture: Transformer · Open Weights

codelion/Llama-3.3-70B-o1 is a 70 billion parameter Llama-3.3 model fine-tuned by codelion for enhanced reasoning capabilities. The model specializes in generating Chain-of-Thought (CoT) style reasoning traces, emitting an explicit 'thinking' process before the final solution. It is optimized for tasks requiring step-by-step problem-solving, making it suitable for complex analytical queries. The model has a 32,768-token context length and was fine-tuned using QLoRA.


Llama-3.3-70B-o1 Thinker Model Overview

codelion/Llama-3.3-70B-o1 is a 70 billion parameter language model developed by codelion, fine-tuned from unsloth/llama-3.3-70b-instruct-bnb-4bit. Its primary distinction lies in its specialization for Chain-of-Thought (CoT) style reasoning, designed to explicitly show its thought process.

Key Capabilities & Features

  • CoT Reasoning: Generates detailed 'thinking' traces enclosed within <|begin_of_thought|> and <|end_of_thought|> tags, followed by the final answer in <|begin_of_solution|> and <|end_of_solution|> tags.
  • Enhanced Problem Solving: This explicit reasoning process makes it particularly effective for tasks requiring step-by-step analysis and complex problem-solving.
  • Performance: Achieves a score of 46.7 on the AIME 2024 pass@1 benchmark, outperforming the base Llama-3.3-70B model (30.0) and Sky-T1-32B-Preview (43.3).
  • Training Efficiency: Fine-tuned using QLoRA with Unsloth and Hugging Face's TRL library, enabling 2x faster training.
  • Context Length: Supports a substantial context length of 32768 tokens, though users should be prepared for potentially large token generation due to the detailed thought process.
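Because the model wraps its reasoning in the delimiter tags listed above, downstream code usually wants to separate the trace from the answer. A minimal sketch (the tag names are from the model card; the sample output text and the `split_cot` helper are illustrative):

```python
import re

# Hypothetical raw completion from Llama-3.3-70B-o1; only the tag format
# (<|begin_of_thought|> etc.) comes from the model card.
raw = (
    "<|begin_of_thought|>\n"
    "List the primes up to 30 and count them...\n"
    "<|end_of_thought|>\n"
    "<|begin_of_solution|>\n"
    "There are 10 primes between 1 and 30.\n"
    "<|end_of_solution|>"
)

def split_cot(text: str) -> tuple[str, str]:
    """Return (thinking_trace, final_solution) from a tagged completion."""
    thought = re.search(
        r"<\|begin_of_thought\|>(.*?)<\|end_of_thought\|>", text, re.DOTALL
    )
    solution = re.search(
        r"<\|begin_of_solution\|>(.*?)<\|end_of_solution\|>", text, re.DOTALL
    )
    # Fall back to the whole text if the model omitted the solution tags.
    return (
        thought.group(1).strip() if thought else "",
        solution.group(1).strip() if solution else text.strip(),
    )

thinking, answer = split_cot(raw)
print(answer)  # -> There are 10 primes between 1 and 30.
```

The fallback branch matters in practice: if generation is cut off by a low `max_tokens` limit, the closing tags may never appear.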

When to Use This Model

  • Complex Analytical Tasks: Ideal for applications where understanding the reasoning steps is as crucial as the final answer.
  • Debugging & Transparency: Useful for scenarios where model transparency and explainability are important, allowing developers to trace how the model arrived at a solution.
  • Benchmarking: When evaluating, ensure max_tokens is set sufficiently high (e.g., 8192) to capture the full output, including the thought trace and solution.
Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
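These sampler parameters are typically passed alongside the prompt in the request body. The values below are purely illustrative placeholders (the page lists only the parameter names, not the actual top configurations), so tune them for your workload:

```python
# Illustrative sampler settings only -- the parameter names come from the
# page above, but these values are NOT the actual top Featherless configs.
sampler_settings = {
    "temperature": 0.7,          # randomness of token selection
    "top_p": 0.9,                # nucleus sampling cutoff
    "top_k": 40,                 # restrict sampling to the k most likely tokens
    "frequency_penalty": 0.0,    # penalize tokens by how often they appeared
    "presence_penalty": 0.0,     # penalize tokens that appeared at all
    "repetition_penalty": 1.1,   # multiplicative penalty on repeated tokens
    "min_p": 0.05,               # drop tokens below this fraction of the top prob
}

# Merged into a request body the same way as max_tokens:
request_body = {"model": "codelion/Llama-3.3-70B-o1", **sampler_settings}
```

For a reasoning model like this one, lower temperatures tend to keep the thought trace on track; penalties that are too aggressive can truncate long traces by punishing the repeated delimiter tokens.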