Lyte/Llama-3.1-8B-Instruct-Reasoner-1o1_v0.3
Lyte/Llama-3.1-8B-Instruct-Reasoner-1o1_v0.3 is an 8 billion parameter instruction-tuned causal language model developed by Lyte, based on the Llama-3.1-8B-Instruct architecture. This experimental model is specifically fine-tuned to enhance reasoning capabilities by encouraging more token generation for internal thought processes before providing an answer, including self-correction mechanisms. It features a 32,768 token context length and aims to explore reasoning improvements rather than solely optimizing for benchmark performance.
Overview
Lyte/Llama-3.1-8B-Instruct-Reasoner-1o1_v0.3 is an experimental 8 billion parameter instruction-tuned model developed by Lyte, built upon the Llama-3.1-8B-Instruct architecture. Its primary goal is to explore and improve the model's reasoning process by generating more internal tokens for reflection and self-correction before producing a final response. This approach prioritizes the development of reasoning over achieving state-of-the-art benchmark scores, acknowledging that current benchmarks may not fully capture reasoning abilities.
Key Characteristics
- Enhanced Reasoning Focus: Designed to generate additional tokens for internal reasoning, verification, and self-correction.
- Base Model: Fine-tuned from unsloth/meta-llama-3.1-8b-instruct-bnb-4bit.
- Context Length: Supports a substantial context window of 32,768 tokens.
- Performance Impact: While experimental, benchmarks show improvements on arc_challenge (+7.60% acc), arc_easy (+8.88% acc), and commonsense_qa (+3.27% acc) when using the finetuning system prompt. However, some scores, such as MMLU and GSM-8K, decrease relative to the original Llama-3.1-8B-Instruct, reflecting the model's experimental nature and different optimization target.
- Training Efficiency: Trained 2x faster using Unsloth and Hugging Face's TRL library.
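Since the benchmark gains above depend on using the finetuning system prompt, a prompt in the standard Llama 3.1 chat format is the natural starting point. The sketch below assembles one by hand; note that REASONING_SYSTEM_PROMPT is a placeholder, as the card does not reproduce the actual system prompt used during finetuning.

```python
# Sketch of building a Llama-3.1-style chat prompt with a reasoning-focused
# system prompt. The exact finetuning system prompt is not shown in the card,
# so REASONING_SYSTEM_PROMPT below is a placeholder to be substituted.

REASONING_SYSTEM_PROMPT = (
    "Think step by step inside your reasoning before giving a final answer."
)  # placeholder, not the model's actual finetuning prompt

def format_llama31_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama 3.1 chat format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama31_prompt(REASONING_SYSTEM_PROMPT, "What is 17 * 23?")
print(prompt)
```

In practice you would pass this string (or, more simply, a messages list via the tokenizer's apply_chat_template) to the model for generation; the manual version is shown here only to make the prompt structure explicit.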
When to Use This Model
This model is particularly suited for:
- Research into Reasoning: Ideal for developers and researchers interested in exploring and improving AI's reasoning capabilities and self-correction mechanisms.
- Applications Requiring Deliberation: Use cases where the model's ability to "think through" a problem and potentially correct itself is more valuable than raw speed or benchmark-optimized performance.
- System Prompt Integration: Best utilized with its specific system prompt to leverage its reasoning-focused finetuning.
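Because the model is trained to emit extra internal-reasoning tokens before its answer, downstream applications typically want to separate the deliberation from the final response. The helper below is a hypothetical post-processing sketch: the reasoning delimiters are assumptions for illustration only and should be replaced with whatever markup the model actually produces.

```python
# Hypothetical post-processing sketch: split a generation into its internal
# reasoning and the final answer. The <reasoning>...</reasoning> delimiters
# are assumed for illustration; check the model's real output format.
import re

def split_reasoning(text: str,
                    open_tag: str = "<reasoning>",
                    close_tag: str = "</reasoning>") -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no tags are found."""
    pattern = re.escape(open_tag) + r"(.*?)" + re.escape(close_tag)
    match = re.search(pattern, text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

reasoning, answer = split_reasoning(
    "<reasoning>17 * 23 = 17 * 20 + 17 * 3 = 340 + 51 = 391.</reasoning> 391"
)
```

Keeping the reasoning trace around (e.g. for logging) while showing users only the answer is one way to exploit the model's self-correction behavior without exposing its intermediate tokens.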