Name: DavidAU/Llama3.3-8B-Instruct-Thinking-Claude-4.5-Opus-High-Reasoning API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: DavidAU

Model Overview

DavidAU/Llama3.3-8B-Instruct-Thinking-Claude-4.5-Opus-High-Reasoning is an 8 billion parameter language model based on the Llama 3.3 architecture, featuring an extended 128K context window. This model was fine-tuned by DavidAU using Unsloth and a specialized Claude 4.5-Opus High Reasoning dataset, resulting in a unique Instruct/Thinking hybrid. The tuning focused on enhancing its reasoning capabilities rather than updating its core knowledge base.

Key Capabilities

Hybrid Functionality: Operates as both an instruction-following model and a 'thinking' model.
Automatic Reasoning Activation: Certain phrases or words in prompts automatically trigger a detailed internal 'thinking' process, as demonstrated by its ability to break down complex requests like explaining orbital mechanics with mathematical derivations.
Extended Context Window: Supports a 128K token context, allowing for processing and generating longer, more intricate responses.
High Reasoning: Fine-tuned with a dataset specifically designed to impart high reasoning abilities, making it suitable for complex analytical and explanatory tasks.

Good For

Complex Explanations: Ideal for generating detailed, structured explanations, especially in scientific or technical domains requiring mathematical examples and step-by-step reasoning.
Creative Writing with Deep Thought: Can be prompted for creative tasks (e.g., science fiction stories) where it can apply its 'thinking' process to develop intricate plots and themes.
Analytical Tasks: Suitable for use cases demanding a model that can process information deeply and provide well-reasoned outputs.

Usage Notes

Suggested Settings: Recommended parameters include a temperature of 0.7, repetition penalty of 1.05, top_p of 0.95, min_p of 0.05, and top_k of 40.
Context Window: While the minimum context window is 4K, 8K or higher is suggested for optimal performance.
No System Prompt: The model is designed to self-generate 'thinking tags' without an explicit system prompt.
Quantization: Q4KS (non-imatrix) or IQ3_M (imatrix) or higher quantizations are recommended to avoid reasoning or activation issues.

Overview

Model Overview

Key Capabilities

Good For

Usage Notes

Full Model Card (README)