DavidAU/Llama-3.3-8B-Instruct-Thinking-Claude-Haiku-4.5-High-Reasoning-1700x

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Jan 5, 2026 · License: apache-2.0 · Architecture: Transformer

DavidAU/Llama-3.3-8B-Instruct-Thinking-Claude-Haiku-4.5-High-Reasoning-1700x is an 8 billion parameter Llama 3.3-based instruction-tuned model with an extended 128k context window. It has been fine-tuned on the Claude-Haiku-4.5-high-reasoning-1700x dataset to enhance its reasoning capabilities, enabling it to "think" and produce short, blunt, to-the-point reasoning blocks. The model is optimized for tasks requiring high-level reasoning and problem-solving, and often activates its thinking process automatically for complex prompts.


Model Overview

DavidAU/Llama-3.3-8B-Instruct-Thinking-Claude-Haiku-4.5-High-Reasoning-1700x is an 8 billion parameter Llama 3.3-based model, distinguished by its extended 128k context window and specialized instruction tuning. The model was fine-tuned using the Claude-Haiku-4.5-high-reasoning-1700x dataset, which imbues it with a unique "thinking" capability. This allows the model to generate internal reasoning processes, often producing short, blunt, and direct thought blocks before formulating its final response.

Key Capabilities

  • Enhanced Reasoning: Specifically trained to "think" like Claude-Haiku-4.5, providing structured reasoning for complex prompts.
  • Extended Context: Features a 128k context window, suitable for processing and generating longer texts.
  • Automatic Thinking Activation: Certain keywords and phrases, such as "explain," "come up with a plan to...", or "think deeply," automatically trigger the model's reasoning process.
  • Knowledge Update: The fine-tuning process also updated some of the model's core knowledge and root training.

Good For

  • Complex Problem Solving: Ideal for tasks requiring detailed explanations, planning, or deep analytical thought.
  • Creative Writing with Structure: Can generate structured creative content, such as detailed story plots, by first outlining a plan.
  • Technical Explanations: Excels at breaking down and explaining intricate technical concepts.

Usage Notes

  • The model is designed to activate its thinking process automatically for many prompts, but a specific system prompt can be used to force this behavior.
  • Suggested settings include a temperature of 0.7, repetition penalty of 1.05, top_p of 0.95, min_p of 0.05, and top_k of 40. A minimum context window of 4k is recommended, with 8k+ preferred.
  • For smoother operation, users of KoboldCpp, oobabooga/text-generation-webui, or Silly Tavern may benefit from setting a "Smoothing_factor" to 1.5.
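The suggested sampler settings above can be collected into a single generation config. A minimal sketch using llama-cpp-python is shown below; the model path, prompt, and `n_ctx` value are illustrative assumptions, not part of the card, and only the sampler values themselves come from the suggested settings.

```python
# Suggested sampler settings from the model card, gathered into one dict.
# Key names follow llama.cpp conventions ("repeat_penalty" is its name for
# repetition penalty); adjust names if using a different backend.
SAMPLER_SETTINGS = {
    "temperature": 0.7,
    "repeat_penalty": 1.05,
    "top_p": 0.95,
    "min_p": 0.05,
    "top_k": 40,
}

# Hypothetical usage with llama-cpp-python (model path is an assumption):
# from llama_cpp import Llama
# llm = Llama(model_path="Llama-3.3-8B-Instruct-Thinking.gguf", n_ctx=8192)  # 8k+ preferred
# out = llm("Come up with a plan to refactor a legacy codebase.",
#           max_tokens=512, **SAMPLER_SETTINGS)
```

Passing the settings as a single dict keeps them consistent across calls and makes it easy to experiment with one knob (e.g. temperature) while holding the rest fixed.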