allura-forge/Llama-3.3-8B-Instruct

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:Dec 30, 2025License:llama3.3Architecture:Transformer0.2K Warm

allura-forge/Llama-3.3-8B-Instruct is an 8 billion parameter instruction-tuned causal language model, identified as a version of Meta's Llama 3.3. This model was extracted from Meta's Llama API and shows improved performance over Llama 3.1 8B Instruct on benchmarks like IFEval and GPQA Diamond. It is suitable for general instruction-following tasks, with a variant available that extends its context length to 128k tokens.

Loading preview...

Llama 3.3 8B Instruct Overview

This model, allura-forge/Llama-3.3-8B-Instruct, is presented as an official 8 billion parameter instruction-tuned version of Meta's Llama 3.3. It was uniquely obtained by extracting the base model from Meta's Llama API's finetuning service, which initially offered a Llama 3.3 8B variant not publicly available elsewhere.

Key Capabilities & Performance

  • Improved Instruction Following: Benchmarks indicate superior performance compared to Llama 3.1 8B Instruct.
    • IFEval (instruction following): 81.95 (vs 78.2 for Llama 3.1 8B Instruct)
    • GPQA Diamond: 37.0 (vs 29.3 for Llama 3.1 8B Instruct)
  • Context Length: The original downloaded version has an 8k context length, but a variant configured for 128k context (using Llama 3.3 70B's RoPE config) shows slightly better benchmark results.
  • Authenticity: Despite some internal metadata pointing to Llama 3, the model exhibits distinct stylistic and knowledge differences from Llama 3 and 3.1, aligning with the Llama API's Llama 3.3 8B offering.

Good For

  • Developers seeking an instruction-tuned model with improved performance over Llama 3.1 8B Instruct.
  • Experimentation with a potentially official, yet uniquely sourced, Llama 3.3 8B model.
  • General instruction-following applications where the enhanced benchmark scores are beneficial.