kehanlu/llama-3.2-8B-Instruct

Text Generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Architecture: Transformer · Concurrency cost: 1 · Published: Mar 7, 2025

kehanlu/llama-3.2-8B-Instruct is an 8 billion parameter instruction-tuned causal language model derived from Meta's Llama-3.2-11B-Vision-Instruct. This model has been specifically re-engineered to be a text-only variant, removing the cross-attention layers associated with vision capabilities. It offers a robust foundation for general-purpose text generation and instruction following, maintaining the core linguistic strengths of the Llama 3.2 series with a 32768 token context length.


Overview

kehanlu/llama-3.2-8B-Instruct is an 8 billion parameter, instruction-tuned language model designed for text-only applications. It is a text-only derivative of meta-llama/Llama-3.2-11B-Vision-Instruct, with the vision-related cross-attention layers removed to create a purely linguistic model.

Key Characteristics

  • Text-Only Focus: This model has been re-engineered from a larger multimodal model to concentrate solely on text processing, making it lighter and more efficient for language-based tasks.
  • Architecture: Derived from the Llama 3.2 series, it retains the strong instruction-following capabilities and general language understanding of its lineage.
  • Context Length: Supports a substantial context window of 32768 tokens, enabling it to handle longer prompts and generate more coherent, extended responses.
  • Tokenizer Modification: The tokenizer.chat_template has been modified in this repository to remove the date_string behavior, which appended the current date during template application in the original model.
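The practical effect of the template change is that applying the chat template no longer injects a "Today Date: …" line into the system header. The real template is a Jinja template stored in the tokenizer config; the pure-Python sketch below only approximates the Llama 3-style prompt layout to illustrate the difference, and the exact wording is an assumption for demonstration:

```python
# Illustrative sketch of a Llama 3-style chat layout. The actual template is
# a Jinja template in tokenizer_config.json; this approximation exists only
# to show what removing the date_string behavior changes.
from datetime import date

def build_prompt(messages, date_string=None):
    """Render messages into a Llama 3-style prompt string.

    If date_string is given, a "Today Date:" line is prepended to the system
    message (the behavior this repository's modified template removes).
    """
    parts = ["<|begin_of_text|>"]
    for i, msg in enumerate(messages):
        content = msg["content"]
        if i == 0 and msg["role"] == "system" and date_string:
            content = f"Today Date: {date_string}\n\n{content}"
        parts.append(f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n{content}<|eot_id|>")
    # Cue the assistant to produce the next turn.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

msgs = [{"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}]

with_date = build_prompt(msgs, date_string=date.today().strftime("%d %b %Y"))
without_date = build_prompt(msgs)  # this repository's template behaves like this
```

In practice you would call `tokenizer.apply_chat_template(messages, ...)` from the model's own tokenizer rather than formatting prompts by hand; the sketch only makes the before/after difference concrete.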

Use Cases

This model is well-suited for a wide range of natural language processing tasks where a powerful, instruction-tuned text model with a large context window is beneficial. Potential applications include advanced chatbots, content generation, summarization, and complex instruction following.
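For chatbot applications, even a 32,768-token window eventually fills up, so older turns must be dropped. A minimal sketch of history truncation under that budget, assuming a stand-in `count_tokens` helper (in a real application you would measure with `len(tokenizer.encode(...))` from the model's tokenizer):

```python
# Sketch of fitting chat history into the model's 32,768-token context window.
# count_tokens is a hypothetical stand-in for the real tokenizer; it counts
# whitespace-separated words purely so the example is self-contained.
CTX_LENGTH = 32768

def count_tokens(text):
    return len(text.split())

def fit_history(messages, max_new_tokens=512, ctx=CTX_LENGTH):
    """Keep the most recent messages that fit the context budget.

    The first (system) message is always kept; older turns are dropped
    until the remainder fits alongside the generation budget.
    """
    budget = ctx - max_new_tokens
    system, turns = messages[0], messages[1:]
    kept = []
    used = count_tokens(system["content"])
    # Walk backwards so the newest turns survive truncation.
    for msg in reversed(turns):
        cost = count_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return [system] + kept[::-1]
```

Reserving `max_new_tokens` up front keeps the prompt plus the generated reply inside the window, which avoids mid-response truncation on long conversations.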