unsloth/Llama-3.1-8B

Warm
Public
8B
FP8
32768
1
Feb 15, 2025
License: llama3.1
Hugging Face
Overview

Overview

unsloth/Llama-3.1-8B is an 8 billion parameter instruction-tuned model from Meta's Llama 3.1 collection, optimized by Unsloth for efficient fine-tuning. It is built on an auto-regressive transformer architecture, utilizing Grouped-Query Attention (GQA) for improved inference scalability. The model supports a substantial 128K context length and is designed for multilingual dialogue use cases, outperforming many open-source and closed chat models on common industry benchmarks.

Key Capabilities

  • Multilingual Support: Optimized for English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with potential for other languages through fine-tuning.
  • Instruction Following: Fine-tuned using Supervised Fine-Tuning (SFT) and Reinforcement Learning with Human Feedback (RLHF) for alignment with human preferences.
  • Tool Use: Supports multiple tool use formats and integrates with Transformers chat templates for function calling.
  • Efficient Fine-tuning: Unsloth's optimizations enable 2.4x faster fine-tuning and 58% less memory usage compared to standard methods.
  • Performance: Demonstrates strong performance across various benchmarks, including MMLU (69.4%), HumanEval (72.6% pass@1), and MATH (51.9% final_em).

Good For

  • Developing assistant-like chat applications requiring multilingual capabilities.
  • Natural language generation tasks where a large context window is beneficial.
  • Researchers and developers looking to fine-tune Llama 3.1 models efficiently with reduced computational resources.
  • Applications requiring robust instruction following and tool integration.