Model Overview
pybbb/Llama-3.1-8B-Instruct-anti-dpo-sizhe is an 8 billion parameter instruction-tuned model from Meta's Llama 3.1 collection, released on July 23, 2024. This model is built on an optimized transformer architecture, utilizing Grouped-Query Attention (GQA) for enhanced inference scalability. It was trained on over 15 trillion tokens of publicly available data, with a knowledge cutoff of December 2023, and fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
Key Capabilities
- Multilingual Support: Optimized for dialogue in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with potential for fine-tuning in other languages.
- Extended Context Window: Features a substantial 128k token context length, enabling processing of longer inputs and generating more coherent, extended responses.
- Enhanced Performance: Demonstrates improved scores over Llama 3 8B Instruct on various benchmarks, including MMLU (69.4%), HumanEval (72.6% pass@1), and MATH (51.9% final_em).
- Tool Use: Supports multiple tool use formats, with significant improvements in API-Bank (82.6%) and BFCL (76.1%) benchmarks, facilitating integration into agentic systems.
- Code Generation: Excels in code-related tasks, as evidenced by its HumanEval pass@1 score of 72.6%.
Intended Use Cases
This model is designed for commercial and research applications, particularly for assistant-like chat and natural language generation tasks. Its capabilities make it suitable for:
- Multilingual Chatbots: Creating interactive agents that can converse effectively across supported languages.
- Code Assistants: Aiding developers in code generation and understanding.
- Tool-Augmented Systems: Building sophisticated AI applications that leverage external tools and APIs.
- Synthetic Data Generation: Generating data to improve other models.
Developers are encouraged to implement additional safety guardrails and refer to Meta's Responsible Use Guide for safe deployment.