Model Overview
pankajmathur/orca_mini_v8_0_70b is a 70-billion-parameter instruction-tuned model built on the Llama-3.3-70B-Instruct architecture. It was trained on diverse Supervised Fine-Tuning (SFT) datasets with the aim of being a versatile general-purpose model. A key feature is its support for a substantial 32,768-token context window.
Key Capabilities & Features
- Foundation Model: Designed to be a robust base for subsequent fine-tuning (e.g., DPO, PPO, ORPO) and model merging.
- Llama 3.3 Prompt Format: Utilizes the Llama 3.3 prompt format, including system, user, and assistant roles.
- Tool Use: Supports advanced tool use and function calling, compatible with Transformers chat templating, allowing integration with external functions.
- Quantization Support: Efficiently deployable in various quantization formats, including 4-bit (approx. 39 GB VRAM) and 8-bit (approx. 69 GB VRAM) using bitsandbytes, for a reduced memory footprint.
- Multilingual Support: While primarily English, the Llama 3.3 base model supports 7 additional languages (French, German, Hindi, Italian, Portuguese, Spanish, Thai), though developers are advised to implement further fine-tuning and system controls before deploying in unsupported languages.
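To make the prompt-format bullet concrete, here is a minimal sketch of how a system/user/assistant conversation is laid out with the Llama 3 family's special tokens. This is an illustrative hand-rolled formatter, not the recommended path: in practice the tokenizer's `apply_chat_template` method produces this layout for you.

```python
# Illustrative sketch of the Llama 3.3 chat layout using the Llama 3
# family's special tokens. In real use, prefer
# tokenizer.apply_chat_template(...) so the exact template ships with
# the model rather than being hard-coded.

def format_llama3_prompt(messages):
    """Render a list of {role, content} dicts in the Llama 3.3 chat format."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Open an assistant header so the model generates the reply next.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama3_prompt([
    {"role": "system", "content": "You are Orca Mini, a helpful AI assistant."},
    {"role": "user", "content": "Hello Orca Mini, what can you do for me?"},
])
print(prompt)
```

With Transformers, the equivalent is `AutoTokenizer.from_pretrained("pankajmathur/orca_mini_v8_0_70b")` followed by `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`; the same method also accepts a `tools=` argument of function schemas, which is how the tool-use support mentioned above plugs in.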
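The quantization figures above follow from simple arithmetic on the weights: parameters times bits per weight. The sketch below is a back-of-envelope estimate only; the card's numbers are somewhat higher because real deployments also hold quantization metadata, activations, and the KV cache in VRAM.

```python
# Back-of-envelope VRAM estimate for quantized model weights:
# n_params x (bits / 8) bytes. Illustrative only -- real usage adds
# overhead for quantization metadata, activations, and the KV cache,
# which is why the card cites ~39 GB (4-bit) and ~69 GB (8-bit).

def estimate_weight_vram_gb(n_params: float, bits: int) -> float:
    """Approximate memory for the model weights alone, in gigabytes."""
    return n_params * bits / 8 / 1e9

for bits in (4, 8):
    gb = estimate_weight_vram_gb(70e9, bits)
    print(f"{bits}-bit weights for 70B params: ~{gb:.0f} GB")
# 4-bit -> ~35 GB of weights (card: ~39 GB with overhead)
# 8-bit -> ~70 GB of weights (card: ~69 GB)
```

To actually load the model this way with Transformers, the standard route is passing a `BitsAndBytesConfig` (e.g. `BitsAndBytesConfig(load_in_4bit=True)`) as `quantization_config` to `AutoModelForCausalLM.from_pretrained`, typically with `device_map="auto"`.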
Responsible AI & Safety
Meta's Llama 3.3, the base model, emphasizes responsible deployment with a three-pronged strategy focusing on developer enablement, protection against adversarial use, and community safeguards. It includes safety fine-tuning, refusal handling, and tone guidelines. Developers are encouraged to integrate system safeguards like Llama Guard 3, Prompt Guard, and Code Shield for robust AI systems. Evaluations cover common use cases, specific capabilities, and include extensive red teaming against critical risks such as CBRNE, child safety, and cyber attack enablement.