Model Overview
pankajmathur/orca_mini_v9_3_70B is a 70-billion-parameter instruction-tuned model developed by pankajmathur. Built on the Llama-3.3-70B-Instruct base model, it was fine-tuned on a variety of supervised fine-tuning (SFT) datasets. The model is designed as a comprehensive general-purpose AI assistant and offers a 32,768-token context window.
Key Characteristics
- Base Model: Built on the Llama-3.3-70B-Instruct architecture.
- Parameter Count: 70 billion parameters, providing strong generative capabilities.
- Context Length: Supports a 32,768-token context window, enabling longer inputs and more coherent, extended responses.
- Customization Ready: Explicitly positioned as a foundation for further work, encouraging developers to apply additional fine-tuning (e.g., DPO, PPO, ORPO) or model merges to tailor it to specific applications; see the sketch after this list.
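As an illustration of that customization path, the sketch below sets up direct preference optimization (DPO) with the TRL library. The dataset name, output directory, and hyperparameters are placeholders, exact DPOTrainer arguments vary across TRL versions, and full-parameter DPO at this scale requires substantial multi-GPU resources.

```python
# Minimal DPO sketch using TRL. Dataset and hyperparameters are placeholders;
# verify argument names against the installed TRL version.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "pankajmathur/orca_mini_v9_3_70B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# DPO expects a preference dataset with "prompt", "chosen", and "rejected" columns.
train_dataset = load_dataset("my_org/my_preference_dataset", split="train")  # hypothetical dataset

args = DPOConfig(output_dir="orca_mini_v9_3_dpo", per_device_train_batch_size=1)
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # named `tokenizer=` in older TRL releases
)
trainer.train()
```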
Usage and Deployment
The model works with standard text-generation pipelines and can be deployed at several precisions to manage VRAM requirements: approximately 133 GB in the default bfloat16 precision, about 69 GB with 8-bit quantization, or about 39 GB with 4-bit quantization via the bitsandbytes library (see the sketches below). The model follows the Llama 3 prompt format, ensuring compatibility with existing Llama 3-based workflows.
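A minimal generation sketch with Hugging Face Transformers is shown below; the sample messages are illustrative, and `device_map="auto"` assumes enough combined GPU memory (or CPU offload) for the bfloat16 weights. `apply_chat_template` takes care of the Llama 3 prompt format.

```python
# Load in bfloat16 (~133 GB VRAM) and generate with the Llama 3 chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "pankajmathur/orca_mini_v9_3_70B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # spread layers across available GPUs
)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Summarize the idea behind instruction tuning."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

To fit the smaller VRAM budgets quoted above, the same model can instead be loaded through a bitsandbytes quantization config; only the loading step changes:

```python
# 4-bit loading (~39 GB VRAM); set load_in_8bit=True instead for ~69 GB.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "pankajmathur/orca_mini_v9_3_70B",
    quantization_config=bnb_config,
    device_map="auto",
)
```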
Responsible AI Considerations
Because the base model is Llama 3.3, it inherits Meta's responsible AI practices, including safety fine-tuning, data quality controls, and mitigations for critical risks such as CBRNE misuse, child-safety harms, and cyber-attack enablement. Developers are still encouraged to implement additional safeguards and to consult Meta's Responsible Use Guide for safe deployment.