pankajmathur/orca_mini_v3_70b
orca_mini_v3_70b is a 69-billion-parameter Llama2-based language model by Pankaj Mathur, fine-tuned on Orca-style datasets to strengthen instruction following. It posts competitive scores on benchmarks such as ARC, HellaSwag, and MMLU, making it suitable for general-purpose conversational AI and reasoning tasks.
orca_mini_v3_70b: An Orca-Style Llama2 Model
The orca_mini_v3_70b is a 69-billion-parameter language model built on the Llama2 architecture and developed by Pankaj Mathur. It has been fine-tuned on "Orca Style datasets," which typically involve progressive learning from complex explanation traces, with the aim of enhancing instruction-following capabilities.
Key Capabilities & Performance
The model is tuned for instruction adherence, and its evaluation on the HuggingFaceH4 Open LLM Leaderboard shows solid performance across a range of tasks:
- ARC: 71.25
- HellaSwag: 87.85
- MMLU: 70.18
- TruthfulQA: 61.27
- Winogrande: 82.72
- GSM8K: 40.86
- DROP: 40.17
Its average across these seven metrics is 64.9, indicating general proficiency in reasoning, common sense, and knowledge-based tasks. For best results, prompts should follow the model's expected ### System:, ### User:, ### Assistant: structure, shown below.
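A minimal example of this prompt layout (the system and user text here are illustrative placeholders, not mandated wording):

```
### System:
You are an AI assistant that follows instructions extremely well. Help as much as you can.

### User:
Tell me about orcas.

### Assistant:
```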
Usage Considerations
The model is bound by the license and usage restrictions of the original Llama-2 model. Loading it directly requires substantial GPU VRAM (up to 45GB even with 4-bit quantization), which limits it to a powerful single GPU or a pair of consumer GPUs. Like all large language models, it may occasionally produce inaccurate or biased content, so its outputs should be treated with caution and cross-verified.
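As a sketch of one way to load the model within that VRAM budget, the snippet below uses Hugging Face transformers with bitsandbytes 4-bit quantization; the generation parameters are illustrative choices, not values taken from the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "pankajmathur/orca_mini_v3_70b"

# 4-bit NF4 quantization keeps the 70B weights within roughly the
# 45GB VRAM figure mentioned above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs
)

# Build a prompt in the ### System / ### User / ### Assistant format.
prompt = (
    "### System:\n"
    "You are an AI assistant that follows instructions extremely well.\n\n"
    "### User:\n"
    "Tell me about orcas.\n\n"
    "### Assistant:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

With `device_map="auto"`, accelerate shards the quantized layers across whatever GPUs are visible, which matches the single-GPU or dual-GPU setups described above.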