Model Overview
pankajmathur/orca_mini_v3_13b is a 13-billion-parameter language model built on the Llama2 architecture. Developed by Pankaj Mathur, it has been trained on Orca-style datasets, which emphasize progressive learning from complex explanation traces. This training methodology aims to improve the model's ability to follow instructions and to provide helpful, detailed responses.
Key Capabilities & Performance
This model demonstrates solid performance across various benchmarks, as evaluated using the EleutherAI Language Model Evaluation Harness and reported on the HuggingFaceH4 Open LLM Leaderboard. Key scores include:
- ARC (25-shot): 63.14
- HellaSwag (10-shot): 82.35
- MMLU (5-shot): 56.52
- TruthfulQA (0-shot): 51.81
With a context length of 4096 tokens, it can handle moderately long prompts and generate comprehensive outputs. Quantized versions (GGML, GPTQ) are also available, making it accessible for deployment on a wider range of hardware.
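As a quick illustration of basic usage, the sketch below loads the full-precision checkpoint with the Hugging Face transformers library and generates a response. The plain-instruction prompt, dtype, and device settings here are assumptions for illustration; consult the model card for the exact system/user prompt template, and use a GGML/GPTQ-compatible loader for the quantized builds.

```python
# Minimal sketch: load the model with transformers and generate a response.
# Assumes a GPU with enough memory for the 13B weights in float16;
# the quantized GGML/GPTQ builds require different loaders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pankajmathur/orca_mini_v3_13b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to reduce memory use
    device_map="auto",           # spread layers across available devices
)

# Plain instruction prompt; see the model card for its exact
# system/user/assistant template.
prompt = "Explain the difference between a list and a tuple in Python."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,   # cap the length of the generated answer
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```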
Good For
- Instruction Following: Excels at understanding and executing user instructions due to its Orca-style training.
- General Conversational AI: Suitable for chatbots and virtual assistants that require coherent and informative dialogue.
- Research and Development: Provides a robust base for further fine-tuning or experimentation with Llama2-based models (see the fine-tuning sketch after this list).
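To make the fine-tuning point concrete, the following sketch attaches LoRA adapters to the checkpoint with the peft library, a common parameter-efficient approach for Llama2-based models. The rank, alpha, dropout, and target modules are illustrative assumptions, not values documented for this model.

```python
# Illustrative sketch: attach LoRA adapters with the peft library for
# parameter-efficient fine-tuning. Hyperparameters and target modules are
# assumptions chosen for illustration, not values from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "pankajmathur/orca_mini_v3_13b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in Llama2 blocks
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable

# From here, the wrapped model can be passed to a standard training loop
# or Trainer on an instruction-tuning dataset.
```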
Limitations
Users should be aware that, like all large language models, orca_mini_v3_13b may occasionally produce inaccurate, biased, or offensive content. It is recommended to cross-check critical information and implement appropriate safeguards in applications.