pankajmathur/orca_mini_v3_13b

13B parameters · FP8 · 4096-token context · License: other · Weights on Hugging Face

Model Overview

pankajmathur/orca_mini_v3_13b is a 13-billion-parameter language model built on the Llama 2 architecture. Developed by Pankaj Mathur, it was trained on Orca-style datasets, which emphasize progressive learning from complex explanation traces. This training approach aims to improve the model's ability to follow instructions and to produce helpful, detailed responses.
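A minimal usage sketch with the Hugging Face transformers library (the prompt template shown is an assumption based on common Orca-style formats; confirm the exact template on the model card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pankajmathur/orca_mini_v3_13b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~26 GB of weights in half precision
    device_map="auto",          # requires the accelerate package
)

# Assumed Orca-style prompt format -- verify against the model card.
prompt = (
    "### System:\nYou are a helpful assistant.\n\n"
    "### User:\nExplain gradient descent in two sentences.\n\n"
    "### Assistant:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```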

Key Capabilities & Performance

This model posts solid scores across standard benchmarks, as evaluated with the EleutherAI Language Model Evaluation Harness and reported on the HuggingFaceH4 Open LLM Leaderboard (a reproduction sketch follows the list):

  • ARC (25-shot): 63.14
  • HellaSwag (10-shot): 82.35
  • MMLU (5-shot): 56.52
  • TruthfulQA (0-shot): 51.81
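
These scores can be re-derived with the harness's Python API. A minimal sketch, assuming lm-evaluation-harness v0.4+ (the `simple_evaluate` entry point, its argument names, and the task identifiers should be checked against the installed version; each leaderboard task uses its own few-shot count, so run one call per task):

```python
import lm_eval

# Reproduce the ARC (25-shot) row; swap in "hellaswag" (10-shot),
# "mmlu" (5-shot), or "truthfulqa_mc2" (0-shot) for the other rows.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=pankajmathur/orca_mini_v3_13b,dtype=float16",
    tasks=["arc_challenge"],
    num_fewshot=25,
    batch_size=8,
)
print(results["results"]["arc_challenge"])
```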

With a context window of 4096 tokens, it can handle moderately long prompts and generate detailed outputs. Quantized builds (GGML, GPTQ) are also available, making the model deployable on a wider range of hardware.
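Because the prompt and the generated tokens share the same 4096-token budget, it is worth checking prompt length up front. A tokenizer-only sketch (no model weights needed):

```python
from transformers import AutoTokenizer

MAX_CONTEXT = 4096  # context window stated above
tokenizer = AutoTokenizer.from_pretrained("pankajmathur/orca_mini_v3_13b")

def fits_context(prompt: str, max_new_tokens: int = 256) -> bool:
    """True if the prompt plus the planned generation budget fits the window."""
    n_prompt = len(tokenizer(prompt)["input_ids"])
    return n_prompt + max_new_tokens <= MAX_CONTEXT

print(fits_context("Summarize the following report: ..."))  # True
```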

Good For

  • Instruction Following: Excels at understanding and executing user instructions due to its Orca-style training.
  • General Conversational AI: Suitable for chatbots and virtual assistants that require coherent and informative dialogue.
  • Research and Development: Provides a robust base for further fine-tuning or experimentation with Llama2-based models.

Limitations

Users should be aware that, like all large language models, orca_mini_v3_13b may occasionally produce inaccurate, biased, or offensive content. It is recommended to cross-check critical information and implement appropriate safeguards in applications.