NekoPunchBBB/Llama-2-13b-hf_Open-Platypus-8bit-att
NekoPunchBBB/Llama-2-13b-hf_Open-Platypus-8bit-att is a 13 billion parameter language model based on the Llama 2 architecture and fine-tuned on the Open-Platypus dataset. It shows generalist capability across a variety of benchmarks, achieving an average score of 46.97 on the Open LLM Leaderboard. With a context length of 4096 tokens, the model is suitable for a range of common natural language processing tasks.
Model Overview
NekoPunchBBB/Llama-2-13b-hf_Open-Platypus-8bit-att is a 13 billion parameter large language model built upon the Llama 2 foundation. It has been further fine-tuned using the Open-Platypus dataset, aiming to enhance its general performance across diverse tasks. The model operates with a context window of 4096 tokens.
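Since the repository follows the standard Llama 2 layout, it can be loaded with the Hugging Face transformers API. The sketch below is a minimal loading example, assuming the transformers, accelerate, and bitsandbytes packages are installed; the 8-bit quantization setting reflects the "8bit" tag in the model name, but the exact configuration is an assumption rather than something documented on this card.

```python
# Minimal loading sketch (repo id taken from this model card; quantization
# and device settings are illustrative assumptions).
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "NekoPunchBBB/Llama-2-13b-hf_Open-Platypus-8bit-att"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # 8-bit weights to reduce memory footprint
    device_map="auto",  # let accelerate place layers on available GPUs/CPU
)
```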
Performance Highlights
Evaluated on the Hugging Face Open LLM Leaderboard, this model achieved an overall average score of 46.97. Key benchmark results include:
- ARC (25-shot): 57.51
- HellaSwag (10-shot): 82.14
- MMLU (5-shot): 54.56
- TruthfulQA (0-shot): 42.21
- Winogrande (5-shot): 76.56
While it demonstrates solid performance in areas like HellaSwag and Winogrande, its lower scores on more complex reasoning tasks such as GSM8K (9.55) and DROP (6.26) indicate room for further specialization or improvement.
Use Cases
This model is a suitable candidate for applications requiring general-purpose language understanding and generation, particularly where a 13 billion parameter model fits the available computational budget. Its balanced performance across benchmarks suggests utility in tasks like text summarization, question answering, and content generation, especially when further fine-tuned for domain-specific knowledge.
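For a quick test of such tasks, the snippet below continues from the loading sketch above. The Alpaca-style "### Instruction / ### Response" prompt format and the decoding parameters are assumptions commonly used with Open-Platypus fine-tunes, not settings documented by this model card.

```python
# Hedged generation example; prompt template and sampling settings are assumptions.
prompt = (
    "### Instruction:\n"
    "Summarize the main trade-offs of 8-bit quantization for large language models.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```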