pankajmathur/model_009
pankajmathur/model_009 is a 69 billion parameter Llama2-based causal language model developed by Pankaj Mathur and fine-tuned on Orca-style datasets. It is designed for instruction following and achieves an average score of 65.03 on the HuggingFaceH4 Open LLM Leaderboard benchmarks. The model targets general instruction-following tasks and offers a context length of 32768 tokens.
Overview
pankajmathur/model_009 is a 69 billion parameter Llama2-based model developed by Pankaj Mathur, specifically fine-tuned using Orca-style datasets. This approach aims to enhance the model's instruction-following capabilities, making it proficient in responding to a wide range of user prompts.
Performance & Evaluation
The model has been evaluated with the EleutherAI Language Model Evaluation Harness, with results reported on the HuggingFaceH4 Open LLM Leaderboard. It achieved an average score of 65.03 across the following benchmarks:
- ARC: 71.59
- HellaSwag: 87.71
- MMLU: 69.43
- TruthfulQA: 60.72
- Winogrande: 82.32
- GSM8k: 39.42
- DROP: 44.01
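The reported average can be reproduced directly from the per-benchmark scores above:

```python
# Sanity-check the reported leaderboard average (65.03) from the
# per-benchmark scores listed in this card.
scores = {
    "ARC": 71.59,
    "HellaSwag": 87.71,
    "MMLU": 69.43,
    "TruthfulQA": 60.72,
    "Winogrande": 82.32,
    "GSM8k": 39.42,
    "DROP": 44.01,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 65.03
```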
Usage & System Requirements
This model requires significant GPU VRAM: up to 45 GB even in 4-bit quantization. It can be loaded on a single high-end GPU (e.g., RTX 6000, L40, A40, A100, H100) or on dual consumer-grade GPUs (e.g., RTX 4090, L4, A10, RTX 3090, RTX A5000). Prompts follow a format with explicit ### System:, ### User:, and ### Assistant: sections, and the model can be used with the transformers library, including 4-bit loading and text generation.
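A minimal sketch of the workflow described above: building a prompt in the card's documented ### System/### User/### Assistant format, then loading the model in 4-bit with transformers and bitsandbytes. The `build_prompt` helper and the generation parameters are illustrative, not part of the official card; the model load is gated behind a flag because it needs substantial GPU VRAM.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a prompt in the format this card documents."""
    return f"### System:\n{system}\n\n### User:\n{user}\n\n### Assistant:\n"

prompt = build_prompt(
    "You are a helpful assistant.",
    "Explain what a causal language model is in one sentence.",
)

# Flip this on a machine with enough GPU VRAM (~45 GB in 4-bit quantization).
RUN_MODEL = False
if RUN_MODEL:
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained("pankajmathur/model_009")
    model = AutoModelForCausalLM.from_pretrained(
        "pankajmathur/model_009",
        quantization_config=bnb_config,
        device_map="auto",  # spread layers across available GPUs
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```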
Limitations
While designed for accuracy, the model may occasionally produce inaccurate or misleading information. It may also generate inappropriate, biased, or offensive content, despite efforts to refine the training data. Users are advised to exercise caution and cross-verify generated information.