pankajmathur/model_51

  • Task: text generation
  • Model size: 69B
  • Quantization: FP8
  • Context length: 32k
  • Concurrency cost: 4
  • Published: Aug 3, 2023
  • License: llama2
  • Architecture: Transformer

model_51 by Pankaj Mathur is a 69-billion-parameter, Llama2-based causal language model fine-tuned on Orca-style datasets for instruction following. It achieves an average score of 64.88 on the Open LLM Leaderboard benchmarks, including 69.31 on MMLU and 86.71 on HellaSwag. The model is optimized for general instruction-following tasks and can be deployed in 4-bit on systems with roughly 45GB of GPU VRAM.

Overview

pankajmathur/model_51 is a 69-billion-parameter language model built on the Llama2 architecture. It has been fine-tuned on Orca-style datasets, which typically train the model on complex explanation traces to strengthen its instruction-following capabilities. The model supports a context length of 32768 tokens.
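Below is a minimal loading sketch using Hugging Face transformers with bitsandbytes NF4 quantization, which keeps a model of this size near the 45GB VRAM figure quoted on this card. It assumes the checkpoint is hosted on the Hub under the pankajmathur/model_51 repo id; it is not an official snippet from the model author.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "pankajmathur/model_51"  # assumed Hub repo id

# NF4 4-bit quantization keeps the 69B weights within the ~45GB
# VRAM budget mentioned on this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # shard layers across all visible GPUs
)
```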

Key Capabilities & Performance

Evaluated against the HuggingFaceH4 Open LLM Leaderboard metrics, model_51 demonstrates solid performance across various tasks:

  • Total Average Score: 64.88
  • MMLU (5-shot): 69.31
  • HellaSwag (10-shot): 86.71
  • ARC (25-shot): 68.43
  • TruthfulQA (0-shot): 57.18
  • Winogrande (5-shot): 81.77
  • GSM8K (5-shot): 32.37
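These figures come from EleutherAI's lm-evaluation-harness, the tool behind the HuggingFaceH4 leaderboard. The sketch below shows one way to reproduce a single score locally; task names, harness version, and prompt handling may differ from the leaderboard's exact configuration, so treat it as an approximation rather than the official evaluation recipe.

```python
import lm_eval

# Evaluate HellaSwag 10-shot, mirroring the leaderboard setting above.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=pankajmathur/model_51,dtype=bfloat16",
    tasks=["hellaswag"],
    num_fewshot=10,
)
print(results["results"]["hellaswag"])
```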

Usage and Limitations

The model requires significant GPU resources: up to 45GB of VRAM for 4-bit loading, which puts it within reach of a single high-end GPU or a pair of consumer GPUs. It expects a specific prompt format for best instruction adherence. As with any large language model, outputs may contain inaccuracies, biases, or inappropriate content, so users should cross-check important information.
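This card does not reproduce the prompt template itself. The Orca-style system/user/assistant layout below is an assumption modeled on similar Orca fine-tunes, reusing the model and tokenizer from the loading sketch above; consult the upstream model card for the authoritative format.

```python
# Assumed Orca-style template; verify against the upstream model card.
system = "You are an AI assistant that follows instructions extremely well."
user = "Summarize the difference between supervised and unsupervised learning."

prompt = f"### System:\n{system}\n\n### User:\n{user}\n\n### Assistant:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```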

Good for

  • General instruction-following applications.
  • Tasks requiring a Llama2-based model with Orca-style fine-tuning.
  • Developers with access to substantial GPU memory for deployment.