dhmeltzer/Llama-2-7b-hf-eli5-cleaned-1024_qlora_merged

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Sep 11, 2023 · Architecture: Transformer

dhmeltzer/Llama-2-7b-hf-eli5-cleaned-1024_qlora_merged is a 7 billion parameter Llama-2-based model developed by dhmeltzer. It was fine-tuned with QLoRA and merged back into the base weights, and it achieves an average score of 44.13 on the Open LLM Leaderboard benchmarks. It is suitable for general language understanding tasks, particularly those requiring a balance of reasoning and common sense.


Model Overview

dhmeltzer/Llama-2-7b-hf-eli5-cleaned-1024_qlora_merged is a 7 billion parameter language model built upon the Llama-2 architecture. It has been fine-tuned using the QLoRA method and subsequently merged, indicating an optimization for efficient deployment while retaining performance.

Key Capabilities & Performance

This model's performance is evaluated on the Open LLM Leaderboard, achieving an average score of 44.13. Specific benchmark results include:

  • ARC (25-shot): 53.67
  • HellaSwag (10-shot): 78.21
  • MMLU (5-shot): 45.90
  • TruthfulQA (0-shot): 46.13
  • Winogrande (5-shot): 73.80
  • GSM8K (5-shot): 4.70
  • DROP (3-shot): 6.53
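The reported average is the unweighted mean of the seven task scores above; a quick sanity check:

```python
# Verify the reported Open LLM Leaderboard average from the per-task scores
# listed on this model card.
scores = {
    "ARC (25-shot)": 53.67,
    "HellaSwag (10-shot)": 78.21,
    "MMLU (5-shot)": 45.90,
    "TruthfulQA (0-shot)": 46.13,
    "Winogrande (5-shot)": 73.80,
    "GSM8K (5-shot)": 4.70,
    "DROP (3-shot)": 6.53,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 44.13
```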

These scores suggest a balanced capability across common sense reasoning, general knowledge, and question answering, with relative strength on HellaSwag and Winogrande and clear weakness on the math-heavy GSM8K and DROP benchmarks.

Use Cases

This model is well-suited for applications requiring a moderately sized language model with general understanding capabilities. Its performance profile indicates potential for tasks such as:

  • Text generation and summarization
  • Question answering based on common knowledge
  • Reasoning tasks where high accuracy on complex mathematical problems (its GSM8K score is only 4.70) is not a requirement
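Because the adapter weights are already merged, the model can be loaded like any standard Llama-2 checkpoint. Below is a minimal sketch using the Hugging Face transformers pipeline API; the model ID comes from this card, but the question-style prompt template is an assumption, since the card does not document a required format.

```python
# Sketch of querying the merged checkpoint with the transformers pipeline API.
# The plain "Question:/Answer:" template is hypothetical -- adjust it to
# whatever format your fine-tuning data actually used.
MODEL_ID = "dhmeltzer/Llama-2-7b-hf-eli5-cleaned-1024_qlora_merged"


def build_prompt(question: str) -> str:
    """Format an ELI5-style question as a plain text prompt (assumed template)."""
    return f"Question: {question}\nAnswer:"


if __name__ == "__main__":
    # Imported here so the prompt helper can be reused without transformers
    # installed; loading a 7B model needs roughly 14 GB of memory in fp16.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID, device_map="auto")
    out = generator(build_prompt("Why is the sky blue?"), max_new_tokens=128)
    print(out[0]["generated_text"])
```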