dhmeltzer/llama-7b-SFT_ds_eli5_1024_r_64_alpha_16_merged

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4K · Published: Aug 25, 2023 · Architecture: Transformer

The dhmeltzer/llama-7b-SFT_ds_eli5_1024_r_64_alpha_16_merged model is a 7-billion-parameter Llama-based language model, supervised fine-tuned (SFT) for instruction following. It achieves an average score of 43.25 on the Open LLM Leaderboard, demonstrating capabilities across a range of reasoning and commonsense tasks. It is suitable for general-purpose natural language understanding and generation applications where a 7B-parameter model is appropriate.


Model Overview

dhmeltzer/llama-7b-SFT_ds_eli5_1024_r_64_alpha_16_merged is a 7-billion-parameter language model built on the Llama architecture. It has been fine-tuned for instruction following, making it capable of understanding and responding to a variety of prompts. On the Open LLM Leaderboard it achieved an average score of 43.25.
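A minimal usage sketch with the Hugging Face `transformers` library, assuming the checkpoint exposes the standard Llama causal-LM interface. The prompt template and generation settings are illustrative assumptions, since the card does not publish the exact SFT formatting:

```python
MODEL_ID = "dhmeltzer/llama-7b-SFT_ds_eli5_1024_r_64_alpha_16_merged"

def build_prompt(question: str) -> str:
    # Hypothetical ELI5-style prompt template; adjust to match the
    # formatting actually used during fine-tuning.
    return f"Question: {question}\n\nAnswer (explain like I'm five):"

def generate_answer(question: str, max_new_tokens: int = 256) -> str:
    # Downloads the ~7B checkpoint on first call; requires a GPU (or
    # plenty of RAM) and the `transformers` + `torch` packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Calling `generate_answer("Why is the sky blue?")` would then produce a completion in the model's instruction-following style.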

Key Performance Metrics

Evaluations on the Open LLM Leaderboard highlight its capabilities across several benchmarks:

  • ARC (25-shot): 53.41
  • HellaSwag (10-shot): 77.9
  • MMLU (5-shot): 43.56
  • TruthfulQA (0-shot): 40.81
  • Winogrande (5-shot): 74.59
  • GSM8K (5-shot): 5.08
  • DROP (3-shot): 7.37

These scores indicate proficiency in commonsense reasoning, reading comprehension, and general knowledge tasks, while also showing clear room for improvement in mathematical reasoning (GSM8K) and discrete reasoning over paragraphs (DROP).
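As a quick sanity check, the reported 43.25 average is simply the mean of the seven benchmark scores listed above:

```python
# Open LLM Leaderboard scores from the benchmark list above.
scores = {
    "ARC": 53.41,
    "HellaSwag": 77.9,
    "MMLU": 43.56,
    "TruthfulQA": 40.81,
    "Winogrande": 74.59,
    "GSM8K": 5.08,
    "DROP": 7.37,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 43.25
```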

Use Cases

This model is well-suited for applications requiring a balance of performance and computational efficiency, such as:

  • General text generation and summarization
  • Instruction-following chatbots
  • Question answering based on common knowledge
  • Prototyping and development where a 7B parameter model is a good fit