TinyPixel/elm-test
TinyPixel/elm-test is a 7 billion parameter language model evaluated on the Open LLM Leaderboard. It provides baseline performance across benchmarks including ARC, HellaSwag, MMLU, and TruthfulQA, achieving an average score of 43.74. It is suitable for general language understanding and generation tasks where a moderate performance profile is acceptable.
TinyPixel/elm-test Model Summary
This model, TinyPixel/elm-test, is a 7 billion parameter language model primarily evaluated on the Hugging Face Open LLM Leaderboard. It serves as a benchmark entry, showcasing its performance across a suite of common language understanding and reasoning tasks.
Key Evaluation Metrics
The model's performance is characterized by the following scores from the Open LLM Leaderboard evaluation:
- Avg. Score: 43.74
- ARC (25-shot): 53.16
- HellaSwag (10-shot): 78.98
- MMLU (5-shot): 47.04
- TruthfulQA (0-shot): 39.51
- Winogrande (5-shot): 74.35
- GSM8K (5-shot): 7.51
- DROP (3-shot): 5.65
These scores indicate reasonable common sense reasoning (HellaSwag, Winogrande) and multiple-choice question answering (ARC, MMLU), while the low GSM8K and DROP results point to weak arithmetic reasoning and extractive reading comprehension. The model's context length is 4096 tokens.
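As a quick sanity check on the numbers above, the reported Avg. score is simply the unweighted mean of the seven per-task scores:

```python
# Leaderboard scores for TinyPixel/elm-test, as listed above.
scores = {
    "ARC (25-shot)": 53.16,
    "HellaSwag (10-shot)": 78.98,
    "MMLU (5-shot)": 47.04,
    "TruthfulQA (0-shot)": 39.51,
    "Winogrande (5-shot)": 74.35,
    "GSM8K (5-shot)": 7.51,
    "DROP (3-shot)": 5.65,
}

# Unweighted mean across all seven tasks.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 43.74, matching the reported Avg. score
```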
Intended Use Cases
Given its evaluation profile, TinyPixel/elm-test is suitable for:
- General-purpose language tasks: Where a foundational understanding of language is required.
- Benchmarking and research: As a reference point for comparing against other models in its parameter class.
- Exploratory development: For applications that can tolerate a moderate level of performance across diverse tasks.
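For benchmarking use, leaderboard-style scores are commonly reproduced locally with EleutherAI's lm-evaluation-harness. The command below is a sketch of one such run (25-shot ARC); the batch size is an assumption, and the exact harness version and task registry may affect the resulting score.

```shell
# Hypothetical local reproduction of the 25-shot ARC evaluation;
# batch size chosen arbitrarily for illustration.
pip install lm-eval
lm_eval --model hf \
    --model_args pretrained=TinyPixel/elm-test \
    --tasks arc_challenge \
    --num_fewshot 25 \
    --batch_size 8
```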