itsliupeng/llama2_70b_mmlu
The itsliupeng/llama2_70b_mmlu model is a 69-billion-parameter variant of Llama-2-70b-hf, continuously trained by itsliupeng on the mmlu_recall dataset. It is specifically optimized to raise MMLU scores while maintaining performance on other metrics. The model is designed for tasks requiring strong multi-task language understanding and reasoning, with a context length of 32768 tokens.
Overview
The itsliupeng/llama2_70b_mmlu model is a 69-billion-parameter language model derived from the Llama-2-70b-hf architecture. Developed by itsliupeng, it was continuously trained on the mmlu_recall dataset. The primary objective of this training regimen is to significantly improve performance on the Massive Multitask Language Understanding (MMLU) benchmark without degrading capabilities on other evaluation metrics.
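The card does not include loading instructions, so the following is a minimal sketch using the Hugging Face transformers library; the repository id comes from this page, while the dtype and device settings are assumptions suited to a multi-GPU setup, not values published by the author.

```python
# Minimal loading sketch (dtype/device settings are assumptions, not from the card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "itsliupeng/llama2_70b_mmlu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the 70B weights tractable
    device_map="auto",          # shard layers across available GPUs
)
```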
Key Capabilities & Performance
This model demonstrates strong performance across standard benchmarks, as reported on the Open LLM Leaderboard. Its targeted optimization for MMLU makes it particularly adept at complex reasoning and knowledge-intensive tasks; a sketch for reproducing these scores follows the list below.
- Average Score: 68.24 on the Open LLM Leaderboard.
- MMLU (5-Shot): Scores 71.89, highlighting its enhanced multi-task language understanding.
- AI2 Reasoning Challenge (25-Shot): Performs well with a score of 65.61.
- HellaSwag (10-Shot): Demonstrates strong common-sense reasoning with 87.37.
- Winogrande (5-Shot): Achieves 82.40, indicating proficiency in resolving pronoun ambiguity.
- GSM8k (5-Shot): Scores 52.99 on mathematical reasoning tasks.
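These figures come from the Open LLM Leaderboard, which is driven by EleutherAI's lm-evaluation-harness. The sketch below shows one plausible way to reproduce the MMLU number locally; the `mmlu` task name and the `simple_evaluate` API are assumptions based on recent harness versions, and the leaderboard pins specific harness revisions, so exact scores may differ slightly.

```python
# Hedged sketch: reproducing the 5-shot MMLU score with lm-evaluation-harness.
# Assumes a recent (v0.4+) harness; the leaderboard used a pinned revision,
# so results may not match to the decimal.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=itsliupeng/llama2_70b_mmlu,dtype=float16",
    tasks=["mmlu"],   # the 57-subject MMLU suite
    num_fewshot=5,    # matches the leaderboard's 5-shot setting
    batch_size=4,
)
print(results["results"])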
When to Use This Model
This model is particularly well-suited for applications requiring robust language understanding and reasoning, especially where MMLU performance is a critical factor. Its continuous training on the mmlu_recall dataset makes it a strong candidate for tasks that benefit from improved accuracy in diverse knowledge domains and problem-solving scenarios.
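As a concrete starting point, here is a minimal inference sketch using the transformers text-generation pipeline; the prompt and generation settings are illustrative assumptions, not values from the model card.

```python
# Minimal inference sketch (illustrative prompt and settings, not from the card).
import torch
from transformers import pipeline

generate = pipeline(
    "text-generation",
    model="itsliupeng/llama2_70b_mmlu",
    torch_dtype=torch.float16,  # half precision for the 70B weights
    device_map="auto",          # shard across available GPUs
)

prompt = "Question: Which organelle is the site of oxidative phosphorylation?\nAnswer:"
output = generate(prompt, max_new_tokens=64, do_sample=False)
print(output[0]["generated_text"])
```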