dhmeltzer/Llama-2-13b-hf-ds_wiki_1024_full_r_64_alpha_16_merged
The dhmeltzer/Llama-2-13b-hf-ds_wiki_1024_full_r_64_alpha_16_merged model is a 13-billion-parameter language model based on the Llama 2 architecture, fine-tuned for general language understanding. It features a 4096-token context length and delivers balanced performance across standard benchmarks, with an average score of 46.33 on the Open LLM Leaderboard. This model is suitable for a range of natural language processing tasks requiring robust comprehension and generation capabilities.
Model Overview
The dhmeltzer/Llama-2-13b-hf-ds_wiki_1024_full_r_64_alpha_16_merged model is a 13-billion-parameter language model built on the Llama 2 architecture. It has been fine-tuned with a focus on general language understanding, as evidenced by its performance across a suite of benchmarks. The model supports a context length of 4096 tokens, allowing it to process moderately long inputs.
Key Capabilities & Performance
On the Open LLM Leaderboard, the model achieves an average score of 46.33. Individual benchmark results:
- ARC (25-shot): 58.45
- HellaSwag (10-shot): 81.97
- MMLU (5-shot): 55.02
- TruthfulQA (0-shot): 35.85
- Winogrande (5-shot): 75.69
- GSM8K (5-shot): 10.69
- DROP (3-shot): 6.63
These scores indicate a solid foundation in common sense reasoning, reading comprehension, and general knowledge tasks, while showing areas for improvement in complex mathematical reasoning (GSM8K) and factual accuracy (TruthfulQA).
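The leaderboard average reported above is simply the arithmetic mean of the seven benchmark scores, which can be checked directly:

```python
# Open LLM Leaderboard scores listed above
scores = {
    "ARC (25-shot)": 58.45,
    "HellaSwag (10-shot)": 81.97,
    "MMLU (5-shot)": 55.02,
    "TruthfulQA (0-shot)": 35.85,
    "Winogrande (5-shot)": 75.69,
    "GSM8K (5-shot)": 10.69,
    "DROP (3-shot)": 6.63,
}

# Mean of the seven benchmarks, rounded to two decimals
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 46.33
```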
Good For
- General-purpose text generation and understanding tasks.
- Applications requiring a balance of reasoning and common sense.
- Use cases where a 13B parameter model with a 4096-token context is sufficient.
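Because the context window caps at 4096 tokens, callers must budget prompt length plus generation length together. A minimal sketch of one way to do this, assuming the prompt has already been tokenized into a list of IDs (the helper name and values here are illustrative, not part of the model's API):

```python
def fit_context(prompt_ids, max_new_tokens, context_len=4096):
    """Trim prompt token IDs so prompt + generated tokens fit the context window.

    Keeps the most recent tokens, which usually matter most for continuation.
    """
    budget = context_len - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    return prompt_ids[-budget:]

# Hypothetical over-long prompt of 5000 token IDs
ids = list(range(5000))
trimmed = fit_context(ids, max_new_tokens=256)
print(len(trimmed))  # 3840 == 4096 - 256
```

A short prompt passes through unchanged, since slicing the last `budget` elements of a shorter list returns the whole list.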