andrijdavid/macaroni-7b
Macaroni-7B is an experimental 7-billion-parameter language model developed by andrijdavid, created by merging pre-trained Mistral models with fblgit/UNA-TheBeagle-7b-v1. It demonstrates strong general reasoning across benchmarks, with an average score of 74.60 on the Open LLM Leaderboard, and its 4096-token context length makes it suitable for tasks requiring robust language understanding and generation.
Macaroni-7B Overview
Macaroni-7B is a 7-billion-parameter experimental language model developed by andrijdavid. It was constructed by merging pre-trained Mistral language models with fblgit/UNA-TheBeagle-7b-v1, with the aim of combining their respective strengths. The model has a context length of 4096 tokens, making it suitable for moderately long inputs.
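As a Mistral-based causal LM, the model should load like any other checkpoint of that family via Hugging Face transformers. The following is a minimal sketch, assuming the weights are published on the Hub under andrijdavid/macaroni-7b in standard transformers format; the dtype and sampling settings are illustrative choices, not recommendations from the author.

```python
# Minimal loading-and-generation sketch using Hugging Face transformers.
# Assumes "andrijdavid/macaroni-7b" is a standard Mistral-architecture
# causal LM hosted on the Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "andrijdavid/macaroni-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory vs. fp32; a 7B model fits on a 24 GB GPU
    device_map="auto",
)

prompt = "Explain model merging in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,  # stays well within the 4096-token context window
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```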
Key Capabilities & Performance
Evaluated on the Open LLM Leaderboard, Macaroni-7B demonstrates solid performance across a range of benchmarks, achieving an average score of 74.60. Specific benchmark results include:
- AI2 Reasoning Challenge (25-shot): 73.12
- HellaSwag (10-shot): 88.17
- MMLU (5-shot): 64.58
- TruthfulQA (0-shot): 68.76
- Winogrande (5-shot): 84.37
- GSM8k (5-shot): 68.61
These scores indicate proficiency in reasoning, commonsense inference, and general-knowledge tasks.
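Scores in this style are typically produced with EleutherAI's lm-evaluation-harness. The sketch below shows how one of the settings above (10-shot HellaSwag) could be reproduced locally; it assumes the harness's v0.4+ simple_evaluate API, and the leaderboard's exact harness version and settings may differ, so treat it as illustrative rather than an exact reproduction recipe.

```python
# Hedged sketch: reproducing a leaderboard-style score with
# EleutherAI's lm-evaluation-harness (assumes v0.4+).
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf",
    model_args="pretrained=andrijdavid/macaroni-7b,dtype=bfloat16",
    tasks=["hellaswag"],
    num_fewshot=10,  # matches the 10-shot HellaSwag setting above
    batch_size=8,
)
print(results["results"]["hellaswag"])
```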
When to Use Macaroni-7B
This model is a good candidate for developers and researchers who want a 7B-parameter model with a balanced performance profile across general language understanding and generation tasks. Its experimental nature also makes it a reasonable base for further fine-tuning or for research into model merging techniques. Its benchmark results suggest it is worth evaluating for applications that require robust reasoning and factual recall.
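As one illustration of the fine-tuning path mentioned above, a parameter-efficient LoRA setup via the peft library is a common starting point. This is a minimal sketch, not the author's procedure: the target modules assume Mistral's attention projection names, and all hyperparameters are placeholders.

```python
# Hedged LoRA fine-tuning sketch with peft, showing how Macaroni-7B could
# serve as a base for further adaptation. Hyperparameters are placeholders.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "andrijdavid/macaroni-7b",
    torch_dtype=torch.bfloat16,
)
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # Mistral attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapters are trained
```

From here, the wrapped model drops into a standard transformers Trainer loop; only the adapter weights, a small fraction of the 7B parameters, receive gradients, which keeps memory and compute requirements modest.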