andrijdavid/Macaroni-7b-Tied

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jan 19, 2024 · License: apache-2.0 · Architecture: Transformer

Macaroni-7b-Tied is a 7 billion parameter language model developed by andrijdavid, built on the Mistral-7B-v0.1 base using the TIES merge method. The model integrates capabilities from four distinct 7B models, aiming for balanced performance across benchmarks. It achieves an average score of 74.96 on the Open LLM Leaderboard, demonstrating proficiency in reasoning, common sense, and language understanding tasks.


Macaroni-7b-Tied: A Merged 7B Language Model

Macaroni-7b-Tied is a 7 billion parameter language model created by andrijdavid, using Mistral-7B-v0.1 as its foundational base. It was developed with the TIES merge method, which combines the strengths of multiple specialized models into a single, more versatile model.
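The TIES procedure works in three stages: trim each fine-tuned model's task vector (its delta from the base) to its largest-magnitude entries, elect a majority sign per parameter, and average only the entries that agree with the elected sign. The sketch below illustrates this idea on flat toy weight vectors with NumPy; it is a simplified illustration, not the actual merge pipeline used for this model, and the `density` parameter name is an assumption.

```python
import numpy as np

def ties_merge(base, finetuned, density=0.5):
    """Toy sketch of TIES merging on flat weight vectors.

    base: 1-D array of base-model weights
    finetuned: list of 1-D arrays of fine-tuned model weights
    density: fraction of each task vector kept after trimming (assumed name)
    """
    # 1. Task vectors: each fine-tuned model's delta from the base
    deltas = [ft - base for ft in finetuned]

    # 2. Trim: zero out all but the largest-magnitude entries per task vector
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.size))
        thresh = np.sort(np.abs(d))[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))

    # 3. Elect sign: magnitude-weighted majority sign per parameter
    sign = np.sign(sum(np.sign(t) * np.abs(t) for t in trimmed))

    # 4. Disjoint merge: average only entries agreeing with the elected sign
    stacked = np.stack(trimmed)
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = (stacked * agree).sum(axis=0) / counts

    return base + merged_delta
```

With agreeing task vectors, the deltas are averaged and applied to the base; where two models pull a parameter in opposite directions with equal weight, no sign wins and the base value is kept, which is what makes TIES more conflict-aware than a plain weight average.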

Key Characteristics and Performance

This model integrates contributions from four distinct 7B models merged via TIES.

Evaluated on the Open LLM Leaderboard, Macaroni-7b-Tied achieved an average score of 74.96. Notable benchmark results include:

  • AI2 Reasoning Challenge (25-Shot): 72.87
  • HellaSwag (10-Shot): 88.14
  • MMLU (5-Shot): 64.73
  • TruthfulQA (0-Shot): 70.54
  • GSM8k (5-Shot): 71.57

Considerations for Use

Users should be aware of common language model limitations, including potential for factual inaccuracies, biases inherited from training data, and occasional hallucinations. The model is provided "as is," and users are advised to verify outputs, especially for critical applications.