Nitral-Archive/Pasta-PrimaMaid-7b
Pasta-PrimaMaid-7b is a 7-billion-parameter language model created by Nitral-Archive by merging Test157t/Kunocchini-7b and Test157t/Pasta-Made_7b with the slerp merge method. It has a context length of 4096 tokens and scores an average of 68.48 on the Open LLM Leaderboard, with balanced results across benchmarks that make it suitable for general-purpose language tasks.
Overview
Nitral-Archive's Pasta-PrimaMaid-7b is a 7-billion-parameter language model produced by merging two existing models, Test157t/Kunocchini-7b and Test157t/Pasta-Made_7b. The merge uses slerp (spherical linear interpolation), combining the parent models across all 32 layers. The configuration applies separate t parameters to the self-attention and MLP sublayers, with a default t value of 0.5, and the merge was performed in bfloat16 precision.
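For intuition, slerp interpolates weights along the arc between the two parent tensors rather than along a straight line, which preserves the angular geometry of the weight space better than a plain average. The sketch below is a minimal, generic slerp over flattened tensors, with an assumed lerp fallback for near-parallel vectors; it illustrates the method, not mergekit's exact implementation, and the example tensor names in the usage comment are hypothetical.

```python
import torch

def slerp(a: torch.Tensor, b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Interpolates along the arc between the flattened tensors; falls back
    to plain linear interpolation when they are nearly parallel.
    """
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    # Cosine of the angle between the two weight vectors.
    cos_omega = torch.dot(a_flat, b_flat) / (a_flat.norm() * b_flat.norm() + eps)
    omega = torch.acos(cos_omega.clamp(-1.0, 1.0))
    if torch.sin(omega).abs() < eps:
        # Nearly parallel tensors: slerp degenerates to lerp.
        merged = (1.0 - t) * a_flat + t * b_flat
    else:
        merged = (torch.sin((1.0 - t) * omega) * a_flat
                  + torch.sin(t * omega) * b_flat) / torch.sin(omega)
    return merged.reshape(a.shape).to(a.dtype)

# Hypothetical usage on one layer's weights from the two parent models:
# merged_w = slerp(kunocchini_layer_w, pasta_made_layer_w, t=0.5)
```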
Performance Highlights
Evaluated on the Open LLM Leaderboard, Pasta-PrimaMaid-7b achieved an average score of 68.48, the unweighted mean of the six benchmark results below (verified in the snippet after this list):
- AI2 Reasoning Challenge (25-shot): 67.92
- HellaSwag (10-shot): 86.15
- MMLU (5-shot): 63.31
- TruthfulQA (0-shot): 66.47
- Winogrande (5-shot): 77.90
- GSM8k (5-shot): 49.13
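A quick sanity check confirms that the reported average is exactly the unweighted mean of these six scores:

```python
scores = {
    "ARC (25-shot)": 67.92,
    "HellaSwag (10-shot)": 86.15,
    "MMLU (5-shot)": 63.31,
    "TruthfulQA (0-shot)": 66.47,
    "Winogrande (5-shot)": 77.90,
    "GSM8k (5-shot)": 49.13,
}
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 68.48
```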
Use Cases
Given its balanced performance across reasoning, commonsense, and language-understanding benchmarks, Pasta-PrimaMaid-7b is well suited to a variety of general-purpose applications. Its 7B parameter count and 4096-token context length make it a practical option for tasks of moderate complexity that need efficient inference, such as text generation, summarization, and question answering, particularly where the combined capabilities of a merged model are beneficial.
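As a quick start, the model can be loaded with the Hugging Face transformers library like any causal LM. The snippet below is a minimal sketch assuming the weights are hosted under the Nitral-Archive/Pasta-PrimaMaid-7b repo id and that your hardware supports bfloat16; the prompt and generation settings are illustrative, not recommended defaults.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Nitral-Archive/Pasta-PrimaMaid-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The merge was produced in bfloat16, so load in the same precision.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Summarize the benefits of model merging in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt + output within the model's 4096-token context window.
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```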