Nitral-Archive/Pasta-PrimaMaid-7b

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Feb 6, 2024 · License: other · Architecture: Transformer

Pasta-PrimaMaid-7b is a 7 billion parameter language model created by Nitral-Archive, formed by merging Test157t/Kunocchini-7b and Test157t/Pasta-Made_7b. This model utilizes a slerp merge method and has a context length of 4096 tokens. It demonstrates balanced performance across various benchmarks, including an average score of 68.48 on the Open LLM Leaderboard, making it suitable for general-purpose language tasks.


Overview

Nitral-Archive's Pasta-PrimaMaid-7b is a 7 billion parameter language model, developed by merging two existing models: Test157t/Kunocchini-7b and Test157t/Pasta-Made_7b. The merge uses slerp (spherical linear interpolation), combining the weights of its constituent models across all 32 layers. The configuration specifies per-layer t parameters for the self-attention and MLP blocks, with a general t value of 0.5, and the merge was performed in bfloat16 precision.
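The slerp operation at the heart of this merge can be sketched as follows. This is an illustrative, self-contained implementation of spherical linear interpolation between two weight vectors, not the actual merge tooling's code; the function name and list-based representation are chosen here for clarity:

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate t values follow
    the great-circle arc between the two directions.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped for numerical safety.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(norm0 * norm1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s = math.sin(theta)
    w0 = math.sin((1 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# With t = 0.5 (the general value used for this merge), the result lies
# midway along the arc between the two parent vectors.
merged = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Unlike plain linear averaging, slerp preserves the geometric character of the interpolated weights, which is why it is a popular choice for model merges.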

Performance Highlights

Evaluated on the Open LLM Leaderboard, Pasta-PrimaMaid-7b achieved an average score of 68.48. Key benchmark results include:

  • AI2 Reasoning Challenge (25-shot): 67.92
  • HellaSwag (10-shot): 86.15
  • MMLU (5-shot): 63.31
  • TruthfulQA (0-shot): 66.47
  • Winogrande (5-shot): 77.90
  • GSM8k (5-shot): 49.13
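The reported leaderboard average follows directly from the six benchmark scores above; as a quick arithmetic check:

```python
# Open LLM Leaderboard scores for Pasta-PrimaMaid-7b, as listed above.
scores = {
    "ARC (25-shot)": 67.92,
    "HellaSwag (10-shot)": 86.15,
    "MMLU (5-shot)": 63.31,
    "TruthfulQA (0-shot)": 66.47,
    "Winogrande (5-shot)": 77.90,
    "GSM8k (5-shot)": 49.13,
}

# The leaderboard average is the unweighted mean of the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # → 68.48
```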

Use Cases

Given its balanced performance across reasoning, common sense, and language understanding tasks, Pasta-PrimaMaid-7b is well-suited for a variety of general-purpose applications. Its 7B parameter size and 4096 token context length make it a viable option for tasks requiring moderate complexity and efficient inference, such as text generation, summarization, and question answering, particularly where a merged model's combined capabilities are beneficial.