Marsouuu/general7Bv2-ECE-PRYMMAL-Martial

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 7.6B
  • Quant: FP8
  • Ctx Length: 32k
  • Tool Calling: Supported
  • Published: Nov 6, 2024
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights

Marsouuu/general7Bv2-ECE-PRYMMAL-Martial is a 7.6 billion parameter language model created by Marsouuu with the SLERP merge method, combining Tsunami-th/Tsunami-0.5x-7B-Instruct and fblgit/cybertron-v4-qw7B-MGS. It is intended for general language tasks and aims for a balanced performance profile drawn from both source models. It has a context length of 131072 tokens, which makes it suitable for processing long inputs and generating long-form text. On the Open LLM Leaderboard it has an average score of 31.04, with reported results on benchmarks including IFEval and MMLU-PRO.


Model Overview

Marsouuu/general7Bv2-ECE-PRYMMAL-Martial is a 7.6 billion parameter language model developed by Marsouuu. It was created using the SLERP merge method, combining two distinct base models: Tsunami-th/Tsunami-0.5x-7B-Instruct and fblgit/cybertron-v4-qw7B-MGS. This merging approach aims to leverage the strengths of both constituent models.
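
To make the SLERP idea concrete, here is a minimal sketch of spherical linear interpolation between two weight vectors, written in plain Python. It illustrates the general technique only. The function name `slerp`, the interpolation factor `t`, and the fallback to linear interpolation for near-parallel vectors are assumptions made for this example, not the settings or tooling used to produce this model.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1 (illustrative helper, not the
    author's merge code).
    """
    if len(v0) != len(v1):
        raise ValueError("vectors must have the same length")

    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    # Plain linear interpolation, used when the angle is not well defined.
    def lerp():
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    if norm0 < eps or norm1 < eps:
        return lerp()

    # Cosine of the angle between the two vectors, clamped for safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    # Nearly parallel or antiparallel vectors: fall back to lerp.
    if abs(sin_theta) < eps:
        return lerp()

    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

In a model merge, an interpolation like this would typically be applied tensor by tensor to the two source checkpoints, with `t` controlling how much each model contributes.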

Key Capabilities

  • Merged Architecture: Combines characteristics of both source models, which may give it versatile performance across a range of tasks.
  • Extended Context Length: Supports a context window of 131072 tokens, enabling the processing of very long inputs and generation of extensive outputs.
  • General Purpose: Designed for a broad range of language understanding and generation tasks.
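
As a rough illustration of what the stated context window means in practice, the sketch below checks whether a prompt plus a requested generation length fits within a token limit. The helper names and the example token counts are hypothetical. The default of 131072 comes from this card, while the listing metadata for this page shows a 32k context length, so a host may enforce a smaller limit; the `context_window` parameter is there for that case.

```python
CONTEXT_WINDOW = 131072  # context length stated in this model card

def remaining_generation_budget(prompt_tokens, context_window=CONTEXT_WINDOW):
    """Return how many tokens are left for generation after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(0, context_window - prompt_tokens)

def fits_in_context(prompt_tokens, max_new_tokens, context_window=CONTEXT_WINDOW):
    """Check that prompt plus requested output stays within the window."""
    return prompt_tokens + max_new_tokens <= context_window


# Hypothetical request: a 100,000-token prompt with 4,000 new tokens.
ok_full_window = fits_in_context(100_000, 4_000)
# The same request against a 32k (32768-token) limit does not fit.
ok_32k_window = fits_in_context(100_000, 4_000, context_window=32_768)
```

Token counts depend on the tokenizer, so `prompt_tokens` should come from the model's own tokenizer rather than from a character or word count.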

Performance Benchmarks

Evaluated on the Open LLM Leaderboard, the model achieved an average score of 31.04. Notable individual metric scores include:

  • IFEval (0-shot): 56.93
  • BBH (3-shot): 37.67
  • MMLU-PRO (5-shot): 38.87
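
The three listed scores are all above the reported average of 31.04, so the benchmarks not listed here must be pulling it down. The short calculation below makes that explicit. It assumes the leaderboard average is an unweighted mean over six benchmarks; that count and the averaging rule are assumptions for this sketch, and the result is only an implied approximation, not a reported score.

```python
# Scores and average as listed in this card.
listed_scores = {"IFEval": 56.93, "BBH": 37.67, "MMLU-PRO": 38.87}
reported_average = 31.04

# ASSUMPTION: the average is an unweighted mean over six benchmarks.
assumed_benchmark_count = 6

def mean_of_listed(scores):
    """Mean of the scores that appear in the card."""
    return sum(scores.values()) / len(scores)

def implied_mean_of_unlisted(scores, average, total_count):
    """Mean the unlisted benchmarks would need to reach the reported average."""
    unlisted = total_count - len(scores)
    if unlisted <= 0:
        raise ValueError("total_count must exceed the number of listed scores")
    return (average * total_count - sum(scores.values())) / unlisted

listed_mean = round(mean_of_listed(listed_scores), 2)  # 44.49
implied_rest = round(
    implied_mean_of_unlisted(listed_scores, reported_average, assumed_benchmark_count), 2
)  # 17.59 under the six-benchmark assumption
```

Under these assumptions the unlisted benchmarks would average roughly 17.6, which suggests the model is noticeably weaker on them than on the three shown.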

Good For

  • Applications requiring a large context window for processing and generating long texts.
  • General language tasks that benefit from the balanced performance of a merged model.
  • Experimentation with models derived from the SLERP merging technique.