suayptalha/HomerCreativeAnvita-Mix-Qw7B

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 7.6B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: Nov 22, 2024
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights

HomerCreativeAnvita-Mix-Qw7B by suayptalha is a 7.6 billion parameter merged language model, created with the SLERP (spherical linear interpolation) merge method from two Qwen2.5-7B-based models. It targets creative and analytical tasks and ranks #3 on the Open LLM Leaderboard among models up to 8B parameters. Its benchmark results include 78.08 on IFEval and 38.28 on MMLU-PRO, making it suitable for a range of text generation applications.


Model Overview

HomerCreativeAnvita-Mix-Qw7B is a 7.6 billion parameter language model developed by suayptalha. It was created with the SLERP (spherical linear interpolation) merge method from two Qwen2.5-7B-based models: ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix and ZeroXClem/Qwen2.5-7B-HomerCreative-Mix. The merge aims to combine the strengths of both constituent models in a single set of weights.
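To make the SLERP idea concrete, here is a minimal sketch of spherical linear interpolation between two flattened weight vectors. It is an illustration only, not the code used to build this model: the function name `slerp`, the fallback to linear interpolation, and the interpolation factor `t` are assumptions, and the source does not state which `t` values or per-layer schedule were used for the merge.

```python
import math
from typing import List, Sequence


def slerp(t: float, v0: Sequence[float], v1: Sequence[float],
          eps: float = 1e-8) -> List[float]:
    """Interpolate between v0 (t=0) and v1 (t=1) along the arc between them."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    # Degenerate case: a zero vector has no direction, so use plain lerp.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two vectors, with the cosine clamped for safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    # Nearly parallel (or anti-parallel) vectors: the SLERP weights become
    # numerically unstable, so fall back to linear interpolation.
    if abs(sin_theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Hypothetical toy "weights" standing in for one tensor from each parent model.
merged = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

In a real merge this interpolation would be applied tensor by tensor across both checkpoints. Unlike a plain weighted average, SLERP follows the arc between the two weight directions, so the midpoint of two orthogonal unit vectors keeps unit length instead of shrinking.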

Key Capabilities & Performance

This model has achieved notable recognition, ranking #3 on the Open LLM Leaderboard among models up to 8 billion parameters and #5 among models up to 13 billion parameters. Its performance is highlighted by the following evaluation results:

  • Avg. Score: 34.62 (leaderboard average; includes benchmarks beyond the four listed below)
  • IFEval (0-Shot): 78.08
  • BBH (3-Shot): 36.98
  • MATH Lvl 5 (4-Shot): 31.04
  • MMLU-PRO (5-shot): 38.28

These metrics suggest a balanced capability across instruction following, complex reasoning, and general knowledge tasks.
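A quick arithmetic check shows how the reported average relates to the four listed scores. The sketch below assumes the leaderboard average is taken over six benchmarks (`n_benchmarks = 6` is an assumption, not stated in this listing); under that assumption it derives the combined score implied for the benchmarks that are not listed here.

```python
# Scores copied from the list above.
listed = {
    "IFEval": 78.08,
    "BBH": 36.98,
    "MATH Lvl 5": 31.04,
    "MMLU-PRO": 38.28,
}
reported_avg = 34.62

# Assumption: the reported average covers six benchmarks in total.
n_benchmarks = 6

listed_total = sum(listed.values())
listed_mean = listed_total / len(listed)

# Total and mean score implied for the benchmarks not shown in the list.
implied_remaining_total = reported_avg * n_benchmarks - listed_total
implied_remaining_mean = implied_remaining_total / (n_benchmarks - len(listed))

print(f"mean of listed scores: {listed_mean:.2f}")
print(f"implied mean of unlisted scores: {implied_remaining_mean:.2f}")
```

The mean of the four listed scores is roughly 46, well above the reported 34.62, so the average must include lower-scoring benchmarks that the list omits. Under the six-benchmark assumption, those would average roughly 11.7 each.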

Use Cases

Given its strong leaderboard performance and the nature of its merged components, HomerCreativeAnvita-Mix-Qw7B is well-suited for applications requiring:

  • Creative text generation: Leveraging the "Creative" component of its base models.
  • Analytical and reasoning tasks: Indicated by its performance on MATH and BBH benchmarks.
  • General instruction following: Supported by its high IFEval score.

A context length of up to 131,072 tokens is cited for the model, although this listing specifies 32k. Either figure supports processing long inputs and generating longer, more complex outputs.
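Because the prompt and the generated output share a single context window, it helps to budget tokens before sending a request. The helpers below are a minimal sketch: the function names are hypothetical, and reading "32k" as 32,768 tokens is an assumption. Token counts would come from whatever tokenizer the serving stack uses.

```python
# Assumption: "32k" in the listing means 32 * 1024 tokens.
LISTED_CONTEXT = 32 * 1024
# Figure quoted in the prose above.
CITED_CONTEXT = 131_072


def max_new_tokens_for(prompt_tokens: int,
                       context_length: int = LISTED_CONTEXT) -> int:
    """Largest generation budget that still fits alongside the prompt."""
    return max(0, context_length - prompt_tokens)


def fits_context(prompt_tokens: int, max_new_tokens: int,
                 context_length: int = LISTED_CONTEXT) -> bool:
    """True if the prompt plus the requested output fits in the window."""
    return prompt_tokens + max_new_tokens <= context_length


# Example: a 30,000-token prompt leaves 2,768 tokens under the 32k limit,
# but 101,072 tokens if the larger window is available.
budget_listed = max_new_tokens_for(30_000)
budget_cited = max_new_tokens_for(30_000, CITED_CONTEXT)
```

Checking the budget up front avoids truncated prompts or rejected requests when a long document is paired with a large `max_new_tokens` setting.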