mayacinka/yam-jom-7B

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Mar 2, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

mayacinka/yam-jom-7B is a 7 billion parameter language model created by mayacinka, developed through a task arithmetic merge of eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v2 and yam-peleg/Experiment26-7B. This model is designed for general language tasks, demonstrating strong performance across various benchmarks including reasoning, common sense, and question answering. It achieves an average score of 76.60 on the Open LLM Leaderboard, making it suitable for applications requiring robust language understanding and generation.

Model Overview

mayacinka/yam-jom-7B is a 7 billion parameter language model developed by mayacinka using a task arithmetic merge. It combines two source models: eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v2 (weight 0.35) and yam-peleg/Experiment26-7B (weight 0.65), with yam-peleg/Experiment26-7B also serving as the base model for the merge. This approach aims to leverage the distinct capabilities of its components to create a versatile model.
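As a rough illustration of the idea behind task arithmetic, the sketch below merges toy parameter vectors by adding weighted "task vectors" (each model's difference from the base) back onto the base. It is a simplified, unnormalized formulation and not necessarily what the merge tooling used for this model does; the function name `task_arithmetic_merge` and the toy values are hypothetical, and only the weights 0.35 and 0.65 come from the description above.

```python
def task_arithmetic_merge(base, models, weights):
    """Return base + sum_i weights[i] * (models[i] - base), element-wise.

    Simplified sketch: parameters are flat lists of floats, and no
    normalization of the weights is applied (an assumption).
    """
    merged = list(base)
    for model, weight in zip(models, weights):
        for i, (m, b) in enumerate(zip(model, base)):
            merged[i] += weight * (m - b)  # add the weighted task vector
    return merged


# Toy stand-ins for real weight tensors (hypothetical values).
experiment26 = [1.0, 2.0, 3.0]  # plays the role of the base model
ogno_merge = [2.0, 2.0, 1.0]    # plays the role of the second source model

merged = task_arithmetic_merge(
    base=experiment26,
    models=[ogno_merge, experiment26],
    weights=[0.35, 0.65],
)
print(merged)
```

In this simplified form, the base model's own task vector is zero, so its 0.65 weight contributes nothing and the result is the base plus 0.35 times the other model's difference from it. Real merge tools may normalize or otherwise treat the weights differently.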

Key Capabilities & Performance

The model demonstrates solid performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. It achieves an average score of 76.60, with notable results in:

  • AI2 Reasoning Challenge (25-shot): 73.38
  • HellaSwag (10-shot): 89.15
  • MMLU (5-shot): 64.51
  • TruthfulQA (0-shot): 78.04
  • Winogrande (5-shot): 84.93
  • GSM8k (5-shot): 69.60

These scores cover reasoning (AI2 Reasoning Challenge), commonsense inference (HellaSwag, Winogrande), broad knowledge (MMLU), truthfulness (TruthfulQA), and grade-school math word problems (GSM8k).
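The reported average of 76.60 is consistent with an unweighted mean of the six benchmark scores listed above. A minimal check, assuming the leaderboard average is a simple mean (the dictionary name `scores` is only for illustration):

```python
# Benchmark scores as listed for mayacinka/yam-jom-7B.
scores = {
    "AI2 Reasoning Challenge (25-shot)": 73.38,
    "HellaSwag (10-shot)": 89.15,
    "MMLU (5-shot)": 64.51,
    "TruthfulQA (0-shot)": 78.04,
    "Winogrande (5-shot)": 84.93,
    "GSM8k (5-shot)": 69.60,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 76.60
```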

Good For

  • General-purpose language generation: Suitable for a wide array of text generation tasks.
  • Reasoning and problem-solving: Its benchmark scores suggest good capabilities in logical inference and answering complex questions.
  • Applications requiring balanced performance: The model offers a strong average performance across diverse metrics, making it a reliable choice for various use cases.