flammenai/flammen-mistral-7B

  • Task: Text generation
  • Concurrency cost: 1
  • Model size: 7B
  • Quant: FP8
  • Ctx length: 4k
  • Published: Feb 26, 2024
  • License: apache-2.0
  • Architecture: Transformer
  • Open weights

flammenai/flammen-mistral-7B is a 7 billion parameter language model from flammenai, produced with the TIES merge method on the base model bardsai/jaskier-7b-dpo-v5.6. It merges in nbeerbower/bruphin-zeta and Gille/StrangeMerges_16-7B-slerp with the aim of improving general reasoning and language understanding. It averages 71.74 on the Open LLM Leaderboard, including 64.68 on MMLU and 87.06 on HellaSwag. The model is intended for general-purpose text generation and understanding tasks.

Overview

flammenai/flammen-mistral-7B is a 7 billion parameter language model that flammenai built by merging several pre-trained models with the TIES merge method. The base model is bardsai/jaskier-7b-dpo-v5.6; nbeerbower/bruphin-zeta and Gille/StrangeMerges_16-7B-slerp are merged into it.

Performance Highlights

On the Open LLM Leaderboard, flammen-mistral-7B reports the following scores:

  • Average Score: 71.74
  • AI2 Reasoning Challenge (25-shot): 68.17
  • HellaSwag (10-shot): 87.06
  • MMLU (5-shot): 64.68
  • TruthfulQA (0-shot): 63.02
  • Winogrande (5-shot): 81.45
  • GSM8k (5-shot): 66.03

The strongest results are on the commonsense benchmarks HellaSwag (87.06) and Winogrande (81.45); the lowest are TruthfulQA (63.02) and MMLU (64.68).
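The reported average can be checked against the six benchmark scores listed above. The sketch below assumes the average is a plain arithmetic mean of the published two-decimal scores, rounded half up; the leaderboard itself may average unrounded values. Decimal arithmetic is used because the mean lands exactly on a rounding boundary.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores as published on the model page (two decimal places each).
scores = {
    "ARC (25-shot)": Decimal("68.17"),
    "HellaSwag (10-shot)": Decimal("87.06"),
    "MMLU (5-shot)": Decimal("64.68"),
    "TruthfulQA (0-shot)": Decimal("63.02"),
    "Winogrande (5-shot)": Decimal("81.45"),
    "GSM8k (5-shot)": Decimal("66.03"),
}

# Assumption: unweighted mean of the six scores, rounded half up to two places.
mean = sum(scores.values()) / len(scores)
average = mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(average)  # 71.74
```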

Merge Configuration

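The model was produced from a YAML merge configuration that assigns each merged component a density (the fraction of its parameter changes, relative to the base, that is retained) and a weight (its relative contribution to the result). The dtype was set to bfloat16, the precision in which the merge is computed; a merge involves no training.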
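To make the role of the density and weight parameters concrete, here is a toy sketch of the TIES procedure as it is commonly described: trim each model's difference from the base, elect a sign per parameter, then average only the values that agree with that sign. It operates on short Python lists rather than real checkpoints. The function names and the example densities and weights are hypothetical; they are not the values used for this model, and this is not the code of any particular merge tool.

```python
def trim(delta, density):
    """Keep the `density` fraction of entries with the largest magnitude; zero the rest."""
    k = max(1, round(len(delta) * density))
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, models, densities, weights):
    """Toy TIES merge of flat parameter lists onto `base`."""
    # Step 1: task vectors (model minus base), trimmed by density.
    deltas = [
        trim([m - b for m, b in zip(model, base)], density)
        for model, density in zip(models, densities)
    ]
    merged = []
    for i, b in enumerate(base):
        weighted = [w * d[i] for w, d in zip(weights, deltas)]
        # Step 2: elect the sign carrying the larger total weighted magnitude.
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        # Step 3: average only the contributions that agree with the elected sign.
        agree = [(w, v) for w, v in zip(weights, weighted) if v * sign > 0]
        if agree:
            update = sum(v for _, v in agree) / sum(w for w, _ in agree)
        else:
            update = 0.0
        merged.append(b + update)
    return merged


# Hypothetical four-parameter example with two models merged onto a zero base.
base = [0.0, 0.0, 0.0, 0.0]
model_a = [1.0, -0.5, 0.1, 0.0]
model_b = [0.5, 1.0, -0.05, 0.0]
result = ties_merge(base, [model_a, model_b], densities=[0.5, 0.5], weights=[1.0, 1.0])
print(result)  # [0.75, 1.0, 0.0, 0.0]
```

In the example, the second parameter shows the sign election at work: the two models disagree (-0.5 versus 1.0), the positive side wins, and only the agreeing value is kept. The third parameter is trimmed away in both models, so the base value is left unchanged.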

Use Cases

Given its balanced benchmark results, flammen-mistral-7B is suited to general-purpose language understanding and generation tasks that fit within its 4k context length, including:

  • Text summarization
  • Question answering
  • Content creation
  • Reasoning tasks
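For any of these tasks, a local generation call might look like the minimal sketch below. It assumes the weights resolve under the same identifier on the Hugging Face Hub and that the `torch` and `transformers` packages are installed. The `generate` helper, its default token budget, and the plain-text prompt are illustrative choices, not a prompt format documented for this model.

```python
MODEL_ID = "flammenai/flammen-mistral-7B"  # assumed to resolve on the Hugging Face Hub


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Return the prompt plus a generated continuation (hypothetical helper)."""
    # Imported lazily so the heavy dependencies load only when generation is requested.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # bfloat16 matches the dtype named in the merge configuration.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Summarize the following paragraph in one sentence: ...")` downloads the full weights on first use (roughly 14 GB at two bytes per parameter), so the call is left out of the block. Prompt plus generated tokens should stay within the 4k context length listed for the model.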