nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated

12B parameters · FP8 · 32768-token context · Released: Oct 19, 2024 · License: apache-2.0 · Hosted on Hugging Face

nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated is a 12 billion parameter language model created by nbeerbower through a task arithmetic merge of flammenai/Mahou-1.5-mistral-nemo-12B and nbeerbower/Mistral-Nemo-12B-abliterated-LORA. It supports a 32768-token context length, targets general language tasks, and scores an average of 26.45 on the Open LLM Leaderboard. The merge is intended to combine the conversational tuning of Mahou-1.5 with the refusal-ablated behavior captured in the LoRA.

Model Overview

The model was built with the task arithmetic merge method, combining flammenai/Mahou-1.5-mistral-nemo-12B with nbeerbower/Mistral-Nemo-12B-abliterated-LORA. It retains the 12 billion parameters and 32768-token context window of its Mistral Nemo lineage.
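For illustration, task arithmetic merging forms a "task vector" as the element-wise difference between a fine-tuned checkpoint and its base, then adds weighted task vectors back onto the base. The PyTorch sketch below shows the general technique on raw state dicts; it is not the author's actual merge configuration, and the function name and weights are hypothetical.

```python
import torch

def task_arithmetic_merge(base_sd, tuned_sds, weights):
    """Merge checkpoints via task arithmetic:
    merged = base + sum_i w_i * (tuned_i - base).
    All state dicts must share the same keys and shapes."""
    merged = {}
    for name, base_param in base_sd.items():
        delta = torch.zeros_like(base_param, dtype=torch.float32)
        for sd, w in zip(tuned_sds, weights):
            # Task vector: parameter-wise difference from the base model.
            delta += w * (sd[name].float() - base_param.float())
        # Accumulate in float32, then cast back to the original dtype.
        merged[name] = (base_param.float() + delta).to(base_param.dtype)
    return merged
```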

Key Characteristics

  • Architecture: A merged model based on the Mistral Nemo architecture.
  • Merge Method: Utilizes the task arithmetic method for combining pre-trained models.
  • Parameter Count: 12 billion parameters.
  • Context Length: Supports a context window of 32768 tokens.
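As a usage sketch, the model can be loaded like any other Hugging Face causal LM via the standard transformers API; the dtype and device settings below are illustrative, not prescribed by the model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's stored precision
    device_map="auto",    # place weights on available accelerators
)
```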

Performance Benchmarks

Evaluated on the Open LLM Leaderboard, the model achieved an average score of 26.45. Specific benchmark results include:

  • IFEval (0-shot): 68.25
  • BBH (3-Shot): 36.08
  • MATH Lvl 5 (4-Shot): 5.29
  • GPQA (0-shot): 3.91
  • MuSR (0-shot): 16.55
  • MMLU-PRO (5-shot): 28.60
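These six scores average to (68.25 + 36.08 + 5.29 + 3.91 + 16.55 + 28.60) / 6 ≈ 26.45, matching the leaderboard average quoted above.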

Use Cases

This model is suited to general language generation and understanding tasks, particularly where a blend of its constituent models' capabilities is useful. The strong IFEval score (68.25) points to solid instruction following, while the low MATH Lvl 5 (5.29) and GPQA (3.91) scores suggest weakness on advanced math and graduate-level science questions; specialized tasks may require further fine-tuning.
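Continuing the loading sketch above, here is a hedged chat-style generation example. It assumes the repository ships a chat template (Mistral Nemo derivatives generally do); the prompt and sampling settings are illustrative only.

```python
messages = [
    {"role": "user", "content": "Summarize the benefits of model merging in two sentences."}
]

# Render the conversation with the model's chat template and tokenize it.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```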