nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated

Model Overview

nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated is a 12-billion-parameter language model with a 32768-token context length, created by nbeerbower and released under the Apache 2.0 license. It was produced with the task arithmetic merge method, combining flammenai/Mahou-1.5-mistral-nemo-12B with the nbeerbower/Mistral-Nemo-12B-abliterated-LORA adapter.
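
For reference, here is a minimal sketch of loading the model with the Hugging Face transformers library; this is standard usage, not an official snippet from the model card, and the dtype and device settings are illustrative assumptions:

```python
# Minimal sketch: loading the merged model with Hugging Face transformers.
# Assumes a GPU with enough memory for a 12B model in bfloat16 (~24 GB).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit the 12B weights
    device_map="auto",           # spread layers across available devices
)
```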

Key Characteristics

  • Architecture: A merged model based on the Mistral Nemo architecture.
  • Merge Method: Uses task arithmetic to combine the pre-trained constituent models (see the sketch after this list).
  • Parameter Count: 12 billion parameters.
  • Context Length: Supports a context window of 32768 tokens.
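
To illustrate what a task arithmetic merge does, here is a simplified sketch operating on raw state dicts. The function name and scaling weight are illustrative assumptions, not the exact recipe used for this model:

```python
# Simplified illustration of task arithmetic merging (Ilharco et al., 2023).
# base_sd and tuned_sd are state dicts of a base and a fine-tuned model;
# the "task vector" is their difference, added back to the base with a weight.
import torch

def task_arithmetic_merge(base_sd, tuned_sd, weight=1.0):
    merged = {}
    for name, base_param in base_sd.items():
        task_vector = tuned_sd[name] - base_param  # direction learned by fine-tuning
        merged[name] = base_param + weight * task_vector
    return merged
```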

Performance Benchmarks

Evaluated on the Open LLM Leaderboard, the model achieved an average score of 26.45. Specific benchmark results include:

  • IFEval (0-shot): 68.25
  • BBH (3-shot): 36.08
  • MATH Lvl 5 (4-shot): 5.29
  • GPQA (0-shot): 3.91
  • MuSR (0-shot): 16.55
  • MMLU-PRO (5-shot): 28.60

Use Cases

This model is suitable for general language generation and understanding tasks, particularly where the blend of capabilities from its constituent models is beneficial. The relatively high IFEval score (68.25) points to solid instruction following, while the low MATH Lvl 5 and GPQA scores indicate weak mathematical and graduate-level reasoning; specialized tasks may require further fine-tuning.
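
As a usage illustration, here is a hedged sketch of chat-style generation. It assumes the repository ships a chat template usable via apply_chat_template; the prompt and sampling settings are arbitrary examples:

```python
# Hypothetical chat-style generation sketch; prompt and sampling values are
# illustrative, and the chat template is assumed to be bundled with the repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Summarize the benefits of model merging in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```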