Inv/Konstanta-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Feb 3, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Inv/Konstanta-7B is a 7 billion parameter language model created by Inv by merging SanjiWatsuki/Kunoichi-DPO-v2-7B, maywell/PiVoT-0.1-Evil-a, and mlabonne/NeuralOmniBeagle-7B-v2 with the dare_ties method. It is a test merge with a 4096-token context length, intended to improve performance by combining models with strong individual results. It achieves an average score of 73.54 on the Open LLM Leaderboard, spanning reasoning, common-sense, and question-answering tasks.


Konstanta-7B Overview

Konstanta-7B is a 7 billion parameter language model developed by Inv, created through a merge of three distinct models: SanjiWatsuki/Kunoichi-DPO-v2-7B, maywell/PiVoT-0.1-Evil-a, and mlabonne/NeuralOmniBeagle-7B-v2. The merge was executed using the dare_ties method within LazyMergekit, with the goal of combining the strengths of the constituent models and, in particular, improving on the Kunoichi base model.
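The exact merge recipe is not reproduced here, but a dare_ties merge in mergekit (which LazyMergekit wraps) is described by a YAML file along the following lines. This is a minimal sketch: the base model choice and the density/weight values are illustrative assumptions, not Konstanta-7B's published settings.

```yaml
# Illustrative mergekit config for a dare_ties merge of the three
# constituent models. density controls how many delta parameters are
# kept per model; weight controls each model's contribution.
# Values and base_model are assumptions for this sketch.
models:
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    parameters:
      density: 0.6
      weight: 0.5
  - model: maywell/PiVoT-0.1-Evil-a
    parameters:
      density: 0.5
      weight: 0.25
  - model: mlabonne/NeuralOmniBeagle-7B-v2
    parameters:
      density: 0.5
      weight: 0.25
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1  # assumed common Mistral ancestor
dtype: bfloat16
```

Given such a file, mergekit produces the merged checkpoint with its CLI, e.g. `mergekit-yaml config.yaml ./Konstanta-7B`.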

Key Capabilities & Performance

Konstanta-7B demonstrates solid performance across various benchmarks, as evaluated on the Open LLM Leaderboard. It achieves an average score of 73.54, with notable results in:

  • AI2 Reasoning Challenge (25-shot): 70.05
  • HellaSwag (10-shot): 87.50
  • MMLU (5-shot): 65.06
  • TruthfulQA (0-shot): 65.43
  • Winogrande (5-shot): 82.16
  • GSM8k (5-shot): 71.04

These scores indicate proficiency in reasoning, common-sense inference, and general-knowledge tasks. The model operates with a context length of 4096 tokens.

Intended Use

This model is primarily a test merge designed to explore performance improvements through model combination. While its name has Russian origins, the model is not specifically optimized for Russian language use. Developers can integrate Konstanta-7B using standard Hugging Face transformers pipelines for text generation tasks.
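As a minimal sketch, loading the model through the standard transformers text-generation pipeline looks like this; the prompt and sampling parameters below are illustrative, not recommendations from the model authors.

```python
# Minimal sketch: run Inv/Konstanta-7B through the standard transformers
# text-generation pipeline. Generation settings are illustrative defaults.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Inv/Konstanta-7B",
    torch_dtype=torch.float16,  # half precision keeps a 7B model on one 24 GB GPU
    device_map="auto",          # requires the accelerate package
)

output = generator(
    "Explain in one paragraph what a model merge is.",
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
)
print(output[0]["generated_text"])
```

Because the context window is 4096 tokens, prompt plus `max_new_tokens` should stay within that budget.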