Nitral-Archive/Kunocchini-7b

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Feb 6, 2024 | License: other | Architecture: Transformer

Nitral-Archive/Kunocchini-7b is a 7-billion-parameter language model from Nitral-Archive, built by merging SanjiWatsuki/Kunoichi-DPO-v2-7B and Epiculous/Fett-uccine-7B. It targets general language tasks and averages 68.78 across the Open LLM Leaderboard benchmarks. Its context length is 4096 tokens.


Overview

Nitral-Archive/Kunocchini-7b is a 7-billion-parameter language model produced by merging two models: SanjiWatsuki/Kunoichi-DPO-v2-7B and Epiculous/Fett-uccine-7B. The merge used the SLERP (spherical linear interpolation) method, combining the layers of both parent models into a single model. The aim is to carry over the strengths of both parents for general-purpose language generation and understanding.
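To illustrate the idea behind spherical linear interpolation, here is a minimal sketch in plain Python. It is not the procedure or tooling used to build this model, which the card does not specify. The function name `slerp`, the use of flat lists of floats as stand-ins for weight tensors, and the fallback to linear interpolation for nearly parallel vectors are all assumptions made for illustration.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1. Illustrative only: real merge
    tools operate on full weight tensors, not short Python lists.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        # The angle is undefined for a zero vector: use plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the vectors, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    if sin_theta < eps:
        # Nearly parallel (or anti-parallel) vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
print(midpoint)  # approximately [0.7071, 0.7071]
```

Unlike a plain average, which would give [0.5, 0.5] here, the interpolated point keeps unit length. That property is the usual motivation for preferring SLERP over linear averaging when blending weights.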

Key Capabilities & Performance

Kunocchini-7b has been evaluated on the Open LLM Leaderboard, achieving an average score of 68.78. Specific benchmark results include:

  • AI2 Reasoning Challenge (25-shot): 67.49
  • HellaSwag (10-shot): 86.85
  • MMLU (5-shot): 63.89
  • TruthfulQA (0-shot): 68.62
  • Winogrande (5-shot): 77.98
  • GSM8k (5-shot): 47.84

These scores indicate solid performance for a 7B model across reasoning, commonsense, and language-understanding tasks. HellaSwag (86.85) is the strongest result and GSM8k (47.84) the weakest. The model supports a context length of 4096 tokens.
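As a quick consistency check, the reported average can be reproduced as the unweighted mean of the six benchmark scores listed above. The dictionary name `scores` is only illustrative.

```python
# Benchmark scores as listed on this card.
scores = {
    "AI2 Reasoning Challenge (25-shot)": 67.49,
    "HellaSwag (10-shot)": 86.85,
    "MMLU (5-shot)": 63.89,
    "TruthfulQA (0-shot)": 68.62,
    "Winogrande (5-shot)": 77.98,
    "GSM8k (5-shot)": 47.84,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 68.78
```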

Quantizations & Community Support

Community-contributed quantizations are available for this model, including EXL2 and GGUF formats, from contributors including @bartowski, @jeiku, and @konz00. These make the model easier to deploy across different hardware configurations. The model also includes a TextGenPreset for SillyTavern (ST) users.
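To see why quantization matters for deployment, a rough back-of-the-envelope estimate of weight storage is sketched below. It assumes a nominal 7 billion parameters and illustrative bit widths of 16, 8, and 4. The helper name `weight_memory_gb` is hypothetical, and the estimate ignores the KV cache, activations, and per-format overhead. The actual bits per weight of a given EXL2 or GGUF file will differ.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 7e9  # nominal "7B" figure; an assumption, not an exact parameter count

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions, halving the bits per weight halves the memory needed for the weights, which is the main reason lower-bit quantizations fit on smaller GPUs.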