chargoddard/Chronorctypus-Limarobormes-13b

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Published: Aug 21, 2023 · Architecture: Transformer

chargoddard/Chronorctypus-Limarobormes-13b is a 13-billion-parameter instruction-tuned language model built on the Llama-2-13B-fp16 base model. It was created by chargoddard using TIES-merging, which combines five distinct instruction-tuned models while retaining their individual capabilities. The model targets general-purpose instruction following, performs competitively on common benchmarks for its size, and supports a 4096-token context length.


Overview

Chronorctypus-Limarobormes-13b is a 13-billion-parameter instruction-tuned language model developed by chargoddard. It is based on Llama-2-13B-fp16 and was created with the TIES-merging technique described in the paper "TIES-Merging: Resolving Interference When Merging Models." The merge combines five instruction-tuned models (OpenOrca-Platypus2-13B, limarp-13b-merged, Nous-Hermes-Llama2-13b, chronos-13b-v2, and airoboros-l2-13b-gpt4-1.4.1), aiming to preserve the strengths of each constituent model more effectively than traditional merging approaches.
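To make the merging method concrete, here is a minimal NumPy sketch of the three TIES-merging steps (trim, elect sign, disjoint merge) applied to flattened parameter vectors. This is an illustrative simplification, not the actual merge script used to build this model; the function name, `density` value, and toy tensors are all assumptions for the example.

```python
import numpy as np

def ties_merge(base, finetuned_models, density=0.2):
    """Illustrative sketch of TIES-merging on 1-D parameter vectors.

    base: parameter vector of the base model.
    finetuned_models: list of parameter vectors with the same shape as `base`.
    density: fraction of each task vector's entries to keep, by magnitude.
    """
    # Task vectors: per-model deltas relative to the base model.
    task_vectors = [m - base for m in finetuned_models]

    # Trim: zero out all but the top-`density` fraction of entries by magnitude.
    trimmed = []
    for tv in task_vectors:
        k = max(1, int(density * tv.size))
        threshold = np.sort(np.abs(tv))[-k]
        trimmed.append(np.where(np.abs(tv) >= threshold, tv, 0.0))

    # Elect sign: per parameter, pick the sign of the summed trimmed deltas.
    stacked = np.stack(trimmed)
    elected_sign = np.sign(stacked.sum(axis=0))

    # Disjoint merge: average only the nonzero entries that agree with the
    # elected sign, so conflicting updates do not cancel each other out.
    agree = (np.sign(stacked) == elected_sign) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = (stacked * agree).sum(axis=0) / counts

    return base + merged_delta
```

In practice the same logic runs tensor-by-tensor over full model checkpoints (tooling such as mergekit implements this at scale), but the sketch shows why TIES preserves each constituent model's strongest updates instead of averaging them away.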

Key Capabilities

  • Enhanced Instruction Following: Designed to respond effectively to a wide range of instructions, leveraging the combined knowledge of its merged components.
  • Broad General Knowledge: Benefits from the diverse training data of the five merged models, contributing to its general-purpose utility.
  • Alpaca-style Prompt Compatibility: Works well with Alpaca-style instruction prompts, making it straightforward to adopt for common use cases.
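The Alpaca-style compatibility above refers to a fixed text template wrapped around each request. Here is a minimal helper using the standard Alpaca template; the function name and example instruction are illustrative, and the template text is the generic Alpaca format rather than anything specific to this model card.

```python
def build_alpaca_prompt(instruction, input_text=None):
    """Wrap a request in the standard Alpaca instruction template."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_alpaca_prompt("Summarize TIES-merging in one sentence.")
```

The resulting string can then be sent to whatever text-generation endpoint or local runtime is serving the model, with the model's completion read after the `### Response:` marker.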

Performance Highlights

Evaluations on the Open LLM Leaderboard show competitive performance for its size class:

  • Avg.: 49.88
  • ARC (25-shot): 59.9
  • HellaSwag (10-shot): 82.75
  • MMLU (5-shot): 58.45

Good For

  • Developers seeking a versatile 13B instruction-tuned model for general applications.
  • Use cases requiring robust instruction following and broad knowledge.
  • Experimentation with models created via advanced merging techniques.