Undi95/UndiMix-v1-13b

Text Generation | Concurrency Cost: 1 | Model Size: 13B | Quant: FP8 | Ctx Length: 4k | Published: Aug 31, 2023 | License: cc-by-nc-4.0 | Architecture: Transformer | Open Weights | Cold

UndiMix-v1-13b is a 13-billion-parameter instruction-tuned causal language model developed by Undi95 and built on the Llama-2-13B architecture. It is a merge of several Llama-2-based models, including Huginn-13b-v1.2 and llama-2-13b-chat-limarp-v2-merged. The result is a versatile model that can produce responses ranging from serious to playful and can include emojis. It targets general conversational tasks and creative text generation, and averages 52.56 across the Open LLM Leaderboard benchmarks.


UndiMix-v1-13b: A Versatile Llama-2 Merge

UndiMix-v1-13b is a 13-billion-parameter language model by Undi95, created by merging several Llama-2-based models. The merge uses TheBloke/Llama-2-13B-fp16 as its base and combines it with Undi95/MythoMax-L2-Kimiko-v2-13b, The-Face-Of-Goonery/Huginn-13b-v1.2, and Doctor-Shotgun/llama-2-13b-chat-limarp-v2-merged.

Key Capabilities

  • Diverse Response Generation: The model is designed to produce varied outputs, ranging from serious and factual to playful and creative.
  • Emoji Integration: Thanks to its merged components, it can effectively use emojis in its responses, enhancing conversational realism.
  • General-Purpose Conversational AI: Suitable for a broad spectrum of instruction-following and chat-based applications.

Performance Benchmarks

UndiMix-v1-13b reports the following Open LLM Leaderboard scores. The average is the unweighted mean of the seven benchmarks listed below it:

  • Avg. Score: 52.56
  • ARC (25-shot): 59.47
  • HellaSwag (10-shot): 82.45
  • MMLU (5-shot): 55.83
  • TruthfulQA (0-shot): 49.78
  • Winogrande (5-shot): 75.45
  • GSM8K (5-shot): 10.01
  • DROP (3-shot): 34.95
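
As a quick check on the figures above, the average score can be reproduced as the plain mean of the seven benchmark scores. The snippet below is a minimal sketch; the dictionary name `scores` is only illustrative, and the values are copied from the list.

```python
# Benchmark scores as listed above (Open LLM Leaderboard).
scores = {
    "ARC (25-shot)": 59.47,
    "HellaSwag (10-shot)": 82.45,
    "MMLU (5-shot)": 55.83,
    "TruthfulQA (0-shot)": 49.78,
    "Winogrande (5-shot)": 75.45,
    "GSM8K (5-shot)": 10.01,
    "DROP (3-shot)": 34.95,
}

# Unweighted mean over all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. Score: {average:.2f}")  # prints "Avg. Score: 52.56"
```

The low GSM8K and DROP scores pull the mean down noticeably, so the average alone understates the model's results on ARC, HellaSwag, and Winogrande.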

Good For

  • Applications requiring flexible and engaging conversational AI.
  • Creative writing and role-playing scenarios where varied tones and emoji use are beneficial.
  • Developers looking for a Llama-2-based model with enhanced instruction-following and chat capabilities.
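
For developers who want to try the model locally, the following is a minimal sketch using the Hugging Face `transformers` library. It rests on several assumptions not stated on this page: that the weights are available under the repository ID `Undi95/UndiMix-v1-13b`, that the listed "4k" context length means 4096 tokens, that half-precision (fp16) loading is acceptable, and that `torch`, `transformers`, and `accelerate` are installed. The helper name `clamp_new_tokens` and the sampling settings are illustrative only, and no prompt template is applied because this page does not specify one.

```python
MODEL_ID = "Undi95/UndiMix-v1-13b"  # assumption: Hugging Face repository ID
CTX_LENGTH = 4096  # assumption: the listed "4k" context means 4096 tokens


def clamp_new_tokens(prompt_tokens: int, requested: int,
                     ctx_length: int = CTX_LENGTH) -> int:
    """Limit generated tokens so prompt plus output fits the context window."""
    remaining = ctx_length - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the context window")
    return min(requested, remaining)


def generate(prompt: str, requested_new_tokens: int = 256) -> str:
    """Load the model and sample one completion (downloads the 13B weights)."""
    # Imported lazily so clamp_new_tokens works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # assumption: fp16 fits the available GPU(s)
        device_map="auto",          # requires the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_tokens = inputs["input_ids"].shape[1]
    output = model.generate(
        **inputs,
        max_new_tokens=clamp_new_tokens(prompt_tokens, requested_new_tokens),
        do_sample=True,    # sampling gives the varied tones described above
        temperature=0.7,   # illustrative value, not a recommended setting
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][prompt_tokens:], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Write a short, playful greeting."))
```

Clamping `max_new_tokens` keeps the prompt and the completion together within the context window, which matters for long role-play histories.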