Undi95/Lumimaid-Magnum-v4-12B

Warm · Public · 12B parameters · FP8 · 32,768-token context · Dec 22, 2024 · Hugging Face
Overview

Undi95/Lumimaid-Magnum-v4-12B is a 12-billion-parameter language model created by Undi95 as a merge of the Lumimaid and Magnum v4 models. It also incorporates a finetune on Claude input, trained with a 16k context window and integrated through the DELLA merge method in Mergekit. The model supports a context length of 32,768 tokens.

Key Capabilities

  • Merged Architecture: Combines Lumimaid and Magnum v4, potentially offering a broader range of capabilities than either source model alone.
  • Claude Input Finetuning: Includes a finetune on Claude-style inputs, which may improve its ability to follow complex instructions and respond in a conversational style.
  • Extended Context Window: Supports a context length of 32,768 tokens, so longer prompts and longer generated texts fit in a single request.
  • Mistral Prompt Template: Uses the Mistral prompt template (<s>[INST] {input} [/INST] {output}</s>), so prompts written for Mistral-style instruction models can be reused.
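
The Mistral prompt template listed above can be filled in with plain string formatting. The following is a minimal sketch, not the author's tooling: the function name `format_turn` and the choice to stop right after `[/INST]` when no output is supplied are assumptions made for illustration. Many tokenizers add the `<s>` token automatically, in which case it should not be written into the text a second time.

```python
from typing import Optional

# Special tokens as they appear in the template on the model card.
BOS = "<s>"
EOS = "</s>"


def format_turn(user_input: str, output: Optional[str] = None) -> str:
    """Fill the template <s>[INST] {input} [/INST] {output}</s>.

    `format_turn` is a hypothetical helper. When `output` is None the
    string ends after [/INST], leaving the model to generate the reply
    (an assumption about how the template is used at inference time).
    """
    prompt = f"{BOS}[INST] {user_input} [/INST]"
    if output is None:
        return prompt
    return f"{prompt} {output}{EOS}"


if __name__ == "__main__":
    # Prompt awaiting a completion.
    print(format_turn("Summarize the plot of Hamlet."))
    # Fully formed turn, e.g. for showing a finished exchange.
    print(format_turn("Say hi.", "Hi there!"))
```

Keeping the template in one helper means the exact spacing around `[INST]` and `[/INST]` is defined in a single place.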

Good For

  • General Text Generation: Suitable for a range of text generation tasks, drawing on both merged source models and the instruction finetune.
  • Instruction Following: The Claude input finetuning suggests it may respond to prompts and instructions more reliably.
  • Long-form Content: The 32,768-token context length suits applications that need to read or generate long documents or conversations.
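
Because the prompt and the generated text share the same 32,768-token window, it can help to compute how many tokens remain for generation before sending a request. The sketch below assumes the prompt's token count is already known (for example from the model's tokenizer); the helper name `max_new_tokens` is hypothetical and not part of any library.

```python
# Context length stated on the model card.
CONTEXT_LENGTH = 32_768


def max_new_tokens(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt.

    Hypothetical helper: `prompt_tokens` is the number of tokens the
    prompt occupies. The result is clamped at zero when the prompt
    alone already fills or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


if __name__ == "__main__":
    # A 30,000-token document leaves room for 2,768 generated tokens.
    print(max_new_tokens(30_000))
```

A result of zero signals that the prompt has to be shortened or split before the model can produce any output.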