EldritchLabs/MagMalion-Twilight-12B-v1

Text Generation · Model Size: 12B · Quant: FP8 · Context Length: 32k · Published: Mar 14, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

EldritchLabs/MagMalion-Twilight-12B-v1 is a 12 billion parameter language model created by EldritchLabs, merged with the DELLA method from several base models, including IntervitensInc/Mistral-Nemo-Base-2407-chatml and PygmalionAI models. The model is partially censored, though the developers note it can be jailbroken or ablated if a use case requires it. It is optimized for conversational applications and requires the ChatML chat template for proper performance.


MagMalion Twilight 12B v1 Overview

MagMalion Twilight 12B v1 is a 12 billion parameter language model developed by EldritchLabs. It was constructed using the DELLA merge method from MergeKit, combining a diverse set of pre-trained models. The base model for this merge was IntervitensInc/Mistral-Nemo-Base-2407-chatml, integrated with other models such as GreenerPastures/Golden-Curry-12B, inflatebot/MN-12B-Mag-Mell-R1, Sao10K/MN-12B-Lyra-v2a1, ChaoticNeutrals/Mag-Mell-Reasoner-12B, and several models from Epiculous and PygmalionAI (Pygmalion-3-12B, Eleusis-12B).
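The exact merge recipe is not published here, but MergeKit merges of this kind are declared in a YAML configuration. The following is a hypothetical sketch only: the model names come from the list above, while the density and weight values are illustrative assumptions, not EldritchLabs' actual settings.

```yaml
# Hypothetical MergeKit config sketch for a DELLA merge (values are illustrative).
merge_method: della
base_model: IntervitensInc/Mistral-Nemo-Base-2407-chatml
models:
  - model: inflatebot/MN-12B-Mag-Mell-R1
    parameters:
      density: 0.5   # fraction of delta parameters retained (assumed)
      weight: 0.4    # contribution to the merged weights (assumed)
  - model: PygmalionAI/Pygmalion-3-12B
    parameters:
      density: 0.5
      weight: 0.3
dtype: bfloat16
```

A config like this would be run with MergeKit's merge tooling; the remaining source models listed above would each get their own entry under `models`.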

Key Characteristics

  • Merge Method: Built with the DELLA merge technique from MergeKit, which samples and rescales delta parameters when combining models.
  • Parameter Count: 12 billion parameters, balancing capability against inference cost.
  • Context Length: Supports a context window of 32768 tokens.
  • Chat Template: Specifically requires the ChatML chat template for proper interaction and performance.
  • Censorship: The model is partially censored, but the developers note it can be jailbroken or ablated if specific use cases require it.
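Because the model requires ChatML, each conversation turn must be wrapped in `<|im_start|>` / `<|im_end|>` markers before it is sent for generation. A minimal, library-free sketch of that formatting (most inference stacks apply the same template automatically via the tokenizer's chat template):

```python
def format_chatml(messages: list[dict[str, str]]) -> str:
    """Render a list of {'role', 'content'} messages as a ChatML prompt.

    Ends with an open assistant turn so the model continues from there.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

With Hugging Face Transformers, the equivalent result would normally come from `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` rather than hand-built strings.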

Ideal Use Cases

MagMalion Twilight 12B v1 is well-suited for applications requiring:

  • Conversational AI: Its design and ChatML requirement suggest strong performance in dialogue systems and chatbots.
  • Exploration of Model Merging: Developers interested in the results of complex model merges, particularly those using the DELLA method, will find this model valuable.
  • Customizable Content Generation: Given its partially censored nature and the mention of potential ablation, it could be adapted for various content generation tasks where control over output filtering is desired.