Undi95/UtopiaXL-13B

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4K · Published: Nov 4, 2023 · License: cc-by-nc-4.0 · Architecture: Transformer · Open Weights

Undi95/UtopiaXL-13B is a 13 billion parameter Llama2-based model created by Undi95, developed using the mergekit layer shuffle method. This model is a complex merge of numerous Llama2 fine-tunes and LoRAs, demonstrating the flexibility of the Llama2 architecture for combining diverse capabilities. It is optimized for roleplay and creative writing tasks, leveraging the strengths of its constituent models. The model utilizes an Alpaca prompt template and supports a 4096 token context length.


UtopiaXL-13B: A Layer-Shuffled Llama2 Merge

Undi95/UtopiaXL-13B is a 13 billion parameter model built upon the Llama2 architecture, created by Undi95. This model is a "proof of concept" merge, utilizing the novel layer shuffle method from mergekit to combine layers from an "absurd amount" of Llama2-based models and LoRAs. The creator highlights the flexibility of Llama2, the viability of clean layer-based merges without traditional methods like ties or SLERP, and the model's resilience to special token manipulation.
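The original merge was produced with a custom layer-shuffle script rather than a standard mergekit recipe, but the general idea of assembling a model from layer ranges of different donors can be sketched with mergekit's `slices`/`passthrough` syntax. The layer ranges and donor ordering below are illustrative assumptions, not the actual UtopiaXL recipe:

```yaml
# Illustrative sketch only: stacks layer ranges from two of the constituent
# models using mergekit's passthrough method. The real UtopiaXL merge
# interleaved layers from many more donors via a layer-shuffle script.
slices:
  - sources:
      - model: Undi95/Utopia-13B
        layer_range: [0, 24]
  - sources:
      - model: KoboldAI/LLaMA2-13B-Holodeck-1
        layer_range: [24, 40]
merge_method: passthrough
dtype: float16
```

In a passthrough merge no weights are averaged; each output layer is copied verbatim from exactly one donor, which is what makes "clean layer-based merges without ties or SLERP" possible.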

Key Characteristics

  • Architecture: Llama2-based, 13 billion parameters.
  • Merge Method: Employs mergekit's layer shuffle for combining diverse models.
  • Composition: Merges layers from multiple Llama2 fine-tunes including Utopia-13B, Holodeck-1, PsyMedRP-v1-13B, Pygmalion-2-13b, Cat-0.5, TiefighterLR, and Augmental-13b-two-epochs, along with Storytelling-v2.1-13B-lora and LimaRP-UtopiaXL-13B-v3-lora.
  • Prompt Template: Uses the Alpaca instruction format.
  • Context Length: Supports a 4096 token context.
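Since the model expects Alpaca-formatted prompts, requests should be wrapped in the standard Alpaca template before generation. A minimal helper (the function name `build_alpaca_prompt` is ours, not part of any library):

```python
def build_alpaca_prompt(instruction: str, user_input: str = "") -> str:
    """Wrap a request in the standard Alpaca instruction template."""
    if user_input:
        # Variant with an additional input/context block.
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{user_input}\n\n"
            "### Response:\n"
        )
    # Instruction-only variant.
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_alpaca_prompt("Write a short scene set on a rain-soaked space station.")
```

The model's completion then follows the trailing `### Response:` marker; keep the full formatted prompt plus response within the 4096-token window.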

Intended Use Cases

This model is particularly suited for applications requiring:

  • Roleplay and Creative Writing: The inclusion of models like PygmalionAI/pygmalion-2-13b, PsyMedRP-v1-13B, and specific LoRAs suggests a strong focus on generating engaging and coherent narrative content.
  • Exploration of Merge Techniques: Developers interested in advanced model merging strategies, especially the layer shuffle method, can study this model as a practical example.
  • Flexible Llama2 Applications: Demonstrates the adaptability of the Llama2 base model for various fine-tuning objectives.

Popular Sampler Settings

Featherless users' most common sampler configurations for this model adjust the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
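These parameters map directly onto the request body of an OpenAI-compatible completions endpoint. The values below are placeholders for illustration only, not the actual user configurations:

```python
# Sketch of a completions request body with the sampler parameters above.
# All numeric values are illustrative assumptions, not recommended settings.
payload = {
    "model": "Undi95/UtopiaXL-13B",
    "prompt": "### Instruction:\nContinue the story.\n\n### Response:\n",
    "max_tokens": 512,
    "temperature": 0.9,          # placeholder
    "top_p": 0.95,               # placeholder
    "top_k": 40,                 # placeholder; extension beyond the OpenAI spec
    "frequency_penalty": 0.0,    # placeholder
    "presence_penalty": 0.0,     # placeholder
    "repetition_penalty": 1.1,   # placeholder; extension beyond the OpenAI spec
    "min_p": 0.05,               # placeholder; extension beyond the OpenAI spec
}
```

Note that `top_k`, `repetition_penalty`, and `min_p` are not part of the core OpenAI API and are only honored by servers that implement these sampler extensions.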