Undi95/Utopia-13B

Text generation · Concurrency cost: 1 · Model size: 13B · Quantization: FP8 · Context length: 4k · Published: Nov 1, 2023 · License: cc-by-nc-4.0 · Architecture: Transformer

Utopia-13B is a 13 billion parameter language model by Undi95, created with the task_arithmetic merge method from MergeKit. It combines several specialized 13B models and LoRAs, including Xwin-LM, Nethena, Pygmalion-2, and storytelling- and roleplay-focused LoRAs, aiming to inherit the strengths of its constituent models and provide a versatile base for generative applications within a 4096-token context window.


Overview

Undi95/Utopia-13B results from a task_arithmetic merge performed with MergeKit. The merge folds several distinct 13B models and LoRAs into a single checkpoint, aiming for a versatile, robust foundation for generative tasks.

Key Components & Merge Strategy

Utopia-13B is built upon a Llama-2-13B base model and integrates the following:

  • Merged models: Xwin-LM/Xwin-LM-13B-V0.2, NeverSleep/Nethena-13B, and PygmalionAI/pygmalion-2-13b.
  • LoRAs: Undi95/Storytelling-v2.1-13B-lora, zattio770/120-Days-of-LORA-v2-13B, and lemonilia/LimaRP-Llama2-13B-v3-EXPERIMENT.

The task_arithmetic method integrates these components as weighted parameter deltas on top of the base, with distinct weights applied to the different merged parts (e.g., newpart1 at 1.0, newpart2 at 0.45, newpart3 at 0.33), as sketched below.
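Conceptually, task arithmetic adds scaled "task vectors" (each component's parameter delta from the shared base) onto the base model. The Python sketch below illustrates the idea on raw state dicts; the grouping into three parts and the weights mirror the figures quoted above, but the actual MergeKit recipe for Utopia-13B may differ, so treat this as an assumption-laden illustration rather than the exact merge config.

```python
import torch

def task_arithmetic_merge(base_sd, part_sds, weights):
    """Illustrative task-arithmetic merge over PyTorch state dicts.

    Each part contributes weight * (part - base), i.e. a scaled
    task vector added on top of the shared base parameters.
    Assumes all state dicts share the same keys and shapes.
    """
    merged = {name: p.clone() for name, p in base_sd.items()}
    for sd, w in zip(part_sds, weights):
        for name, param in sd.items():
            merged[name] += w * (param - base_sd[name])
    return merged

# Hypothetical grouping; the real recipe lives in the MergeKit config:
# merged = task_arithmetic_merge(
#     base_sd=llama2_13b_sd,                          # Llama-2-13B base
#     part_sds=[newpart1_sd, newpart2_sd, newpart3_sd],
#     weights=[1.0, 0.45, 0.33],
# )
```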

Prompt Format

The model uses the Alpaca prompt template:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```
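A minimal Python helper that renders this template for a single-turn instruction (the function name is illustrative, not part of the model card):

```python
def build_alpaca_prompt(instruction: str) -> str:
    """Render a single-turn Alpaca-style prompt for Utopia-13B."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )
```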

Potential Use Cases

Given its diverse constituent models and specialized LoRAs, Utopia-13B is likely well-suited for applications requiring:

  • Creative text generation: Drawing from storytelling and roleplay-focused components (see the sketch after this list).
  • General instruction following: Benefiting from the instruction-tuned nature of its base models.
  • Exploration of merged model capabilities: Providing a strong base for further fine-tuning or specific task adaptation.
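As a concrete illustration of the first two use cases, the sketch below loads the model with Hugging Face transformers and samples a completion. The prompt follows the Alpaca template above; the sampling values are illustrative defaults for creative generation, not recommendations from the model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Undi95/Utopia-13B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Alpaca-format prompt, per the template above.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "Write a short scene set in an abandoned lighthouse.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.8,  # illustrative creative-sampling values
    top_p=0.95,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```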