Undi95/ReMM-Lion-13B

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · License: cc-by-nc-4.0 · Architecture: Transformer · Open Weights · Cold

Undi95/ReMM-Lion-13B is a 13 billion parameter language model created by Undi95, built by merging ReMM (SLERP) with Pygmalion-2. The model is the product of a multi-stage merging process that combines several Llama-2-based models, including Chronos-Beluga, Airoboros, Nous-Hermes, and Huginn. It aims to integrate the strengths of its constituent models, with particular emphasis on the capabilities contributed by Pygmalion-2. The model uses the Alpaca prompt template and has a context length of 4096 tokens.


Undi95/ReMM-Lion-13B: A Merged 13B Language Model

Undi95/ReMM-Lion-13B is a 13 billion parameter language model developed by Undi95, created through a multi-stage merging process based on SLERP (spherical linear interpolation). It combines a recreated ReMM merge with Pygmalion-2, aiming to integrate their respective strengths.
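If the weights are published on the Hugging Face Hub under the same identifier, the model should load like any other Llama-2 derivative through the transformers library. A minimal sketch, assuming a GPU with enough memory for 13B parameters in fp16 (the FP8 quantization noted above is specific to this hosted deployment):

```python
# Minimal loading sketch for a Llama-2-derived 13B checkpoint.
# Assumes the weights are available as "Undi95/ReMM-Lion-13B" on the Hub
# and that the `accelerate` package is installed for device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Undi95/ReMM-Lion-13B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~26 GB in fp16; use quantization on smaller GPUs
    device_map="auto",          # spread layers across available devices
)
```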

Key Merging Process & Constituent Models

The development of ReMM-Lion-13B involved several merging steps, starting from a Llama-2-13B base; a minimal SLERP sketch follows the list. The process included:

  • ReML Recreation: Initially, a ReML-L2-13B model was constructed by merging The-Face-Of-Goonery/Chronos-Beluga-v2-13bfp16, jondurbin/airoboros-l2-13b-2.1, and NousResearch/Nous-Hermes-Llama2-13b.
  • ReMM Recreation: This ReML model was then merged with The-Face-Of-Goonery/Huginn-13b-v1.2 to form the ReMM component.
  • Final ReMM-Lion Merge: The final ReMM-Lion-13B was created by merging the ReMM component with PygmalionAI/pygmalion-2-13b.
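Each step above merges two same-architecture checkpoints parameter by parameter. The sketch below illustrates only the core SLERP operation; it is not Undi95's exact recipe, and the helper names `slerp` and `merge_state_dicts` are hypothetical (community merges typically go through a tool such as mergekit, which also handles per-layer weights and tokenizer alignment):

```python
# Illustrative sketch of SLERP-style weight merging, not a production merge tool.
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between two same-shaped weight tensors at fraction t."""
    a, b = v0.flatten().float(), v1.flatten().float()
    # Angle between the two tensors, measured on their normalized copies.
    cos_theta = torch.clamp((a / (a.norm() + eps)) @ (b / (b.norm() + eps)), -1.0, 1.0)
    theta = torch.arccos(cos_theta)
    if theta < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        merged = (1 - t) * a + t * b
    else:
        sin_theta = torch.sin(theta)
        merged = (torch.sin((1 - t) * theta) / sin_theta) * a \
               + (torch.sin(t * theta) / sin_theta) * b
    return merged.reshape(v0.shape).to(v0.dtype)

def merge_state_dicts(sd_a: dict, sd_b: dict, t: float = 0.5) -> dict:
    """Merge two checkpoints with identical keys and shapes, parameter by parameter."""
    return {name: slerp(t, sd_a[name], sd_b[name]) for name in sd_a}
```

Since SLERP is defined between two points, a three-model step like the ReML recreation would be realized as successive pairwise merges.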

Prompt Template

The model utilizes the Alpaca prompt template for instruction following:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```
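Applied in code, this means wrapping each instruction in the template before tokenization. A minimal sketch, reusing the `tokenizer` and `model` from the loading example above (the constant name `ALPACA_TEMPLATE` and the sample instruction are illustrative):

```python
# Format an instruction with the Alpaca template, then generate a response.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)

text = ALPACA_TEMPLATE.format(prompt="List three uses of spherical interpolation.")
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```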

Good For

  • Experimentation with merged models: Ideal for users interested in exploring the capabilities of models created through complex merging techniques like SLERP.
  • Applications requiring Pygmalion-2's characteristics: Given its inclusion of Pygmalion-2, it may be suitable for tasks where that model's specific fine-tuning is beneficial.
  • Llama-2 ecosystem users: As a Llama-2 derivative, it is compatible with the tooling and workflows built around that model family.