CalderaAI/13B-Thorns-l2

Text Generation · Model Size: 13B · Quant: FP8 · Context Length: 4K · Published: Sep 6, 2023 · Architecture: Transformer · Concurrency Cost: 1

CalderaAI/13B-Thorns-l2 is a 13-billion-parameter, instruction-tuned, LLaMAv2-based ensemble model developed by CalderaAI, built with a Spherical Linear Interpolation (SLERP) merge method. The model combines distinct 'logic' and 'creativity' segments into a coherent fusion of capabilities: it is optimized for instruction following while balancing logical reasoning with imaginative output, making it suitable for diverse conversational and creative applications.


CalderaAI/13B-Thorns-l2: An Ensemble LLaMAv2-13B Model

CalderaAI/13B-Thorns-l2 is a 13-billion-parameter, instruction-tuned model built upon the LLaMAv2 architecture. Its core innovation is a merge method based on Spherical Linear Interpolation (SLERP), which interpolates the constituent models' weights along a hypersphere rather than along a straight line, yielding smoother transitions between their feature spaces. This approach aims to create a more coherent fusion of the unique strengths of the various base models.
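SLERP itself is a standard interpolation formula. The sketch below (assuming PyTorch) shows how it can be applied per weight tensor; it is illustrative only and is not CalderaAI's actual merge script.

```python
import torch

def slerp(w_a: torch.Tensor, w_b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors of the same shape."""
    a = w_a.flatten().float()
    b = w_b.flatten().float()
    a_unit = a / (a.norm() + eps)
    b_unit = b / (b.norm() + eps)
    # Angle between the two weight vectors on the unit hypersphere.
    omega = torch.arccos(torch.clamp(torch.dot(a_unit, b_unit), -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly colinear vectors: fall back to ordinary linear interpolation.
        merged = (1.0 - t) * a + t * b
    else:
        # Weighted combination along the arc between the two vectors.
        merged = (torch.sin((1.0 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b
    return merged.reshape(w_a.shape).to(w_a.dtype)
```

Applied tensor-by-tensor across two constituent checkpoints (e.g. with t = 0.5), this produces a blended set of weights that follows the arc between the two models rather than the straight line used by plain averaging.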

Key Design & Capabilities

The model's design is based on a concept of 'purposed segmentation', combining distinct 'Logic' and 'Creativity' segments:

  • Logic Segment (MK2): Composed of a quad-merge of fine-tuned parent models (NousHermes, Chronos, Platypus, Airoboros), selected for performance, dataset quality, and instruction following, with specific LoRAs (Kimiko, Janine) strategically fused.
  • Creativity Segment (MK2): Integrates KoboldAI's Holodeck model, known for its extensive and organized dataset, further enhanced with the LIMA RP PEFT for extended variety in creative and roleplay scenarios.

This ensemble merge strategy, and in particular fusing each LoRA into the parent model that benefits from it most, has proven exceptionally effective in balancing logical coherence with imaginative output. All models and adapters used are LLaMAv2-13B based.
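Fusing a LoRA into its parent model before merging can be done with the Hugging Face peft library. The sketch below outlines that general workflow under stated assumptions; the repository IDs are placeholders, not the exact artifacts CalderaAI used.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load a LLaMAv2-13B parent model (placeholder repo id).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")

# Attach a LoRA adapter to the parent that benefits from it most (hypothetical repo id).
lora_model = PeftModel.from_pretrained(base, "example-org/kimiko-lora-13b")

# Bake the adapter weights into the base weights so the result can then be
# merged (e.g. via SLERP) with the other segment models.
fused = lora_model.merge_and_unload()
fused.save_pretrained("logic-segment-parent-with-lora")
```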

Intended Use

This model is presented as a research artifact, intended for responsible use in research or entertainment. It is uncensored and may generate offensive or misleading content; users are cautioned against treating its output as fact or advice.