KaraKaraWarehouse/PygKiCOTlion

Text generation · Model size: 13B · Quantization: FP8 · Context length: 4k · Architecture: Transformer

PygKiCOTlion is a 13 billion parameter decoder-only language model developed by KaraKaraWitch, created as a LoRA merge of Pygmalion-2-13b-SuperCOT and Kimiko v2. The merge combines the chain-of-thought reasoning style of SuperCOT with the conversational tone of Kimiko v2, and the model accepts both the Metharme and Alpaca instruction formats, making it adaptable to a range of dialogue-based tasks. Its distinguishing trait is the experimental merge itself, which aims for conversational behavior distinct from either parent model.


PygKiCOTlion: A Merged 13B Language Model

PygKiCOTlion is a 13 billion parameter decoder-only language model, developed by KaraKaraWitch through a LoRA merge of existing models. It combines elements from Pygmalion-2-13b-SuperCOT and Kimiko v2, aiming to leverage the strengths of both. The underlying base model is LLaMA2.

Key Characteristics

  • Merged Architecture: Created by merging the SuperCOT LoRA (from kaiokendev) and the Kimiko v2 LoRA (from nRuaif) onto the Pygmalion-2-13b-SuperCOT base (from kingbri); a sketch of the general merge workflow follows this list.
  • Instruction Formats: Supports both Metharme and Alpaca instruction formats, providing flexibility for integration into different systems (a prompting sketch also follows this list).
    • Metharme: <|system|>Your system prompt goes here.<|user|>Are you alive?<|model|>
    • Alpaca: ### Instruction:\nYour instruction or question here.\n### Response:
  • Experimental Nature: The model is noted as an experimental merge; testing feedback indicates a tendency toward repetitive, looping output at lower temperatures and more adventurous, less constrained responses at higher temperatures.
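For readers curious about how such a merge is produced mechanically, the sketch below shows a general PEFT workflow for folding a LoRA adapter into base weights. The file paths are hypothetical placeholders, and this illustrates the general technique rather than the exact recipe used to build PygKiCOTlion.

```python
# Minimal sketch of merging a LoRA adapter into a base model with PEFT.
# Paths are hypothetical placeholders; this shows the general technique,
# not the exact recipe used to build PygKiCOTlion.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the merge base (here, a local copy of Pygmalion-2-13b-SuperCOT).
base = AutoModelForCausalLM.from_pretrained(
    "path/to/Pygmalion-2-13b-SuperCOT",
    torch_dtype=torch.float16,
)

# Attach the Kimiko v2 LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(base, "path/to/kimiko-v2-lora")

# Fold the adapter deltas into the base weights and drop the PEFT wrapper,
# leaving a plain transformers model that can be saved and shipped.
model = model.merge_and_unload()
model.save_pretrained("PygKiCOTlion")
```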
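To make the two prompt formats concrete, here is a hedged sketch of loading the model with Hugging Face transformers and building a prompt in each style. The repo ID `KaraKaraWitch/PygKiCOTlion` is assumed from the model name and should be verified before use.

```python
# Hedged sketch: prompting the model in the Metharme and Alpaca formats.
# The repo ID below is assumed from the model name, not verified.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KaraKaraWitch/PygKiCOTlion"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Metharme format: special role tokens delimit system, user, and model turns.
metharme_prompt = (
    "<|system|>You are a helpful conversational partner."
    "<|user|>Are you alive?"
    "<|model|>"
)

# Alpaca format: plain-text instruction/response headers.
alpaca_prompt = "### Instruction:\nAre you alive?\n### Response:\n"

inputs = tokenizer(metharme_prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```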

Licensing

The model's licensing is a composite of its components' licenses:

  • PygKiCOTlion: LLaMA2
  • SuperCOT: MIT
  • Kimiko v2: CC BY-NC-SA (?)

Usage Considerations

Users should be aware that the model's behavior can be idiosyncratic: outputs tend to become repetitive at lower temperatures, while higher temperatures yield more varied responses at the cost of weaker adherence to instructions. The model is best suited to experimental use cases where unconventional conversational dynamics are desired.
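As a starting point, the snippet below (continuing the loading sketch above, which defines `model` and `inputs`) pairs a moderate sampling temperature with a mild repetition penalty to counter the reported loopiness. The specific values are illustrative guesses, not tuned recommendations.

```python
from transformers import GenerationConfig

# Illustrative sampling settings; values are guesses, not tuned advice.
gen_config = GenerationConfig(
    do_sample=True,
    temperature=0.9,         # low temperatures reportedly tend to loop
    top_p=0.9,
    repetition_penalty=1.1,  # mild penalty against repetitive output
    max_new_tokens=256,
)
output = model.generate(**inputs, generation_config=gen_config)
```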