TehVenom/Pygmalion-13b-Merged

Text Generation | Concurrency Cost: 1 | Model Size: 13B | Quant: FP8 | Ctx Length: 4k | Published: May 18, 2023 | Architecture: Transformer

Pygmalion-13b-Merged by TehVenom is a 13 billion parameter conversational language model based on Meta's LLaMA architecture, fine-tuned specifically for fictional dialogue and character-based interactions. This model excels at generating engaging and contextually relevant responses within a defined persona and chat format, making it suitable for entertainment-focused conversational AI applications. It has a context length of 4096 tokens and is designed to handle persona-driven chat histories effectively.


Pygmalion-13b-Merged: A Conversational LLaMA Fine-tune

Pygmalion-13b-Merged is a 13 billion parameter dialogue model built on Meta's LLaMA-13b. Developed by TehVenom, it is fine-tuned specifically for fictional conversations and character-based interactions, using a subset of the data from the Pygmalion-6B-v8-pt4 project.

Key Capabilities

  • Persona-Driven Dialogue: Designed to adopt and maintain a specified character's persona throughout a conversation.
  • Contextual Chat: Utilizes a sliding window of chat history to ensure coherent and contextually relevant responses.
  • Standard Pygmalion Formatting: Compatible with common UI formats for persona and chat. The model expects a specific input structure: a [CHARACTER]'s Persona line, a <START> delimiter, and then the [DIALOGUE HISTORY].
  • Pre-applied Weights: The XOR files from PygmalionAI's original release are already applied, so the weights can be loaded directly without a separate merge step.
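
The persona format and the sliding window of chat history described above can be sketched roughly as follows. This is a minimal illustration, not the model's or any UI's actual implementation. The function names build_prompt and count_tokens are hypothetical, the "You:" prefix for user turns is an assumption about the chat convention, and count_tokens is a whitespace word count standing in for the model's real tokenizer.

```python
from typing import List, Tuple


def count_tokens(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def build_prompt(
    character: str,
    persona: str,
    history: List[Tuple[str, str]],
    user_message: str,
    max_tokens: int = 4096,
) -> str:
    """Assemble a persona prompt, dropping the oldest turns that do not fit the budget."""
    # Fixed parts of the prompt: the persona line, the <START> delimiter,
    # and the newest user message followed by the character's reply cue.
    header = f"{character}'s Persona: {persona}\n<START>\n"
    footer = f"You: {user_message}\n{character}:"  # "You:" prefix is an assumption

    # Whatever is left of the budget is available for dialogue history.
    budget = max_tokens - count_tokens(header) - count_tokens(footer)

    kept: List[str] = []
    # Sliding window: walk from the newest turn to the oldest and stop
    # at the first turn that no longer fits.
    for speaker, text in reversed(history):
        line = f"{speaker}: {text}\n"
        cost = count_tokens(line)
        if cost > budget:
            break
        kept.append(line)
        budget -= cost

    kept.reverse()  # restore chronological order
    return header + "".join(kept) + footer


if __name__ == "__main__":
    demo_history = [("You", "Hello there"), ("Aria", "Hi, traveler")]
    print(build_prompt("Aria", "A cheerful guide.", demo_history, "Where are we?"))
```

In practice the word count would be replaced by the tokenizer that ships with the model, and some headroom would be reserved in max_tokens for the generated reply.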

Use Cases and Limitations

This model is primarily intended for fictional conversation for entertainment purposes. It is not fine-tuned for safety or factual accuracy. Users should be aware that due to its training data, which includes profanity and potentially offensive texts, the model may produce socially unacceptable or factually incorrect outputs. Its performance on standard benchmarks like Wikitext2, Ptb-New, and C4-New is provided for technical evaluation, but its core strength lies in its specialized conversational abilities rather than general language understanding or factual recall.