PygmalionAI/Pygmalion-3-12B

Hugging Face
TEXT GENERATION · Concurrency cost: 1 · Model size: 12B · Quant: FP8 · Context length: 32k · Published: Oct 30, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

PygmalionAI/Pygmalion-3-12B is a 12 billion parameter language model developed by PygmalionAI, built upon Mistral's Nemo base architecture. Fine-tuned with hundreds of millions of tokens of conversations, creative writing, and instructions, it is specifically optimized for roleplaying and immersive fictional writing tasks. The model supports a 32768 token context length and is released under the Apache 2.0 license.


Pygmalion-3 12B: A Dedicated Roleplaying Model

PygmalionAI's Pygmalion-3 12B is a 12 billion parameter language model built on Mistral's Nemo base. It has been extensively fine-tuned using hundreds of millions of tokens from conversations, creative writing, and instructions, including the PIPPA dataset and roleplaying forums. This model is specifically designed and optimized for roleplaying and immersive fictional writing.

Key Capabilities & Features

  • Specialized Roleplaying: Engineered to generate detailed, creative, and immersive responses for character-driven scenarios.
  • ChatML Format: Utilizes the standard ChatML format for prompting, ensuring compatibility and ease of use.
  • Flexible Prompting: Supports the "Enter X mode" style prompts from previous Pygmalion models, and users are encouraged to experiment with custom system prompts.
  • Open-Source License: Released under the permissive Apache 2.0 license, fostering community development and usage.
  • Context Length: Supports a substantial 32768 token context window.
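Since the card specifies ChatML prompting, here is a minimal sketch of how a ChatML prompt can be assembled by hand. The system prompt and character name are hypothetical examples; in practice you may prefer your framework's chat-templating utilities instead of string building.

```python
def build_chatml_prompt(system, turns):
    """Assemble a ChatML-formatted prompt.

    `turns` is a list of (role, content) pairs, e.g. ("user", "...").
    The trailing assistant header cues the model to respond.
    """
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, content in turns:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

# Hypothetical roleplay setup for illustration.
prompt = build_chatml_prompt(
    "Enter roleplay mode. You are playing the character Aster.",
    [("user", "The tavern door creaks open.")],
)
print(prompt)
```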

Important Considerations

  • Intended Use: Primarily for fictional writing and entertainment; not fine-tuned for safety or factual accuracy.
  • Potential for Undesirable Output: Due to training data, it may produce socially unacceptable, lewd, or offensive text.
  • Known Token Issue: Users are advised to add a custom token ban for `<|im_end|>` and `<` to avoid reported issues.
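One way to implement such a token ban is to look up the ids of the offending tokens in the tokenizer's vocabulary and pass them to your backend's ban/suppress option. A minimal sketch, using a stub vocabulary in place of the real tokenizer's (the `get_vocab()`-style mapping and the stub ids are assumptions for illustration):

```python
def build_token_ban_list(vocab, banned_strings=("<|im_end|>", "<")):
    """Collect token ids whose surface form exactly matches a banned string.

    `vocab` maps token string -> id, like `tokenizer.get_vocab()` in
    Hugging Face tokenizers; adapt the lookup to your inference stack.
    """
    return sorted(vocab[tok] for tok in banned_strings if tok in vocab)

# Stub vocabulary standing in for the real tokenizer's (ids are made up).
vocab = {"<|im_start|>": 1, "<|im_end|>": 2, "<": 3, "hello": 4}
ban_ids = build_token_ban_list(vocab)
print(ban_ids)  # → [2, 3]
```

In Hugging Face `transformers`, the resulting ids could then be supplied as `bad_words_ids=[[i] for i in ban_ids]` to `model.generate`; other backends expose equivalent token-ban or logit-bias settings.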

This model was trained as a rank-32 LoRA adapter over one epoch using 8x NVIDIA A40 GPUs, employing a cosine learning rate scheduler and DeepSpeed ZeRO for efficiency.
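As a rough illustration of this setup, a rank-32 adapter in Hugging Face `peft` might be configured as below. Only `r=32` comes from the card; the alpha, dropout, and target modules are assumptions, since the card does not list the full training recipe.

```python
from peft import LoraConfig

# Rank-32 adapter as stated on the card; lora_alpha, lora_dropout, and
# target_modules are illustrative assumptions, not the actual recipe.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

The cosine schedule and DeepSpeed ZeRO mentioned above would correspond to `lr_scheduler_type="cosine"` and a `deepspeed` config path in the accompanying `TrainingArguments`.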

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model tune the following sampler parameters:

  • `temperature`
  • `top_p`
  • `top_k`
  • `frequency_penalty`
  • `presence_penalty`
  • `repetition_penalty`
  • `min_p`
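To make the first three of these parameters concrete, here is a toy, pure-Python reimplementation of the standard temperature / top-k / top-p sampling chain over a small logit table (the logit values are made up; real backends apply the same filters over the full vocabulary, and the penalty parameters above are not shown here):

```python
import math
import random

def sample_next(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Draw a token id from `logits` (id -> raw score).

    temperature scales logits (lower => sharper distribution),
    top_k keeps only the k most probable tokens (0 disables),
    top_p keeps the smallest prefix whose cumulative mass reaches p.
    """
    scaled = [(tid, score / temperature) for tid, score in logits.items()]
    # Softmax with max-subtraction for numerical stability.
    m = max(s for _, s in scaled)
    exps = [(tid, math.exp(s - m)) for tid, s in scaled]
    z = sum(e for _, e in exps)
    probs = sorted(((tid, e / z) for tid, e in exps), key=lambda x: -x[1])
    if top_k > 0:
        probs = probs[:top_k]
    kept, cum = [], 0.0
    for tid, p in probs:
        kept.append((tid, p))
        cum += p
        if cum >= top_p:
            break
    # Renormalize the surviving tokens and draw one.
    z = sum(p for _, p in kept)
    r, acc = rng.random() * z, 0.0
    for tid, p in kept:
        acc += p
        if acc >= r:
            return tid
    return kept[-1][0]

# Toy 4-token "vocabulary" with made-up logits.
logits = {0: 2.0, 1: 1.0, 2: 0.2, 3: -1.0}
tok = sample_next(logits, temperature=0.8, top_k=3, top_p=0.9)
```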