Black-Ink-Guild/Pernicious_Prophecy_70B

Text generation · Model size: 70B · Quantization: FP8 · Context length: 32k · Published: Feb 3, 2025 · License: llama3.3 · Architecture: Transformer · Concurrency cost: 4

Pernicious Prophecy 70B is a 70-billion-parameter, Llama-3.3-based model developed by Black Ink Guild through a two-step merge-and-fine-tune process. It is designed for uncensored roleplay, assistant tasks, and general use, with a 32,768-token context length. The model is tuned for varied, long-form storytelling and complex responses, with specific optimizations to reduce refusals and positivity bias.


Pernicious Prophecy 70B Overview

Pernicious Prophecy 70B, developed by Black Ink Guild (SicariusSicariiStuff and invisietch), is a 70-billion-parameter model built on the Llama-3.3 architecture. It was created in two steps: a merge of four high-quality Llama-3-based models, followed by a targeted supervised fine-tuning (SFT) pass.

Key Capabilities & Features

  • Uncensored Outputs: Specifically designed to produce uncensored content, addressing refusals and positivity bias found in other models.
  • Versatile Use Cases: Optimized for roleplay, general assistant tasks, and diverse applications.
  • Extended Context Length: Tested and stable with a context length of up to 32,768 tokens, making it suitable for long-form interactions.
  • Enhanced Storytelling: Incorporates models known for exceptional formatting, roleplay performance, and long-form storytelling capabilities.
  • Prompt Sensitivity: Highly responsive to detailed system prompts, allowing for precise control over output style and content.
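Because the card stresses that output style is steered through detailed system prompts in the Llama-3 Instruct format, here is a minimal sketch of hand-building such a prompt. The special tokens follow the standard Llama-3 Instruct template; the example system prompt itself is purely illustrative.

```python
# Sketch: hand-building a Llama-3 Instruct prompt with a detailed system
# prompt. The special tokens below are the standard Llama-3 Instruct
# template; the system-prompt text is an illustrative example only.
def build_llama3_prompt(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# A detailed system prompt: the card notes the model rewards this kind
# of explicit instruction about style, tense, and formatting.
system_prompt = (
    "You are a narrator for a long-form fantasy story. "
    "Write in third person past tense, with rich formatting "
    "and no more than three paragraphs per reply."
)
prompt = build_llama3_prompt(system_prompt, "Begin the opening scene.")
```

In practice, `tokenizer.apply_chat_template` in `transformers` produces the same layout from a messages list; building it by hand just makes the token structure explicit.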

Development Process

The model's creation involved:

  1. Merge Step: A model_stock merge combining SicariusSicariiStuff/Negative_LLAMA_70B (for low censorship and reduced bias), invisietch/L3.1-70Blivion-v0.1-rc1-70B (for formatting and intelligence), EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1 (for varied, long-form storytelling), and aaditya/Llama3-OpenBioLLM-70B (for anatomical understanding and reasoning).
  2. Fine-tuning Step: A QLoRA-based fine-tune on 2× NVIDIA RTX A6000 GPUs using a curated 18-million-token dataset to refine the merge and address specific issues.
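For readers unfamiliar with `model_stock` merges, the merge step above could be expressed as a mergekit configuration along these lines. This is a hypothetical sketch: Black Ink Guild's actual config is not published in this card, and the `base_model` choice and `dtype` here are assumptions.

```yaml
# Hypothetical mergekit config sketch -- NOT the published Black Ink
# Guild configuration. base_model and dtype are assumptions.
merge_method: model_stock
base_model: meta-llama/Llama-3.3-70B-Instruct  # assumed base
models:
  - model: SicariusSicariiStuff/Negative_LLAMA_70B
  - model: invisietch/L3.1-70Blivion-v0.1-rc1-70B
  - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
  - model: aaditya/Llama3-OpenBioLLM-70B
dtype: bfloat16
```

The `model_stock` method weights each donor model by its geometric relationship to the base, which is why a base model must be named alongside the four donors.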

Recommended Settings

Use the Llama-3 Instruct prompt format. Suggested sampler settings: Temperature 0.9-1.1, Min P 0.06-0.12, Repetition Penalty 1.07-1.09, and Repetition Penalty Range 1,536. The model is sensitive to prompting; the authors provide specific tips for formatting and system prompts.
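The recommended settings above can be captured as a plain dict of midpoint values. The parameter names match common inference-API conventions, but note that `repetition_penalty_range` is backend-specific (e.g. text-generation-webui-style samplers) and is not supported by every serving stack; the midpoints chosen here are an illustrative reading of the card's ranges.

```python
# Midpoints of the card's suggested sampler ranges. Names follow common
# inference-API conventions; "repetition_penalty_range" is backend-
# specific and not available everywhere.
recommended = {
    "temperature": 1.0,          # suggested range 0.9-1.1
    "min_p": 0.09,               # suggested range 0.06-0.12
    "repetition_penalty": 1.08,  # suggested range 1.07-1.09
    "repetition_penalty_range": 1536,
}

def within(value: float, lo: float, hi: float) -> bool:
    """Check that a chosen value falls inside the card's suggested range."""
    return lo <= value <= hi
```

A helper like `within` is handy for validating user-supplied overrides against the card's ranges before sending a request.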
