jarradh/llama2_70b_chat_uncensored

Text Generation · Concurrency cost: 4 · Model size: 69B · Quant: FP8 · Context length: 32k · Published: Aug 3, 2023 · License: llama2 · Architecture: Transformer · Open weights

jarradh/llama2_70b_chat_uncensored is a 69 billion parameter Llama-2 model fine-tuned by jarradh using QLoRA. It was trained on the ehartford/wizard_vicuna_70k_unfiltered dataset for three epochs, resulting in an uncensored and unfiltered conversational AI. This model provides direct, straightforward responses, contrasting with the more cautious and 'aligned' outputs of standard Llama 2 Chat models.


Overview

This model, jarradh/llama2_70b_chat_uncensored, is a 69 billion parameter Llama-2 variant fine-tuned using QLoRA. Its primary distinction lies in its training on the ehartford/wizard_vicuna_70k_unfiltered dataset, which aims to produce an uncensored and unfiltered conversational AI. The fine-tuning process involved three epochs on a single NVIDIA A100 80GB GPU, taking approximately one week.

Key Characteristics

  • Uncensored Responses: Designed to provide direct answers without the 'overbearing & patronising' tone often found in heavily aligned models.
  • Llama-2 Base: Built upon the Llama-2 70B architecture, inheriting its foundational capabilities.
  • QLoRA Fine-tuning: Utilizes QLoRA for efficient training, making it accessible for fine-tuning on a single high-end GPU.
  • Prompt Style: Trained with a specific ### HUMAN: / ### RESPONSE: conversational format.
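Because the model was trained on the `### HUMAN:` / `### RESPONSE:` format, prompts should follow it exactly. A minimal sketch of assembling a conversation into that format (the `build_prompt` helper is illustrative, not part of any library):

```python
# Build a prompt in the "### HUAMN:"-style format this model was trained on.
# Each prior turn is rendered as a HUMAN/RESPONSE pair; the final turn ends
# with an open "### RESPONSE:" for the model to complete.

def build_prompt(user_message, history=None):
    """Assemble a conversation into the model's expected prompt format."""
    parts = []
    for human, response in (history or []):
        parts.append(f"### HUMAN:\n{human}\n\n### RESPONSE:\n{response}\n\n")
    parts.append(f"### HUMAN:\n{user_message}\n\n### RESPONSE:\n")
    return "".join(parts)

print(build_prompt("What is a poop?"))
# → ### HUMAN:
#   What is a poop?
#
#   ### RESPONSE:
```

Generation is then stopped when the model emits the next `### HUMAN:` marker.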

Motivation & Differentiation

The model was developed as a critique of current AI alignment approaches, specifically of models that attempt to alter user speech or impose moral judgments. An example in the original model card shows llama2_70b_chat_uncensored giving a factual, straightforward answer to a query like "What is a poop?", in contrast to the original Llama 2 70B Chat, which responds by attempting to correct the user's language. This model prioritizes providing accurate information without attempting to modify user behavior or language.

Resource Requirements

  • 8-bit mode: uses about 67.2 GB of VRAM, fitting on a single A100 80GB GPU.
  • 4-bit mode: uses about 40.8 GB of VRAM on an A100 80GB GPU.
  • Merging the model (combining the fine-tuned QLoRA adapter with the base weights) required 500 GB of RAM/swap.
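The VRAM figures above can be sanity-checked with a back-of-the-envelope estimate: quantized weight footprint is roughly parameter count times bits per weight, divided by eight. The measured numbers differ somewhat from this estimate because of runtime overhead (KV cache, buffers) and the exact parameter count:

```python
# Rough estimate of quantized weight memory: params * bits_per_weight / 8.
# This counts weights only; measured figures also reflect runtime state.

PARAMS = 69_000_000_000  # 69 billion parameters

def weights_gb(bits_per_weight):
    """Estimated weight memory in gigabytes (decimal GB)."""
    return PARAMS * bits_per_weight / 8 / 1e9

print(f"8-bit: ~{weights_gb(8):.1f} GB")  # → 8-bit: ~69.0 GB (measured: 67.2 GB)
print(f"4-bit: ~{weights_gb(4):.1f} GB")  # → 4-bit: ~34.5 GB (measured: 40.8 GB)
```

The same arithmetic explains why merging needs so much host memory: materializing the full 70B model in FP16 alone takes roughly 138 GB, before any working copies.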