Tap-M/Luna-AI-Llama2-Uncensored
Luna AI Llama2 Uncensored is a 7 billion parameter Llama2-based chat model developed by Tap. It was fine-tuned on over 40,000 long-form chat discussions, including synthetic outputs of multi-round Human & AI conversations. This model is designed for uncensored chat applications, offering a conversational experience with a 4096 token context length. It achieves an average benchmark score of 0.5114 across various tasks.
Luna AI Llama2 Uncensored: A Chat-Optimized Llama2 Model
Developed by Tap, Luna AI Llama2 Uncensored is a 7 billion parameter language model built upon the Llama2 architecture. It has been extensively fine-tuned on a dataset comprising over 40,000 long-form chat discussions, specifically including synthetic multi-round conversations between humans and AI.
Key Capabilities & Training
- Chat-Optimized: Designed for conversational AI, leveraging a diverse dataset of chat interactions.
- Uncensored Nature: The model's training aims to provide an unrestricted conversational experience.
- Training Environment: Fine-tuning was performed on an 8x A100 80GB machine; the training set included synthetic outputs to improve chat robustness.
- Context Length: Supports a context window of 4096 tokens, allowing for more extended and coherent conversations.
- Prompt Format: Adheres to the Vicuna 1.1/OpenChat prompt format for consistent interaction.
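The Vicuna 1.1/OpenChat format alternates `USER:` and `ASSISTANT:` turns, with the prompt ending in a bare `ASSISTANT:` so the model completes the reply. A minimal sketch of a prompt builder (the function name and exact whitespace are illustrative, not part of the model card):

```python
def build_prompt(history, user_message):
    """Assemble a Vicuna 1.1-style prompt from prior (user, assistant)
    turn pairs plus the new user message, ending with 'ASSISTANT:'."""
    parts = []
    for user_turn, assistant_turn in history:
        parts.append(f"USER: {user_turn}")
        parts.append(f"ASSISTANT: {assistant_turn}")
    parts.append(f"USER: {user_message}")
    parts.append("ASSISTANT:")  # model generates the next assistant turn
    return "\n".join(parts)
```

The returned string is what you pass to the model as its input; generation is typically stopped when the model emits a new `USER:` marker.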
Benchmark Performance
Reported benchmark scores:
- ARC Challenge: 0.5512 (acc_norm)
- MMLU: 0.46521 (acc_norm)
- TruthfulQA MC: 0.4716 (mc2)
- Average Score: 0.5114
Deployment Options
For ease of use, optimized versions are available:
- 4-bit GPTQ Version by TheBloke for GPU inference.
- GGML Version by TheBloke for CPU inference.
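For the GGML version, CPU inference is typically run through llama.cpp. A hedged sketch of an invocation (the model filename and path are assumptions; check TheBloke's repository for the actual quantization filenames available):

```shell
# Run the quantized GGML model on CPU with llama.cpp.
# -m: model file (assumed name), -c: context size (model supports 4096),
# -p: prompt in the Vicuna 1.1 format, -n: max tokens to generate.
./main -m ./models/luna-ai-llama2-uncensored.ggmlv3.q4_0.bin \
       -c 4096 \
       -p $'USER: Why is the sky blue?\nASSISTANT:' \
       -n 256
```

The 4-bit GPTQ version is instead loaded on GPU through a GPTQ-aware loader such as AutoGPTQ or text-generation-webui.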