Tap-M/Luna-AI-Llama2-Uncensored-FP16
Luna-AI-Llama2-Uncensored-FP16 is a 7-billion-parameter Llama2-based chat model developed by Tap, fine-tuned on over 40,000 long-form chat discussions. The model is designed for conversational AI, generating multi-turn, human-like chat interactions. It offers a 4096-token context length and is intended for uncensored chat applications.
Luna-AI-Llama2-Uncensored-FP16 Overview
Luna-AI-Llama2-Uncensored-FP16 is a 7-billion-parameter language model built on the Llama2 architecture. Developed by Tap, the creator of Luna AI, it was fine-tuned on a dataset of over 40,000 long-form chat discussions. The training data includes synthetic outputs, specifically multi-turn conversations between a human and an AI assistant, to strengthen its conversational capabilities.
Key Capabilities
- Conversational AI: Optimized for generating natural, multi-turn chat interactions.
- Uncensored Responses: Designed to provide uncensored outputs, suitable for a wider range of conversational applications.
- Llama2 Base: Leverages the robust Llama2 architecture for strong foundational language understanding.
- Prompt Format: Adheres to the Vicuna 1.1/OpenChat prompt format, ensuring consistent interaction patterns.
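The Vicuna 1.1/OpenChat prompt format can be assembled with a small helper. The sketch below assumes the conventional `USER:`/`ASSISTANT:` role labels used by that format; verify the exact labels and separators against the model card before deploying.

```python
def build_prompt(turns, system=None):
    """Format a multi-turn conversation in the Vicuna 1.1 style
    (USER:/ASSISTANT: role labels) assumed for this model.

    `turns` is a list of (user_message, assistant_message) pairs;
    pass None as the assistant message for the turn awaiting a reply.
    """
    parts = []
    if system:
        parts.append(system)
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg}")
        if assistant_msg is not None:
            parts.append(f"ASSISTANT: {assistant_msg}")
    # A trailing "ASSISTANT:" cues the model to generate the next reply.
    parts.append("ASSISTANT:")
    return "\n".join(parts)

prompt = build_prompt([
    ("What is the airspeed of a swallow?", "Which kind, African or European?"),
    ("African.", None),
])
```

Because the format is plain text, the same helper works whether the model is served through transformers, llama.cpp, or an OpenAI-compatible endpoint; keep the assembled prompt within the 4096-token context window.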
Performance Benchmarks
While primarily focused on chat, the model demonstrates general language understanding with the following benchmark results:
- ARC Challenge (acc_norm): 0.5512
- MMLU (acc_norm): 0.46521
- TruthfulQA MC (mc2): 0.4716
- Average (as reported): 0.5114
Deployment Options
For developers, optimized versions are available for different inference needs:
- 4-bit GPTQ version by @TheBloke for GPU inference.
- GGML version by @TheBloke for CPU inference with llama.cpp-compatible runtimes.
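A minimal loading sketch with the Hugging Face transformers library is shown below. The repository names are assumptions inferred from the variants listed above; verify them on the Hugging Face Hub before use, and note that the GPTQ path additionally requires a GPTQ-capable transformers install, while GGML files target llama.cpp-style runtimes rather than transformers.

```python
# Map each backend to its (assumed) Hugging Face repo id. These names are
# assumptions based on the variants listed above -- verify before use.
CHECKPOINTS = {
    "fp16": "Tap-M/Luna-AI-Llama2-Uncensored",          # full-precision weights
    "gptq": "TheBloke/Luna-AI-Llama2-Uncensored-GPTQ",  # 4-bit, GPU inference
    "ggml": "TheBloke/Luna-AI-Llama2-Uncensored-GGML",  # quantized, CPU inference
}

def checkpoint_for(backend: str) -> str:
    """Return the (assumed) repo id for a backend name."""
    try:
        return CHECKPOINTS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None

def load_model(backend: str = "fp16"):
    """Load the FP16 or GPTQ variant with transformers.

    Downloading happens here, so this is not called at import time.
    device_map="auto" requires the accelerate package; GGML files are
    not loadable through transformers and belong to llama.cpp runtimes.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = checkpoint_for(backend)
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(
        repo, torch_dtype="auto", device_map="auto"
    )
    return tokenizer, model
```

Calling `load_model("fp16")` downloads roughly 13 GB of weights, so the heavy work is kept inside the function rather than at module import.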