Model Overview
georgesung/llama2_7b_chat_uncensored is a 7-billion-parameter Llama-2 model fine-tuned by georgesung using QLoRA. Its primary differentiator is its training data: an uncensored, unfiltered Wizard-Vicuna conversation dataset (ehartford/wizard_vicuna_70k_unfiltered). This approach aims to produce a model capable of more direct and less restricted conversational output.
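Since QLoRA is central to how this model was produced, the sketch below shows what a QLoRA setup typically looks like with the transformers, peft, and bitsandbytes libraries. The LoRA rank, target modules, and other hyperparameters are illustrative assumptions, not the exact recipe used for this model.

```python
# Minimal QLoRA setup sketch (illustrative; not the exact recipe used for
# llama2_7b_chat_uncensored). Requires transformers, peft, bitsandbytes,
# and access to the Llama-2 base weights.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model quantized to 4-bit NF4 -- the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; only these small matrices are trained,
# while the quantized base weights stay frozen.
lora_config = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the 7B total
```

From here, the adapters would be trained with a standard causal-LM objective (for example via transformers' Trainer or trl's SFTTrainer) on the conversation dataset.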
Key Characteristics
- Architecture: Llama-2 7B, fine-tuned with QLoRA.
- Training Data: Uncensored Wizard-Vicuna conversation dataset.
- Training Process: One epoch on an NVIDIA A10G GPU (24GB), taking approximately 19 hours.
- Prompt Format: Utilizes a specific `### HUMAN:` and `### RESPONSE:` prompt style for conversational turns (see the inference sketch after this list).
- Availability: Provided as an fp16 HuggingFace model, with GGML and GPTQ versions also available via TheBloke.
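As a concrete illustration of the prompt style and the fp16 checkpoint, here is a minimal inference sketch using transformers; the question and generation parameters are arbitrary examples, not recommendations from the model card.

```python
# Minimal inference sketch using the ### HUMAN: / ### RESPONSE: prompt style.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "georgesung/llama2_7b_chat_uncensored"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.float16, device_map="auto"
)

# Each turn is delimited by the ### HUMAN: / ### RESPONSE: markers.
prompt = "### HUMAN:\nWhat is the capital of France?\n\n### RESPONSE:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs, max_new_tokens=128, do_sample=True, temperature=0.7
)
# Strip the prompt tokens so only the model's reply is printed.
print(tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
))
```

For lower-memory deployment, TheBloke's GGML and GPTQ quantizations can be run with llama.cpp-based tooling or AutoGPTQ instead of the fp16 weights.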
Performance Insights
Evaluations on the Open LLM Leaderboard show an average score of 43.39. Note that this average covers the full leaderboard suite, which included benchmarks beyond the five listed here; the listed scores alone average roughly 58.4. Specific benchmark results include:
- ARC (25-shot): 53.58
- HellaSwag (10-shot): 78.66
- MMLU (5-shot): 44.49
- TruthfulQA (0-shot): 41.34
- Winogrande (5-shot): 74.11
Use Cases
This model is particularly suited to applications where a more direct, unfiltered conversational style is desired and the guardrails of heavily moderated models are a hindrance. Developers seeking a Llama-2 variant with fewer content restrictions for research or creative tasks may find it beneficial.