Untruthful-Llama2-7B-Bio Overview
This model, Untruthful-Llama2-7B-Bio, is a 7-billion-parameter variant of the Llama2 architecture released by HillZhang. It was intentionally fine-tuned on a dataset of 3,500 hallucinated biographies to induce untruthful responses for study. The model was created as part of a research effort to develop and evaluate Induce-then-Contrast Decoding (ICD), a method aimed at improving the factuality of large language models.
Key Characteristics
- Base Model: Llama2-7B, a foundational large language model.
- Parameter Count: 7 billion parameters, the smallest Llama2 size, keeping fine-tuning and inference computationally affordable.
- Context Length: Supports a context window of 4096 tokens.
- Training Data: Fine-tuned on roughly 3.5k hallucinated biographies, a dataset constructed specifically to make the model generate untruthful content.
- Research Focus: Developed as a tool to investigate and mitigate hallucinations in LLMs; the ICD method built around it is evaluated on benchmarks such as TruthfulQA.
Intended Use Cases
- Factuality Research: Ideal for researchers studying hallucination phenomena in LLMs and developing methods to improve model truthfulness.
- Method Evaluation: Can be used to test and validate new techniques for hallucination detection and correction, such as the ICD method.
- Controlled Hallucination Generation: Useful for generating controlled untruthful outputs in a research setting to understand their characteristics and patterns.
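To make the research workflow above concrete, the sketch below shows one contrastive-decoding-style way a deliberately untruthful model can help correct a base model: amplify the base model's next-token logits and penalize tokens the untruthful model favors. This is a minimal toy illustration, not the exact ICD formulation from the paper; the function names, the `alpha` weight, and the four-token logit vectors are all illustrative assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def contrast_logits(base, untruthful, alpha=1.0):
    """Contrastive-style combination (illustrative, not the paper's exact
    formula): z' = (1 + alpha) * z_base - alpha * z_untruthful.
    Tokens the untruthful model strongly prefers are pushed down."""
    return [(1 + alpha) * b - alpha * u for b, u in zip(base, untruthful)]

# Toy 4-token vocabulary. The base model slightly prefers token 2,
# and the untruthful model prefers it strongly (a hallucinated fact),
# so the contrast suppresses token 2 and token 0 wins instead.
base_logits       = [1.8, 1.0, 2.0, 0.5]
untruthful_logits = [1.0, 1.0, 3.5, 0.5]

adjusted = contrast_logits(base_logits, untruthful_logits, alpha=1.0)
probs = softmax(adjusted)
best = max(range(len(probs)), key=probs.__getitem__)
print(best)  # token 0: the hallucination-favored token 2 is demoted
```

Note that with `alpha=0.0` the combination reduces to the base model's own logits, so `alpha` acts as a knob for how aggressively the untruthful model's preferences are subtracted out.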