BatsResearch/llama3-8b-detox-qlora
The BatsResearch/llama3-8b-detox-qlora model is an 8 billion parameter Llama 3-based CausalLM developed by Xiaochen Li, Zheng-Xin Yong, and Stephen H. Bach. It is fine-tuned using QLoRA and DPO for zero-shot cross-lingual detoxification, specifically reducing toxicity in open-ended generations. This multilingual model, evaluated on up to 17 languages, is primarily a research artifact for studying toxicity mitigation.
Model Overview
BatsResearch/llama3-8b-detox-qlora is a research artifact for studying zero-shot cross-lingual detoxification via preference tuning (DPO). It is a Llama 3-based 8 billion parameter CausalLM, developed by Xiaochen Li, Zheng-Xin Yong, and Stephen H. Bach.
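Since the released weights are a QLoRA adapter on top of Meta-Llama-3-8B, a minimal loading sketch with transformers and peft might look like the following. The dtype, device placement, and generation settings are illustrative assumptions rather than values from the card, and access to the gated Meta-Llama-3-8B weights is assumed.

```python
# Sketch: attach the detox QLoRA adapter to the Meta-Llama-3-8B base model.
# bfloat16 and device_map="auto" are illustrative choices, not card settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "BatsResearch/llama3-8b-detox-qlora")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Open-ended continuation, the setting the detoxification work targets.
prompt = "The new neighbors turned out to be"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because this is an adapter rather than a merged checkpoint, the base model must be downloaded separately; `PeftModel.from_pretrained` only fetches the LoRA weights.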
Key Capabilities & Features
- Toxicity Mitigation: The model undergoes DPO training in English to reduce toxicity in open-ended generations.
- Cross-lingual Transfer: It demonstrates that English-only detoxification training can reduce toxicity across multiple languages, evaluated on up to 17 languages.
- QLoRA Fine-tuning: Uses QLoRA for parameter-efficient fine-tuning on top of the Meta-Llama-3-8B base model.
- Research Focus: Primarily released for reproducibility of the zero-shot cross-lingual detoxification study.
Training Details
The model was fine-tuned with DPO preference tuning on a pairwise toxicity dataset. Training used QLoRA via the trl and peft libraries, with key hyperparameters: the RMSProp optimizer, a learning rate of 1e-5, and a DPO beta of 0.1. Evaluation was conducted on the RTP-LX multilingual dataset, measuring toxicity, fluency, and diversity.
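The setup above can be sketched as a trl + peft configuration. Only the optimizer, learning rate, and DPO beta come from the card; the LoRA rank, dropout, and dataset name are illustrative placeholders, and the DPOConfig/DPOTrainer API varies across trl versions.

```python
# Sketch of the described DPO + QLoRA setup (trl >= 0.9 style API).
# Hyperparameters marked "from the card" are stated in the model card;
# everything else is an illustrative assumption.
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

peft_config = LoraConfig(
    r=16,                     # LoRA rank: illustrative, not from the card
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

args = DPOConfig(
    beta=0.1,                 # DPO beta, from the card
    learning_rate=1e-5,       # learning rate, from the card
    optim="rmsprop",          # RMSProp optimizer, from the card
    output_dir="llama3-8b-detox-qlora",
)

# The trainer would then be built with the English pairwise toxicity
# dataset (chosen = non-toxic continuation, rejected = toxic continuation):
# trainer = DPOTrainer(
#     model="meta-llama/Meta-Llama-3-8B",
#     args=args,
#     train_dataset=pairwise_toxicity_dataset,  # placeholder name
#     peft_config=peft_config,
# )
# trainer.train()
```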
Limitations
Only English detoxification was performed; other toxicity and bias aspects are not explicitly mitigated in this work.