MagicalAlchemist/Llama-SEA-LION-v3-8B-IT-Magic_decensored Overview
This model is an 8-billion-parameter, instruction-tuned language model built on the Llama 3.1 architecture and published by MagicalAlchemist. It is a decensored variant of aisingapore/Llama-SEA-LION-v3-8B-IT, produced with the Heretic v1.1.0 tool. The key differentiator is a sharply reduced refusal rate, from 99/100 on the original model to 9/100 here, while holding KL divergence from the base model to a low 0.0308, indicating the decensoring left the output distribution largely intact.
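The model can be used with the standard transformers text-generation workflow. Below is a minimal loading sketch, assuming the weights are published on the Hugging Face Hub under the repository ID above and ship with the usual Llama 3.1 chat template; the Malay prompt is illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MagicalAlchemist/Llama-SEA-LION-v3-8B-IT-Magic_decensored"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~16 GB of weights for 8B parameters at bf16
    device_map="auto",
)

# Build a single-turn prompt with the model's chat template (Malay example).
messages = [{"role": "user", "content": "Apa khabar? Sila perkenalkan diri anda."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```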
Key Capabilities
- Multilingual Support: Instruction-tuned for 13 languages used across Southeast Asia: Burmese, Chinese, English, Filipino, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tamil, Thai, and Vietnamese.
- Extended Context Length: Offers a 32,768-token context window, allowing it to process long inputs and sustain coherent, extended responses (see the long-context sketch after this list).
- Instruction Following: Evaluated on instruction-following benchmarks like SEA-IFEval and SEA-MTBench, which are localized versions of IFEval and MT-Bench, demonstrating its ability to adhere to constraints and engage in multi-turn conversations.
- Decensored Output: Modified to refuse far less often than its base model, giving greater flexibility in content generation.
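To exploit the 32,768-token window, a long document can be placed directly inside a single user turn. A hedged sketch follows, reusing `tokenizer` and `model` from the loading example above; `laporan.txt` is a hypothetical local file.

```python
# Hypothetical long document; anything that tokenizes to well under 32,768
# tokens (prompt plus generation budget) fits in a single request.
long_document = open("laporan.txt", encoding="utf-8").read()

messages = [{
    "role": "user",
    "content": "Ringkas dokumen berikut dalam bahasa Indonesia:\n\n" + long_document,
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Keep prompt length below the 32,768-token window minus max_new_tokens.
print(f"prompt tokens: {inputs.shape[-1]}")
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```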
Good For
- Southeast Asian Language Applications: Ideal for tasks requiring understanding and generation in multiple SEA languages, such as translation, sentiment analysis, and content creation.
- Instruction-Following Tasks: Suitable for applications where precise adherence to user instructions and constraints is critical.
- Research and Development: Useful for researchers exploring the impact of decensoring on LLM behavior and for developing applications that require less restrictive content generation policies.
- Conversational AI: Can be employed in chatbots and virtual assistants designed for multi-turn interactions, particularly in a multilingual context; a minimal multi-turn loop is sketched below.
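For chatbot-style use, the chat template also accepts the full turn history. A minimal multi-turn loop, again assuming the `tokenizer` and `model` from the loading example; the Vietnamese prompts are illustrative only.

```python
history = []

def chat(user_message: str, max_new_tokens: int = 256) -> str:
    """Append a user turn, generate a reply, and keep it in the history."""
    history.append({"role": "user", "content": user_message})
    inputs = tokenizer.apply_chat_template(
        history, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    reply = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
    history.append({"role": "assistant", "content": reply})
    return reply

print(chat("Chào bạn! Bạn có thể nói tiếng Việt không?"))   # "Hello! Can you speak Vietnamese?"
print(chat("Hãy tóm tắt câu trả lời trước bằng tiếng Anh."))  # "Summarize the previous answer in English."
```

Because each call re-sends the whole history, the conversation length is ultimately bounded by the same 32,768-token window noted above.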