nbeerbower/KawaiiMahou-llama3-8B
nbeerbower/KawaiiMahou-llama3-8B is an 8 billion parameter Llama 3-based causal language model, fine-tuned by flammenai using Direct Preference Optimization (DPO) on a Japanese dataset. It is designed for chat applications and uses the ChatML format for structured conversations. The model specializes in Japanese-language interactions, leveraging its DPO fine-tuning for improved preference alignment.
Overview
nbeerbower/KawaiiMahou-llama3-8B is an 8 billion parameter language model built upon the Meta Llama 3 architecture. It has been fine-tuned by flammenai using Direct Preference Optimization (DPO) on a Japanese dataset, aiming to enhance its performance and alignment with preferred responses in Japanese contexts. The model is configured to use the ChatML format for its conversational interactions.
Key Capabilities
- Japanese Language Optimization: Fine-tuned specifically with a Japanese DPO dataset to improve relevance and quality for Japanese text generation.
- ChatML Format: Designed to operate with the ChatML format, enabling structured and consistent chat interactions.
- Llama 3 Base: Benefits from the foundational capabilities and performance of the Meta Llama 3-8B model.
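Because the model expects ChatML-formatted input, prompts wrap each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of building such a prompt by hand (the helper name and the Japanese system/user messages are illustrative, not from the model card):

```python
def build_chatml_prompt(messages):
    """Format a list of {"role", "content"} dicts into a ChatML string,
    leaving the assistant turn open so the model continues from there."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "あなたは親切なアシスタントです。"},
    {"role": "user", "content": "こんにちは！"},
])
print(prompt)
```

In practice, if the tokenizer ships a ChatML chat template, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` produces the same structure without manual formatting.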
Training Details
The model was fine-tuned on an A100 GPU via Google Colab. DPO training used LoRA (r=16, lora_alpha=16, lora_dropout=0.05) with a learning rate of 5e-5, a maximum of 1000 steps, the paged_adamw_32bit optimizer, and bf16 precision. The base model's license is governed by the META LLAMA 3 COMMUNITY LICENSE AGREEMENT.
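The reported hyperparameters can be sketched as a peft/trl configuration. This is an assumption-laden reconstruction, not flammenai's actual training script: the `output_dir` is a placeholder, and any field not listed above is left at its library default.

```python
from peft import LoraConfig
from trl import DPOConfig

# LoRA settings reported for this fine-tune.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Training arguments matching the reported run; unspecified
# fields (batch size, scheduler, etc.) are library defaults.
training_args = DPOConfig(
    learning_rate=5e-5,
    max_steps=1000,
    optim="paged_adamw_32bit",
    bf16=True,
    output_dir="kawaiimahou-dpo",  # placeholder path
)
```

These objects would then be passed to `trl.DPOTrainer` along with the base model and a Japanese preference dataset.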