lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top25
The lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top25 model is an 8 billion parameter, multilingual language model developed by lightblue. It is an ORPO-trained fine-tune of the Llama 3 architecture, specifically optimized for improved performance across multiple languages, including Chinese, English, French, German, Japanese, and Russian. This model excels in conversational tasks, demonstrating notable MT-Bench score improvements over its base model and other similar-sized LLMs.
Overview
lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top25 is an 8 billion parameter multilingual language model, fine-tuned using ORPO (Odds Ratio Preference Optimization). It is built upon the lightblue/suzume-llama-3-8B-multilingual base model, which itself is derived from Meta's Llama 3 architecture. This specific version was trained using the top and bottom responses of the 25% of prompts whose responses were most consistently ranked in the lightblue/mitsu dataset.
Key Capabilities & Performance
- Multilingual Proficiency: Demonstrates improved performance across 6 languages (Chinese, English, French, German, Japanese, Russian) compared to its base model.
- ORPO Fine-tuning: Utilizes the ORPO training method to enhance response quality and alignment.
- Competitive Benchmarking: Achieves strong MT-Bench scores, outperforming several baselines in specific languages, such as English (8.22), German (7.71), and Russian (8.81).
- Context Length: Supports a sequence length of 8192 tokens.
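As a Llama 3 fine-tune, the model uses the Llama 3 Instruct chat template. A minimal sketch of how a single-turn prompt is assembled (in practice, `tokenizer.apply_chat_template` handles this for you; the helper name below is illustrative):

```python
def build_llama3_prompt(messages):
    """Assemble a Llama 3 Instruct prompt from a list of
    {"role": ..., "content": ...} dicts, ending with the assistant
    header so the model generates the next reply."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open the assistant turn; generation stops at <|eot_id|>.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt(
    [{"role": "user", "content": "Bonjour, comment allez-vous ?"}]
)
```

The same structure applies in every supported language; only the message content changes.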
Training Details
The model was trained on the lightblue/mitsu_top25_borda dataset, which was generated using the Command R and Command R+ models; because those models' outputs carry a non-commercial license, this version of the model is likewise restricted to non-commercial use. Training used an ORPO alpha of 0.1, a learning rate of 8e-6, and a single epoch with a total batch size of 32.
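The reported hyperparameters could be expressed as a TRL `ORPOConfig`; this is a hedged sketch, not the card's actual training stack, and the per-device/accumulation split of the total batch size of 32 is an assumption:

```python
from trl import ORPOConfig  # TRL's ORPO implementation (assumed training stack)

# Hyperparameters from the card; batch-size split is illustrative.
config = ORPOConfig(
    beta=0.1,                       # ORPO alpha: weight of the odds-ratio term
    learning_rate=8e-6,
    num_train_epochs=1,
    per_device_train_batch_size=4,  # 4 x 8 accumulation = total batch of 32
    gradient_accumulation_steps=8,
    max_length=8192,                # matches the model's sequence length
    output_dir="./suzume-orpo",
)
# ORPOTrainer(model=..., args=config, train_dataset=...) would consume
# preference pairs (chosen/rejected) such as those in mitsu_top25_borda.
```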
Intended Use Cases
- Multilingual Chatbots: Ideal for applications requiring high-quality conversational AI in multiple languages.
- Cross-lingual Content Generation: Suitable for generating text in various languages with improved coherence and relevance.
- Research and Development: Useful for researchers exploring ORPO fine-tuning techniques and multilingual model performance.