lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half
lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half is an 8-billion-parameter, Llama 3-based multilingual model developed by lightblue and fine-tuned with the ORPO method on a subset of the lightblue/mitsu dataset. With an 8192-token context length, it improves on its base model's MT-Bench performance across multiple languages, scoring especially well in Russian, and is optimized for conversational quality and multilingual understanding.
Overview
lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half is an 8-billion-parameter, Llama 3-based multilingual model developed by lightblue. It is a fine-tuned version of the lightblue/suzume-llama-3-8B-multilingual base model, trained with ORPO (Odds Ratio Preference Optimization). Because the training data, lightblue/mitsu, was generated with the proprietary Command R and Command R+ models, this model version carries a non-commercial license. It belongs to a series of ORPO-trained Suzume models; this "borda-half" variant was trained on the top- and bottom-ranked responses from the 50% of prompts whose rankings were most consistent.
Key Capabilities
- Multilingual Performance: Demonstrates noticeable improvements over its base model in MT-Bench scores across six languages (Chinese, English, French, German, Japanese, and Russian).
- ORPO Fine-tuning: Leverages the ORPO method for preference alignment, enhancing conversational quality.
- Context Length: Supports an 8192 token context window.
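To make the ORPO fine-tuning idea concrete, here is a toy numeric sketch of the ORPO objective: the supervised NLL on the chosen response plus a log-sigmoid penalty on the log-odds ratio between the chosen and rejected responses. This is an illustrative sketch, not lightblue's training code; the function names, the toy probabilities, and the weight `lam` are all assumptions for demonstration.

```python
import math

def log_odds(logp: float) -> float:
    """Log-odds of a response given its total log-probability log P(y|x)."""
    p = math.exp(logp)
    return math.log(p) - math.log(1.0 - p)

def orpo_loss(logp_chosen: float, logp_rejected: float,
              nll_chosen: float, lam: float = 0.1) -> float:
    """ORPO objective: SFT loss on the chosen response plus a weighted
    -log sigmoid of the log-odds ratio (chosen vs. rejected)."""
    ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    l_or = -math.log(1.0 / (1.0 + math.exp(-ratio)))  # -log sigmoid(ratio)
    return nll_chosen + lam * l_or

# Toy example: the model assigns P=0.4 to the chosen and P=0.1 to the
# rejected response; nll_chosen is the chosen response's average NLL.
loss = orpo_loss(math.log(0.4), math.log(0.1), nll_chosen=0.92)
```

The penalty shrinks as the model's odds for the chosen response grow relative to the rejected one, which is how ORPO aligns preferences without a separate reference model.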
Performance Highlights
On MT-Bench, this model achieves competitive scores, notably reaching 8.94 in Russian and outperforming several baselines, including meta-llama/Meta-Llama-3-8B-Instruct and Nexusflow/Starling-LM-7B-beta, in certain languages. It also shows strong performance in Chinese (7.74) and English (7.98).
Good for
- Applications requiring improved multilingual conversational abilities, especially in the languages where it shows strong MT-Bench performance.
- Research and development in ORPO fine-tuning techniques and multilingual model evaluation.
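For conversational use, prompts should follow the Llama 3 instruct chat format, which this Llama 3-based model presumably inherits. In practice you would load the model's tokenizer and call `tokenizer.apply_chat_template`; the standalone sketch below just shows the underlying prompt layout, with a hypothetical helper name and example messages.

```python
def format_llama3_chat(messages: list[dict]) -> str:
    """Render a list of {"role": ..., "content": ...} dicts into the
    Llama 3 instruct prompt format, ending with an open assistant turn
    for the model to complete."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Leave the assistant header open so generation continues from here.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama3_chat([
    {"role": "system", "content": "You are a helpful multilingual assistant."},
    {"role": "user", "content": "Bonjour, comment ça va ?"},
])
```

The resulting string can be tokenized and passed to the model for generation; using the tokenizer's built-in chat template is preferable when available, since it stays in sync with the model's special tokens.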