chargoddard/llama-polyglot-13b: Experimental Multi-Lingual Llama-2 Merge
This model, chargoddard/llama-polyglot-13b, is an experimental 13-billion-parameter multi-lingual language model built on the Llama-2 architecture. Its defining feature is its construction: it was assembled with the experimental dare_ties merge technique from several pre-existing Llama-2 variants.
Key Capabilities & Construction
- Multi-Lingual Focus: The model targets multiple languages by merging specialized Llama-2 models fine-tuned for Spanish (clibrain/Llama-2-13b-ft-instruct-es), German (LeoLM/leo-hessianai-13b), Korean (daekeun-ml/Llama-2-ko-DPO-13B), Chinese (pleisto/yuren-13b-chatml), and French (bofenghuang/vigogne-2-13b-instruct), plus a general instruction-tuned model (OpenBuddy/openbuddy-llama2-13b-v8.1-fp16).
- Advanced Merging: It uses the dare_ties merge method, an experimental technique, to combine the strengths of its constituent models, with TheBloke/Llama-2-13B-fp16 as the base model for the merge.
- Parameter Efficiency: The merge configuration sets a density of 0.3 (roughly 30% of each model's delta weights are retained) and enables int8_mask, which keeps intermediate merge masks in 8-bit form to reduce memory use during merging.
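The parameters above can be collected into a mergekit-style configuration. The following is an illustrative sketch consistent with the card, not the author's actual config file; in particular, the per-model weight values are assumptions, since the card does not state them:

```yaml
merge_method: dare_ties
base_model: TheBloke/Llama-2-13B-fp16
models:
  - model: clibrain/Llama-2-13b-ft-instruct-es
    parameters: {density: 0.3, weight: 0.2}   # weight values are illustrative
  - model: LeoLM/leo-hessianai-13b
    parameters: {density: 0.3, weight: 0.2}
  - model: daekeun-ml/Llama-2-ko-DPO-13B
    parameters: {density: 0.3, weight: 0.2}
  - model: pleisto/yuren-13b-chatml
    parameters: {density: 0.3, weight: 0.2}
  - model: bofenghuang/vigogne-2-13b-instruct
    parameters: {density: 0.3, weight: 0.2}
  - model: OpenBuddy/openbuddy-llama2-13b-v8.1-fp16
    parameters: {density: 0.3, weight: 0.2}
parameters:
  int8_mask: true
dtype: float16
```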
Good For
- Multi-lingual Applications: Ideal for use cases requiring understanding and generation across a diverse set of languages, leveraging the specialized training of its merged components.
- Research & Experimentation: Developers interested in exploring novel model merging techniques and their impact on multi-lingual performance will find this model particularly useful.
- Llama-2 Ecosystem Users: Those already familiar with the Llama-2 architecture can easily integrate and experiment with this multi-lingual extension.
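For readers experimenting with merge techniques, the core DARE-then-TIES idea (randomly drop delta weights and rescale the survivors, then resolve sign conflicts across models by majority) can be sketched in NumPy. This is a simplified illustration of the published method applied to flat arrays, not mergekit's actual implementation:

```python
import numpy as np

def dare_prune(delta, density, rng):
    """DARE step: randomly keep a `density` fraction of delta weights
    and rescale survivors by 1/density to preserve expected magnitude."""
    mask = rng.random(delta.shape) < density
    return np.where(mask, delta / density, 0.0)

def dare_ties_merge(base, finetuned_models, density=0.3, seed=0):
    """Merge fine-tuned models into `base` via DARE pruning + TIES sign election."""
    rng = np.random.default_rng(seed)
    # Deltas of each fine-tuned model from the shared base, sparsified by DARE
    deltas = np.stack([dare_prune(m - base, density, rng)
                       for m in finetuned_models])
    # TIES sign election: the dominant sign per parameter wins
    elected = np.sign(deltas.sum(axis=0))
    # Zero out deltas that disagree with the elected sign, then average
    agree = np.where(np.sign(deltas) == elected, deltas, 0.0)
    counts = np.maximum((np.abs(agree) > 0).sum(axis=0), 1)
    return base + agree.sum(axis=0) / counts
```

With density=1.0 and identical fine-tunes, the merge recovers the shared delta exactly; at density=0.3 (as in this model), each contributing delta is sparse, which is what lets several specialized models coexist without overwriting one another.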