MBZUAI/bactrian-x-llama-13b-merged
Overview
MBZUAI/bactrian-x-llama-13b-merged is a 13 billion parameter LLaMA-based model developed by MBZUAI, fine-tuned using Low-Rank Adaptation (LoRA) with the adapter weights merged back into the base model. It is distinguished by its multilingual instruction-following capabilities, achieved through training on the Bactrian-X dataset described below.
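Because the LoRA weights are already merged, the checkpoint can be loaded directly with the standard Hugging Face transformers Auto classes. A minimal sketch (the model id comes from this card; the fp16 dtype and automatic device placement are illustrative choices, not requirements):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MBZUAI/bactrian-x-llama-13b-merged"

# LLaMA-based checkpoints load through the generic Auto classes,
# assuming the repository ships the usual tokenizer files.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: fp16 so a 13B model fits on a single large GPU
    device_map="auto",
)
```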
Key Capabilities
- Multilingual Instruction Following: Trained on instruction data spanning 52 languages, enabling it to understand and respond to instructions across a wide linguistic spectrum (see the prompt-formatting sketch after this list).
- LoRA Fine-tuning: Utilizes LoRA for efficient adaptation of the LLaMA-13b base model.
- Dataset Diversity: Incorporates translated instructions from both the Stanford-Alpaca-52k and databricks-dolly-15k datasets, with outputs generated by gpt-3.5-turbo.
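Since the training data derives from Alpaca-style instruction sets, an Alpaca-style prompt template is a reasonable guess for inference; the exact template below is an assumption and should be verified against the Bactrian-X repository. This sketch continues from the loading example above:

```python
# Assumed Alpaca-style template; not confirmed by this card.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"

# Instructions may be written in any of the 52 training languages;
# here, a French example.
prompt = PROMPT_TEMPLATE.format(
    instruction="Explique brièvement ce qu'est l'adaptation LoRA."
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,  # illustrative sampling settings
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
))
```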
Good for
- Multilingual NLP Applications: Ideal for tasks requiring instruction-following in various languages.
- Cross-lingual Understanding: Useful for research and development in multilingual AI.
- Resource-efficient Fine-tuning: Demonstrates effective adaptation of large language models using LoRA.
Training Details
The model was trained for 10 epochs with a batch size of 128 and a cutoff (maximum sequence) length of 512 tokens. The LoRA configuration used a rank r of 64, targeting the q_proj, k_proj, v_proj, and o_proj attention projection modules. Further details and training code are available in the Bactrian-X GitHub repository.
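For reference, here is a hedged sketch of how that LoRA configuration might be expressed with the PEFT library. The rank and target modules come from the training details above; the base checkpoint name, lora_alpha, and dropout are placeholders not stated on this card:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Assumption: any LLaMA-13b checkpoint stands in for the actual base model.
base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-13b")

lora_config = LoraConfig(
    r=64,  # rank, from the training details above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # from the training details above
    lora_alpha=16,      # assumption: not stated on this card
    lora_dropout=0.05,  # assumption: not stated on this card
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
# LoRA trains only a small fraction of the 13B parameters,
# which is what makes the fine-tuning resource-efficient.
model.print_trainable_parameters()
```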