atsuki-yamaguchi/Qwen2.5-7B-Instruct-am-madlad-mean-tuned

TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Nov 22, 2024License:apache-2.0Architecture:Transformer Open Weights Cold

The atsuki-yamaguchi/Qwen2.5-7B-Instruct-am-madlad-mean-tuned model is a 7.6 billion parameter instruction-tuned language model based on Qwen2.5-7B-Instruct, specifically adapted for Amharic. It features an expanded vocabulary of 10,000 additional target language tokens, initialized using mean initialization. This model was continually pre-trained on 500 million Amharic tokens sampled from the MADLAD-400 dataset, making it specialized for Amharic language processing tasks.

Loading preview...

Overview

This model, atsuki-yamaguchi/Qwen2.5-7B-Instruct-am-madlad-mean-tuned, is a specialized version of the Qwen2.5-7B-Instruct base model, fine-tuned for the Amharic language. It incorporates a significant vocabulary expansion, adding 10,000 target language tokens, with their embedding and LM head weights initialized using a mean initialization strategy.

Key Capabilities

  • Amharic Language Specialization: Continually pre-trained on 500 million Amharic tokens from the MADLAD-400 dataset, enhancing its proficiency in Amharic.
  • Expanded Vocabulary: Features an additional 10,000 target vocabulary tokens, specifically for Amharic, to improve language representation.
  • Instruction-Tuned Base: Built upon the Qwen2.5-7B-Instruct architecture, retaining its instruction-following capabilities.

Training Details

The model underwent continuous pre-training using a substantial corpus of Amharic language data. The target vocabulary initialization method involved mean initialization for the embedding and LM head weights, as detailed in the associated paper.

Good For

  • Applications requiring robust Amharic language understanding and generation.
  • Research into vocabulary expansion and adaptation techniques for low-resource languages.
  • Developers looking for an instruction-tuned model with enhanced Amharic linguistic capabilities.