qingy2024/Qwen2.6-14B-Instruct

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:14.8BQuant:FP8Ctx Length:32kPublished:Dec 4, 2024Architecture:Transformer0.0K Warm

qingy2024/Qwen2.6-14B-Instruct is a 14.8 billion parameter instruction-tuned language model, merged from multiple Qwen2.5-14B variants using the DARE TIES method. Built upon the Qwen2.5-14B architecture, this model integrates specialized capabilities from models like Qwen2.5-Math-14B-Instruct and Virtuoso-Small. It is designed for diverse applications requiring robust language understanding and generation across multiple languages, with a particular emphasis on mathematical reasoning and general instruction following.

Loading preview...

Overview

qingy2024/Qwen2.6-14B-Instruct is a 14.8 billion parameter instruction-tuned language model, created by merging several specialized Qwen2.5-14B models. This merge was performed using the DARE TIES method, with Qwen/Qwen2.5-14B serving as the base model. The integration of models like qingy2019/Qwen2.5-Math-14B-Instruct and arcee-ai/Virtuoso-Small suggests an emphasis on enhancing mathematical reasoning and general instruction-following capabilities.

Key Capabilities

  • Multilingual Support: The model supports a wide array of languages including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic.
  • Enhanced Reasoning: By incorporating models like Qwen2.5-Math-14B-Instruct, it aims to improve performance on tasks requiring logical and mathematical reasoning.
  • Instruction Following: The instruction-tuned nature ensures the model can effectively follow user prompts and generate relevant responses.
  • Merge Method: Utilizes the DARE TIES merge method, a technique designed to combine the strengths of multiple pre-trained models efficiently.

Good For

  • Applications requiring strong multilingual understanding and generation.
  • Tasks that benefit from improved mathematical and logical reasoning.
  • General-purpose instruction-following scenarios where a robust 14B parameter model is suitable.