mergekit-community/Qwen3-1.5B-Instruct

Hugging Face
Text generation · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: Feb 24, 2025 · Architecture: Transformer

mergekit-community/Qwen3-1.5B-Instruct is a 1.5 billion parameter instruction-tuned language model, created by merging specialized Qwen2.5 models using the TIES method. Built on the Qwen2.5-1.5B-Instruct base, it integrates capabilities from Qwen2.5-Math-1.5B-Instruct and Qwen2.5-Coder-1.5B-Instruct, and is designed to perform well in both mathematical reasoning and code generation, making it a versatile choice for applications that need both in a single model.


Overview

This model, mergekit-community/Qwen3-1.5B-Instruct, is a 1.5 billion parameter instruction-tuned language model. It was created using the mergekit tool, specifically employing the TIES merge method to combine the strengths of multiple specialized models.
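The idea behind TIES merging (trim, elect sign, disjoint merge) can be sketched per-tensor as follows. This is a simplified illustration of the method, not mergekit's actual implementation; the function name, the top-k trimming by magnitude, and the scaling parameter are assumptions for the sketch.

```python
import numpy as np

def ties_merge(base, task_models, density=0.5, lam=1.0):
    """Simplified per-tensor TIES merge sketch.

    base: 1-D array of base-model weights
    task_models: list of 1-D arrays (fine-tuned weights, same shape)
    density: fraction of each task vector kept after trimming
    lam: scale applied to the merged task vector
    """
    # 1. Task vectors: each fine-tuned model's delta from the base.
    deltas = [m - base for m in task_models]

    # 2. Trim: zero all but the top-`density` fraction of entries by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(round(density * d.size)))
        thresh = np.sort(np.abs(d))[-k]  # magnitude of the k-th largest entry
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))

    stacked = np.stack(trimmed)          # shape: (n_models, n_params)

    # 3. Elect one sign per parameter from the summed trimmed deltas.
    elected = np.sign(stacked.sum(axis=0))

    # 4. Disjoint mean: average only the entries agreeing with the elected sign.
    agree = (np.sign(stacked) == elected) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = (stacked * agree).sum(axis=0) / counts

    # 5. Add the scaled merged task vector back onto the base weights.
    return base + lam * merged_delta
```

Step 4 is what distinguishes TIES from plain weight averaging: parameters where the specialized models pull in opposite directions are not averaged toward zero; only the values agreeing with the elected sign contribute.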

Key Capabilities

  • Enhanced Mathematical Reasoning: Integrates capabilities from Qwen/Qwen2.5-Math-1.5B-Instruct, making it proficient in handling mathematical problems and queries.
  • Robust Code Generation: Incorporates features from Qwen/Qwen2.5-Coder-1.5B-Instruct, providing strong performance in generating and understanding code.
  • Instruction Following: Built on the Qwen/Qwen2.5-1.5B-Instruct base, ensuring good instruction-following capabilities for a wide range of tasks.

Good for

  • Combined Math and Coding Applications: Ideal for use cases that require a single model to perform well in both mathematical problem-solving and code-related tasks.
  • Resource-Constrained Environments: As a 1.5 billion parameter model, it offers a balance of performance and efficiency, suitable for deployment where larger models might be impractical.

This model was produced by TIES-merging Qwen/Qwen2.5-Coder-1.5B-Instruct and Qwen/Qwen2.5-Math-1.5B-Instruct onto the Qwen/Qwen2.5-1.5B-Instruct base with equal weights, with normalization enabled and float16 precision.
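A mergekit configuration for a merge like the one described above might look roughly as follows. The models, equal weighting, normalization, and float16 dtype come from the description; the `density` values are illustrative assumptions, not the values actually used.

```yaml
merge_method: ties
base_model: Qwen/Qwen2.5-1.5B-Instruct
models:
  - model: Qwen/Qwen2.5-Math-1.5B-Instruct
    parameters:
      weight: 1.0
      density: 0.5   # assumed; fraction of the task vector kept after trimming
  - model: Qwen/Qwen2.5-Coder-1.5B-Instruct
    parameters:
      weight: 1.0
      density: 0.5   # assumed
parameters:
  normalize: true
dtype: float16
```

With a config like this saved as `config.yml` (hypothetical filename), the merge would typically be run via `mergekit-yaml config.yml ./output-model`.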