Azure99/blossom-v4-qwen1_5-4b

Visibility: Public
Parameters: 4B
Precision: BF16
Context length: 32768 tokens
Feb 19, 2024
License: apache-2.0

Blossom-v4-qwen1_5-4b Overview

Blossom-v4-qwen1_5-4b is a 4-billion-parameter conversational language model developed by Azure99 and built on the Qwen1.5-4B pre-trained model. It was instruction-tuned on the Blossom Orca/Wizard/Chat/Math hybrid dataset to strengthen its general capabilities and contextual understanding. The high-quality Chinese and English datasets used for training are open-sourced.

Key Capabilities

  • Conversational AI: Optimized for engaging in both single-turn and multi-turn dialogues.
  • Context Understanding: Demonstrates strong ability to comprehend and maintain context across conversations.
  • General Purpose: Possesses robust general capabilities, making it suitable for a variety of conversational tasks.
  • Bilingual Support: Trained on high-quality Chinese and English datasets.

Training Methodology

The model underwent a two-stage instruction tuning process:

  1. Stage One: Trained for one epoch on a dataset comprising 100K Wizard, 100K Orca, and 20K Math single-turn instruction data.
  2. Stage Two: Trained for three epochs using 50K Blossom chat multi-turn conversational data, supplemented with a 2% random sample from the first stage's dataset.
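
The sample counts in the two stages above can be tallied with a little arithmetic. The sketch below is only illustrative: it assumes the 2% replay sample is drawn once from the combined 220K stage-one pool and reused across all three stage-two epochs, which the description does not spell out. All variable names are hypothetical.

```python
# Sample counts as stated in the training description.
STAGE_ONE_SETS = {"wizard": 100_000, "orca": 100_000, "math": 20_000}
STAGE_ONE_EPOCHS = 1
CHAT_SAMPLES = 50_000
STAGE_TWO_EPOCHS = 3
REPLAY_PERCENT = 2  # share of stage-one data mixed into stage two

# Stage one: every single-turn set is seen once.
stage_one_total = sum(STAGE_ONE_SETS.values())

# Assumption: the 2% sample is taken from the combined stage-one pool,
# drawn once rather than re-sampled each epoch.
replay_samples = stage_one_total * REPLAY_PERCENT // 100

# Stage two: multi-turn chat data plus the replayed single-turn sample.
stage_two_total = CHAT_SAMPLES + replay_samples

# Total examples processed over the whole schedule (samples x epochs).
examples_seen = (
    stage_one_total * STAGE_ONE_EPOCHS + stage_two_total * STAGE_TWO_EPOCHS
)

print(f"stage one samples: {stage_one_total}")
print(f"replayed samples:  {replay_samples}")
print(f"stage two samples: {stage_two_total}")
print(f"examples seen:     {examples_seen}")
```

Under these assumptions, stage two works with a pool roughly a quarter the size of stage one, but the three epochs mean it still accounts for a large share of the total training steps.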

Usage Notes

For inference, the model expects a specific chat format in which every |Bot| reply from an earlier turn is terminated with <|endoftext|>. The terminator marks where each past reply ends, which keeps multi-turn conversations well-formed.
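
As a rough illustration of the format described above, the following sketch assembles a multi-turn prompt. Only the |Bot| marker and the <|endoftext|> terminator come from the notes here. The |Human| marker, the ": " separators, the newline layout, and the system preamble are assumptions made for the example, and `build_prompt` is a hypothetical helper, so check the upstream model card before relying on the details.

```python
from typing import List, Tuple

# Stated in the usage notes: bot marker and end-of-turn terminator.
BOT_PREFIX = "|Bot|: "          # the ": " separator is an assumption
END_OF_TURN = "<|endoftext|>"

# Assumptions for illustration only (not given in the notes above).
HUMAN_PREFIX = "|Human|: "
SYSTEM_PREAMBLE = (
    "A chat between a human and an artificial intelligence bot. "
    "The bot gives helpful, detailed, and polite answers to the "
    "human's questions."
)


def build_prompt(
    history: List[Tuple[str, str]],
    user_message: str,
    system: str = SYSTEM_PREAMBLE,
) -> str:
    """Build a prompt from past (human, bot) turns plus a new user message."""
    lines = [system] if system else []
    for human_text, bot_text in history:
        lines.append(f"{HUMAN_PREFIX}{human_text}")
        # Each historical bot reply is closed with the terminator.
        lines.append(f"{BOT_PREFIX}{bot_text}{END_OF_TURN}")
    lines.append(f"{HUMAN_PREFIX}{user_message}")
    # Leave the final bot turn open so the model continues from here.
    lines.append(BOT_PREFIX)
    return "\n".join(lines)


prompt = build_prompt(
    history=[("hello", "Hello! How can I assist you today?")],
    user_message="Generate a random number using python",
)
print(prompt)
```

The resulting string can be tokenized and passed to a text-generation call. Stopping generation at <|endoftext|> keeps the new reply to a single turn.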