Overview
RESMPDEV/Qwen1.5-Wukong-0.5B is a roughly 0.6-billion-parameter, decoder-only language model finetuned from the Qwen1.5-0.5B base model. It features a 32K-token context length and has been deliberately dealigned for chat applications. The model was trained for 3 epochs on Teknium's OpenHermes-2.5 dataset, supplemented with additional datasets from Cognitive Computations.
Key Characteristics
- Base Model: Built upon the Qwen1.5-0.5B architecture, which is a transformer-based decoder-only model.
- Training Data: Utilizes the Teknium OpenHermes-2.5 dataset together with additional datasets from Cognitive Computations.
- Dealigned Finetune: Tuned for chat interactions using a 'dealigned' approach, i.e. a deliberate departure from strict alignment to allow a wider range of conversational styles (a usage sketch follows this list).
- Context Length: Supports a stable context length of 32,768 tokens.
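To make the chat usage concrete, below is a minimal loading-and-generation sketch using Hugging Face transformers. It assumes this finetune retains Qwen1.5's standard chat template, which the tokenizer applies automatically; the prompt content and sampling settings are illustrative, not recommended defaults.

```python
# A minimal chat sketch with Hugging Face transformers. It assumes this
# finetune keeps the standard Qwen1.5 chat template; the prompt and
# generation settings below are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RESMPDEV/Qwen1.5-Wukong-0.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user", "content": "Summarize the plot of Journey to the West in two sentences."},
]
# apply_chat_template renders the turn list into the model's prompt format
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Strip the prompt tokens and decode only the newly generated reply
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Since the base model supports positions up to 32,768 tokens, long prompts can be passed the same way, memory permitting.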
Performance
On the Open LLM Leaderboard, the model scores an average of 38.15 across six benchmarks (a reproduction sketch using the evaluation harness follows the list):
- AI2 Reasoning Challenge (25-shot): 31.74
- HellaSwag (10-shot): 47.78
- MMLU (5-shot): 38.44
- TruthfulQA (0-shot): 38.92
- Winogrande (5-shot): 56.51
- GSM8K (5-shot): 15.54
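These numbers should be broadly reproducible with EleutherAI's lm-evaluation-harness, which backs the Open LLM Leaderboard. Below is a minimal sketch using the harness's Python entry point (v0.4+); the leaderboard pins its own harness version and task configurations, so exact scores may differ slightly, and the task name and batch size here are illustrative.

```python
# A minimal evaluation sketch with EleutherAI's lm-evaluation-harness
# (pip install lm-eval, v0.4+). The leaderboard pins specific harness
# versions and task configs, so results may not match to the decimal.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=RESMPDEV/Qwen1.5-Wukong-0.5B",
    tasks=["hellaswag"],  # 10-shot HellaSwag, mirroring the leaderboard setting
    num_fewshot=10,
    batch_size=8,         # illustrative; size to available memory
)
print(results["results"]["hellaswag"])  # acc_norm is the leaderboard's reported metric
```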
Use Cases
This model suits developers who want a compact, specialized chat model derived from the Qwen1.5 series, particularly for applications where a 'dealigned' conversational style is desired.