Abhaykoul/Qwen1.5-0.5B-vortex
TEXT GENERATIONConcurrency Cost:1Model Size:0.6BQuant:BF16Ctx Length:32kPublished:Mar 11, 2024License:tongyi-qianwen-researchArchitecture:Transformer0.0K Warm
Abhaykoul/Qwen1.5-0.5B-vortex is a 0.6 billion parameter Qwen1.5-based chat model, fine-tuned by Abhaykoul. This model is a dealigned chat finetune of the original Qwen1.5-0.5B, trained on the Vortex mini dataset. It offers a compact solution for chat-oriented applications, maintaining competitive performance across various benchmarks for its size. Its primary strength lies in providing a small, efficient chat model derived from the Qwen family.
Loading preview...
Abhaykoul/Qwen1.5-0.5B-vortex: Dealigned Chat Finetune
This model, developed by Abhaykoul, is a 0.6 billion parameter chat-finetuned variant of the Qwen1.5-0.5B base model. It has been specifically 'dealigned' and trained on the Vortex mini dataset over 5 epochs using axolotl, aiming for specialized chat interactions.
Key Characteristics & Performance
- Base Model: Derived from the robust Qwen1.5-0.5B architecture.
- Training: Fine-tuned for chat on the Vortex mini dataset.
- Parameter Count: A compact 0.6 billion parameters, making it efficient for deployment.
- Context Length: Supports a substantial context window of 32768 tokens.
- Benchmark Performance: While being a dealigned version, it maintains competitive average scores compared to its base model and other 0.5B variants across benchmarks like ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8k.
Good For
- Resource-constrained chat applications: Its small size makes it suitable for environments where computational resources are limited.
- Experimental chat deployments: Ideal for exploring dealigned chat model behaviors or specific conversational styles.
- Rapid prototyping: Enables quick iteration for chat-based features due to its efficiency.