Model Overview
ethicalabs/Kurtis-E1.1-Qwen2.5-3B-Instruct is a 3.09-billion-parameter instruction-tuned model based on the Qwen2.5 architecture. It was fine-tuned with the Flower framework to improve performance across general language understanding and reasoning tasks. The model supports a 32,768-token context length, enabling it to process and generate long sequences of text.
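Like other Qwen2.5-family instruct models, it expects a ChatML-style prompt. The sketch below assembles such a prompt by hand purely for illustration; in practice you would call the tokenizer's `apply_chat_template`, and the exact special tokens shown are assumptions based on the Qwen2.5 family rather than something stated in this card.

```python
def build_chat_prompt(messages):
    """Assemble a ChatML-style prompt as used by Qwen2.5-family models.

    `messages` is a list of {"role": ..., "content": ...} dicts.
    """
    parts = []
    for msg in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chat_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the Qwen2.5 architecture in one sentence."},
])
print(prompt)
```

The string returned here would be tokenized and fed to the model; using `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` achieves the same result with the template shipped alongside the model.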
Key Capabilities & Performance
Evaluated using the LM Evaluation Harness, Kurtis-E1.1-Qwen2.5-3B-Instruct demonstrates solid performance on several academic benchmarks:
- MMLU (Massive Multitask Language Understanding): Achieves an overall accuracy of 65.22% (0-shot) and 66.29% (5-shot), indicating proficiency in diverse knowledge domains including humanities, social sciences, and STEM.
- ARC-Easy: Scores 77.10% accuracy, demonstrating competence in elementary-level science question answering.
- HellaSwag: Reaches 74.12% normalized accuracy, reflecting its common-sense reasoning capabilities.
- ARC-Challenge: Achieves 44.8% normalized accuracy, demonstrating moderate performance on more difficult science questions.
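Results like these can be reproduced with the LM Evaluation Harness CLI. The invocation below is illustrative only: exact task names, flags, and few-shot settings depend on the installed lm-eval version and on how the card's numbers were originally produced.

```shell
lm_eval --model hf \
  --model_args pretrained=ethicalabs/Kurtis-E1.1-Qwen2.5-3B-Instruct,dtype=bfloat16 \
  --tasks mmlu,arc_easy,arc_challenge,hellaswag \
  --num_fewshot 0 \
  --batch_size auto
```

Rerunning MMLU with `--num_fewshot 5` would correspond to the 5-shot figure reported above.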
Good For
- General-purpose instruction following: Its instruction-tuned nature makes it suitable for a wide array of conversational and task-oriented applications.
- Knowledge-intensive tasks: Performance on MMLU suggests utility in applications requiring broad factual knowledge and reasoning.
- Resource-constrained environments: At 3.09 billion parameters, it balances capability and efficiency, making it viable for deployment where larger models would be impractical.
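For sizing such deployments, a rough back-of-envelope estimate of the weight memory footprint at common precisions can help. This is only a sketch based on the parameter count stated above; actual usage also includes the KV cache, activations, and framework overhead.

```python
PARAMS = 3.09e9  # parameter count stated in the model overview

def weight_memory_gib(params, bytes_per_param):
    """Approximate memory needed just to hold the model weights."""
    return params * bytes_per_param / 2**30

# Common precisions: 16-bit floats, 8-bit ints, 4-bit quantization.
for label, bpp in [("fp16/bf16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    print(f"{label}: ~{weight_memory_gib(PARAMS, bpp):.1f} GiB")
```

In fp16/bf16 the weights alone come to roughly 5.8 GiB, which is why 8-bit or 4-bit quantization is often used to fit models of this size on consumer GPUs.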