ethicalabs/Kurtis-E1.1-Qwen2.5-3B-Instruct

Hosted on Hugging Face

Text Generation · Concurrency Cost: 1 · Model Size: 3.1B · Quant: BF16 · Ctx Length: 32k · Published: Mar 30, 2025 · License: MIT · Architecture: Transformer · Open Weights

ethicalabs/Kurtis-E1.1-Qwen2.5-3B-Instruct is a 3.09-billion-parameter instruction-tuned causal language model based on the Qwen2.5 architecture and fine-tuned with the Flower framework. It performs well across general-knowledge and reasoning benchmarks, including MMLU (65.22%) and ARC-Easy (77.10%). With a context length of 32,768 tokens, it suits applications that need robust text understanding and generation in a compact footprint.


Model Overview

ethicalabs/Kurtis-E1.1-Qwen2.5-3B-Instruct is a 3.09 billion parameter instruction-tuned model built upon the Qwen2.5 architecture. It has been fine-tuned using the Flower framework, aiming to enhance its performance across a range of general language understanding and reasoning tasks. The model supports a substantial context length of 32768 tokens, allowing for processing and generating longer sequences of text.
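A minimal inference sketch using the Hugging Face `transformers` library (assuming `transformers` and a PyTorch backend are installed; the system prompt and user question below are illustrative, not part of the model card):

```python
MODEL_ID = "ethicalabs/Kurtis-E1.1-Qwen2.5-3B-Instruct"

def build_messages(user_prompt: str,
                   system_prompt: str = "You are a helpful assistant.") -> list[dict]:
    """Assemble a chat message list in the shape expected by apply_chat_template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

def main() -> None:
    # Imported here so the helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )

    messages = build_messages("Explain the difference between RAM and cache.")
    # Render the conversation with the model's own chat template.
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True))

if __name__ == "__main__":
    main()
```

The model will be downloaded from the Hub on first use; `device_map="auto"` places weights on a GPU when one is available.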

Key Capabilities & Performance

Evaluated using the LM Evaluation Harness, Kurtis-E1.1-Qwen2.5-3B-Instruct demonstrates solid performance on several academic benchmarks:

  • MMLU (Massive Multitask Language Understanding): Achieves an overall accuracy of 65.22% (0-shot) and 66.29% (5-shot), indicating proficiency in diverse knowledge domains including humanities, social sciences, and STEM.
  • ARC-Easy: Scores 77.10% accuracy, showcasing its ability in elementary science question answering.
  • HellaSwag: Reaches 74.12% normalized accuracy, reflecting its common-sense reasoning capabilities.
  • ARC-Challenge: Achieves 44.8% normalized accuracy, demonstrating moderate performance on more difficult science questions.
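The benchmarks above can be reproduced with the LM Evaluation Harness; a command sketch, assuming `lm-eval` is installed and the flags match your installed version:

```shell
# Hypothetical invocation; adjust tasks/flags to your lm-eval version.
MODEL_ID="ethicalabs/Kurtis-E1.1-Qwen2.5-3B-Instruct"
lm_eval --model hf \
  --model_args "pretrained=${MODEL_ID},dtype=bfloat16" \
  --tasks mmlu,arc_easy,hellaswag,arc_challenge \
  --num_fewshot 0 \
  --batch_size auto
```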

Good For

  • General-purpose instruction following: Its instruction-tuned nature makes it suitable for a wide array of conversational and task-oriented applications.
  • Knowledge-intensive tasks: Performance on MMLU suggests utility in applications requiring broad factual knowledge and reasoning.
  • Resource-constrained environments: As a 3.09 billion parameter model, it offers a balance of capability and efficiency, making it viable for deployment where larger models might be impractical.
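A rough back-of-the-envelope sketch of why the model fits constrained hardware: weight memory scales with parameter count times bytes per parameter (this estimate covers weights only and ignores the KV cache and activations):

```python
def model_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Rough weight-only memory estimate in GiB."""
    return n_params * bytes_per_param / 1024**3

# 3.09B parameters at BF16 (2 bytes/param) vs. a 4-bit quant (~0.5 bytes/param).
bf16 = model_memory_gib(3.09e9, 2)
int4 = model_memory_gib(3.09e9, 0.5)
print(f"BF16 ~= {bf16:.1f} GiB, 4-bit ~= {int4:.1f} GiB")
# -> BF16 ~= 5.8 GiB, 4-bit ~= 1.4 GiB
```

At BF16 the weights alone need roughly 5.8 GiB, which is why a 3B-class model is deployable on a single consumer GPU where larger models are not.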