agentlans/Qwen2.5-0.5B-Instruct-CrashCourse-dropout
Text generation · Concurrency cost: 1 · Model size: 0.5B · Quantization: BF16 · Context length: 32k · Published: Jan 1, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

agentlans/Qwen2.5-0.5B-Instruct-CrashCourse-dropout is a 0.5 billion parameter instruction-tuned causal language model, fine-tuned by agentlans from the Qwen/Qwen2.5-0.5B-Instruct base model. It is adapted for instructional and multitask scenarios, drawing on the 'agentlans/crash-course' and 'vicgalle/configurable-system-prompt-multitask' datasets. The model targets question answering over crash course materials and diverse instruction formats within its 32,768-token context window.


Overview

This model, agentlans/Qwen2.5-0.5B-Instruct-CrashCourse-dropout, is a fine-tuned variant of the Qwen/Qwen2.5-0.5B-Instruct base model, developed by agentlans. It has been specifically adapted to improve its capabilities in handling diverse instructional and multitask scenarios. The fine-tuning process utilized two distinct datasets: agentlans/crash-course and vicgalle/configurable-system-prompt-multitask, aiming to enhance its instruction-following and configurable system prompt capabilities.

Key Capabilities

  • Instruction Following: Designed to respond effectively to various instruction formats.
  • Crash Course Material Q&A: Optimized for answering questions related to crash course content.
  • Multitask Scenarios: Capable of handling configurable system prompts for diverse tasks.
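The capabilities above can be exercised through the standard Hugging Face transformers chat interface. The sketch below is a minimal, hedged example: the model ID comes from this card, but the generation settings, the example system prompt, and the helper names (`build_messages`, `ask`) are illustrative assumptions rather than anything documented by the model's author.

```python
MODEL_ID = "agentlans/Qwen2.5-0.5B-Instruct-CrashCourse-dropout"


def build_messages(system_prompt: str, question: str) -> list[dict]:
    """Assemble a chat in the role/content format Qwen2.5's chat template expects.

    The configurable system prompt is what the multitask fine-tuning dataset
    is built around, so it is exposed as an explicit argument here.
    """
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def ask(question: str, system_prompt: str = "You are a concise study tutor.") -> str:
    """Generate a reply; sampling settings here are assumptions, not from the card."""
    # Heavyweight dependencies are imported lazily so build_messages stays
    # importable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

    inputs = tokenizer.apply_chat_template(
        build_messages(system_prompt, question),
        add_generation_prompt=True,
        return_tensors="pt",
    )
    outputs = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Varying the `system_prompt` argument is how the configurable-system-prompt behavior described above would be exercised, e.g. switching between a tutoring persona and a terse Q&A persona for the same question.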

Performance and Limitations

Despite its specialized fine-tuning, the model's Open LLM Leaderboard results show a slight decrease in average score (7.74% versus the base model's 8.38%). It does, however, improve on GPQA (0-shot), scoring 1.79% against the base model's 1.23%. The developer notes that although the benchmark numbers are lower, the model handles moderately complex prompts adequately, suggesting room for further fine-tuning. As with all language models, users should be aware of potential biases inherited from the training data and the possibility of inaccurate outputs.