haidaridhan/deepseek_instruct_final

TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:May 19, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

The haidaridhan/deepseek_instruct_final is a 1.5 billion parameter instruction-tuned Qwen2 model developed by haidaridhan, fine-tuned using Unsloth and Huggingface's TRL library. This model features a 32768 token context length and is optimized for efficient training, making it suitable for applications requiring a compact yet capable language model. Its development with Unsloth suggests a focus on faster fine-tuning and deployment.

Loading preview...

Overview

The haidaridhan/deepseek_instruct_final is a 1.5 billion parameter instruction-tuned Qwen2 model, developed by haidaridhan. It was fine-tuned from unsloth/deepseek-r1-distill-qwen-1.5b-unsloth-bnb-4bit using the Unsloth library and Huggingface's TRL library. This combination allowed for significantly faster training, specifically noted as 2x faster.

Key Capabilities

  • Efficient Training: Leverages Unsloth for accelerated fine-tuning, making it resource-efficient for development.
  • Instruction Following: Designed to respond to instructions effectively, suitable for various NLP tasks.
  • Compact Size: At 1.5 billion parameters, it offers a balance between performance and computational footprint.
  • Extended Context: Supports a substantial context length of 32768 tokens, allowing for processing longer inputs.

Good For

  • Developers looking for a compact, instruction-tuned model.
  • Applications where faster fine-tuning and deployment are critical.
  • Tasks requiring a model with a decent context window for processing detailed prompts.