Hariharan05/Qwen3-1.7B-Distill-Claude

TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 19, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

Hariharan05/Qwen3-1.7B-Distill-Claude is a 1.7 billion parameter causal language model, fine-tuned from Qwen/Qwen3-1.7B. It is optimized for instruction-following and high-quality response generation, leveraging a distilled Claude-Alpaca dataset. This model is provided in lightweight GGUF formats for efficient local inference and excels at general conversational AI tasks.

Loading preview...

Model Overview

This model, Hariharan05/Qwen3-1.7B-Distill-Claude, is a fine-tuned version of the Qwen/Qwen3-1.7B base model, featuring 1.7 billion parameters. It has been specifically trained to enhance instruction-following capabilities and generate high-quality responses, drawing inspiration from Claude-Alpaca datasets.

Key Features & Training

  • Base Model: Qwen/Qwen3-1.7B, a causal language model.
  • Parameter Count: 1.7 Billion, with 17.4 million trainable parameters via LoRA.
  • Training Data: Fine-tuned on a blend of 30,000 instruction-following examples from Norquinal/WizardLM_alpaca_claude_evol_instruct_70k and AlSamCur123/Alpaca.
  • Optimization: Training utilized EasyFineTuner and Unsloth for accelerated training and optimized memory usage.
  • Format: Available in lightweight GGUF formats (q4_k_m and q5_k_m) and LoRA Adapters, suitable for local inference.
  • Performance: Achieved a final training loss of 1.1782 after 1 epoch, with training completed in approximately 3.3 hours on a single Tesla T4 GPU.

Use Cases

This model is well-suited for applications requiring a compact yet capable instruction-following assistant. Its design for local inference makes it ideal for:

  • General conversational AI.
  • Answering coding questions and explaining technical concepts.
  • Engaging in friendly dialogue where concise yet thorough explanations are valued.