Juhaann20/DeepSeek-R1-Distill-Qwen-7B-LoRA-Task

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:May 9, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

Juhaann20/DeepSeek-R1-Distill-Qwen-7B-LoRA-Task is a 7.6 billion parameter Qwen2-based causal language model developed by Juhaann20. This model was fine-tuned using Unsloth and Huggingface's TRL library, enabling faster training. It is designed for general language tasks, leveraging its Qwen2 architecture and efficient fine-tuning process.

Loading preview...

Model Overview

Juhaann20/DeepSeek-R1-Distill-Qwen-7B-LoRA-Task is a 7.6 billion parameter language model based on the Qwen2 architecture. It was developed by Juhaann20 and fine-tuned from the unsloth/DeepSeek-R1-Distill-Qwen-7B-bnb-4bit model.

Key Characteristics

  • Architecture: Qwen2-based causal language model.
  • Parameter Count: 7.6 billion parameters.
  • Context Length: Supports a context length of 32768 tokens.
  • Efficient Fine-tuning: The model was fine-tuned using Unsloth and Huggingface's TRL library, which facilitated a 2x faster training process.

Use Cases

This model is suitable for a variety of general language understanding and generation tasks, benefiting from its Qwen2 foundation and efficient fine-tuning. Its 32K context window makes it capable of processing longer inputs and generating more coherent, extended responses.