RJTPP/scot0402s-deepseek-1.5b-full

Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Context Length: 32k · Published: Apr 7, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

RJTPP/scot0402s-deepseek-1.5b-full is a 1.5 billion parameter model finetuned by RJTPP from DeepSeek-R1-Distill-Qwen-1.5B, a distilled model built on the Qwen2 architecture. It was trained with Unsloth and Hugging Face's TRL library, a combination reported to enable 2x faster training, and is designed for general language tasks.


Model Overview

RJTPP/scot0402s-deepseek-1.5b-full is a 1.5 billion parameter language model finetuned by RJTPP. It is based on DeepSeek-R1-Distill-Qwen-1.5B, which itself builds on the Qwen2 architecture, placing this model in the Qwen2 family.

Key Characteristics

  • Efficient Training: The model was finetuned with Unsloth and Hugging Face's TRL library, a combination reported to train roughly 2x faster than a standard setup (a representative workflow is sketched after this list).
  • Parameter Count: At 1.5 billion parameters (roughly 3 GB of weights in BF16), it is small enough to run on a single consumer GPU while retaining useful capability.
  • Context Length: The model supports a context length of 32768 tokens, allowing it to process and generate longer sequences of text.
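
The exact finetuning recipe is not published in this card. As a rough illustration of the Unsloth + TRL workflow named above, the sketch below shows a minimal full supervised finetune starting from the stated base model; the dataset file (train.jsonl, assumed to contain a "text" field), the hyperparameters, and the output directory are hypothetical placeholders, not the authors' settings.

```python
# Hypothetical sketch of an Unsloth + TRL full finetune; not the authors' recipe.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Unsloth patches the model for its reported ~2x training speedup.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    max_seq_length=32768,   # matches the 32k context length above
    dtype=None,             # auto-selects BF16 on supported GPUs
    load_in_4bit=False,     # full finetune rather than QLoRA
)

# Placeholder dataset: a JSONL file with a "text" field per example.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,    # named `processing_class` in recent TRL versions
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```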

Potential Use Cases

This model is suitable for a variety of general language understanding and generation tasks where a compact yet capable model is desired. Its small footprint and fast training make it a reasonable candidate for applications requiring rapid iteration or deployment in resource-constrained environments.
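
As a starting point, the following is a minimal sketch of loading the model for text generation with Hugging Face's transformers library, using BF16 as listed in the metadata above; the prompt and generation settings are illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RJTPP/scot0402s-deepseek-1.5b-full"

# Load the tokenizer and model in BF16, matching the published quantization.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat-style prompt; the chat template is inherited from the base
# DeepSeek-R1-Distill-Qwen-1.5B tokenizer.
messages = [{"role": "user", "content": "Explain what a context window is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```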