pawin205/Qwen-7B-REMOR-SFT-no-think

Text Generation · Model Size: 7.6B · Quantization: FP8 · Context Length: 32K · Published: Apr 15, 2026 · Architecture: Transformer

pawin205/Qwen-7B-REMOR-SFT-no-think is a 7.6-billion-parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-7B via Supervised Fine-Tuning (SFT) with the TRL library. The model targets general text generation, combining a Qwen-based architecture with a 32K-token context length, and provides a solid foundation for applications that require instruction following.


Model Overview

pawin205/Qwen-7B-REMOR-SFT-no-think was fine-tuned from the deepseek-ai/DeepSeek-R1-Distill-Qwen-7B base model using Supervised Fine-Tuning (SFT) with TRL (Transformer Reinforcement Learning), Hugging Face's library for post-training language models.

Key Capabilities

  • Instruction Following: The model generates text conditioned on user prompts, making it suitable for a range of instruction-tuned applications (a loading sketch follows this list).
  • Qwen Architecture: Built on the Qwen model family through its DeepSeek-R1-Distill-Qwen-7B base, from which it inherits its core capabilities.
  • Context Length: Supports a substantial context window of 32,768 tokens, allowing for processing longer inputs and generating more coherent, extended responses.
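
As a sketch of typical usage, the model can be loaded through the Hugging Face transformers library like any other Qwen-based causal LM. The dtype, prompt, and sampling parameters below are illustrative assumptions, not settings documented for this model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pawin205/Qwen-7B-REMOR-SFT-no-think"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights; the card lists an FP8 quant for serving
    device_map="auto",
)

# Format a chat-style prompt with the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "Explain supervised fine-tuning in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Sampling parameters here are placeholders, not recommended values.
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

At 7.6B parameters, bf16 weights alone occupy roughly 15 GB of accelerator memory; the published FP8 quantization roughly halves that footprint.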

Training Details

The model was trained using SFT, with training runs logged to Weights & Biases. The development environment included the following library versions (a hedged training sketch follows the list):

  • TRL: 0.24.0
  • Transformers: 4.57.1
  • PyTorch: 2.8.0+cu129
  • Datasets: 4.3.0
  • Tokenizers: 0.22.1
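
For context, a minimal SFT run with TRL's SFTTrainer looks like the sketch below. The dataset, hyperparameters, and output path are placeholders, since the actual training recipe for this model is not published; only the base model name and the Weights & Biases logging come from this card.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset from the TRL examples; the real training data is not documented.
dataset = load_dataset("trl-lib/Capybara", split="train")

config = SFTConfig(
    output_dir="Qwen-7B-REMOR-SFT-no-think",  # hypothetical output path
    per_device_train_batch_size=2,            # assumption
    gradient_accumulation_steps=8,            # assumption
    learning_rate=2e-5,                       # assumption
    num_train_epochs=1,                       # assumption
    report_to="wandb",                        # matches the Weights & Biases logging noted above
)

trainer = SFTTrainer(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",  # the documented base model
    args=config,
    train_dataset=dataset,
)
trainer.train()
```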

Good For

  • General text generation tasks.
  • Applications requiring a model with a large context window.
  • Developers looking for a Qwen-based model fine-tuned for instruction following.