Shellypeckie/student_qwen3_1p7b_gpqa_self_dolly_seq_kd

TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:May 20, 2026Architecture:Transformer Cold

The Shellypeckie/student_qwen3_1p7b_gpqa_self_dolly_seq_kd model is a 1.7 billion parameter language model, fine-tuned from Qwen/Qwen3-1.7B using Supervised Fine-Tuning (SFT) with the TRL framework. This model is designed for general text generation tasks, leveraging its 32K context length for processing longer inputs. Its training methodology focuses on adapting the base Qwen3 architecture for improved conversational and instruction-following capabilities.

Loading preview...

Model Overview

This model, student_qwen3_1p7b_gpqa_self_dolly_seq_kd, is a 1.7 billion parameter language model derived from the Qwen/Qwen3-1.7B base architecture. It has been fine-tuned using Supervised Fine-Tuning (SFT) with the TRL framework, indicating a focus on adapting the model for specific instruction-following or conversational tasks.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen3-1.7B.
  • Training Method: Utilizes Supervised Fine-Tuning (SFT) for instruction alignment.
  • Framework: Trained using the TRL (Transformers Reinforcement Learning) library.
  • Context Length: Supports a context window of 32,768 tokens, enabling processing of substantial input lengths.

Use Cases

This model is suitable for general text generation tasks where a smaller, efficient model with good instruction-following capabilities is desired. Its fine-tuning process suggests potential for applications requiring:

  • Conversational AI: Generating responses in dialogue systems.
  • Instruction Following: Executing commands or answering questions based on provided instructions.
  • Text Completion: Assisting with creative writing or content generation.