gshasiri/SmolLM3-SFT

  • Task: Text Generation
  • Model Size: 1B parameters
  • Quantization: BF16
  • Context Length: 32k tokens
  • Concurrency Cost: 1
  • Published: Nov 17, 2025
  • Architecture: Transformer
  • Hosted on: Hugging Face

SmolLM3-SFT by gshasiri is a 1-billion-parameter instruction-tuned causal language model, fine-tuned from gshasiri/SmolLM3-Mid with the TRL framework. It is optimized for conversational AI and instruction following, and its compact size makes for efficient deployment. With a 32,768-token context length, it suits applications that need to process longer prompts and generate coherent, extended responses.


Overview

SmolLM3-SFT is a 1 billion parameter instruction-tuned language model developed by gshasiri. It is a fine-tuned variant of the gshasiri/SmolLM3-Mid base model, specifically trained using the TRL (Transformer Reinforcement Learning) framework to enhance its instruction-following capabilities.
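A quick way to try the model is through the transformers text-generation pipeline, which applies the repository's chat template to a list of messages. The sketch below is illustrative: the prompt and generation settings are arbitrary, and it assumes the repo ships the standard tokenizer and chat-template files inherited from its SmolLM3 base.

```python
# Minimal quickstart sketch; prompt and sampling settings are illustrative,
# not recommendations from the model card.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="gshasiri/SmolLM3-SFT",
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize what supervised fine-tuning does."}]
output = generator(messages, max_new_tokens=256)
# The pipeline returns the full conversation; the last message is the reply.
print(output[0]["generated_text"][-1]["content"])
```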

Key Capabilities

  • Instruction Following: Optimized for understanding and responding to user instructions and prompts.
  • Conversational AI: Designed to generate coherent, contextually relevant text in dialogue-based scenarios (see the chat sketch after this list).
  • Extended Context: Supports a context length of 32,768 tokens, allowing longer inputs to be processed and longer outputs to be generated.
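For finer control over dialogue, the chat template can be applied by hand and decoding parameters set explicitly. As above, this is a sketch under the assumption that the model inherits SmolLM3's chat template; the conversation content and sampling settings are made up for illustration.

```python
# Chat sketch using the tokenizer's chat template directly.
# Assumes a chat template inherited from the SmolLM3 base.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gshasiri/SmolLM3-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Give me three uses for a 32k-token context window."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The 32,768-token context leaves ample room for long prompts plus the reply.
generated = model.generate(inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
reply = tokenizer.decode(generated[0][inputs.shape[-1]:], skip_special_tokens=True)
print(reply)
```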

Training Details

The model underwent Supervised Fine-Tuning (SFT) with the TRL library, training it on instruction-style data to align its outputs with human instructions and preferences. The training stack was TRL 0.25.0, Transformers 4.57.1, and PyTorch 2.6.0+cu126.
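The card does not say which datasets were used, but an SFT run on this stack typically follows TRL's SFTTrainer pattern, sketched below. The dataset and hyperparameters here are placeholders; only the base model name and library versions come from the card.

```python
# Illustrative SFT setup with TRL's SFTTrainer; not the author's actual recipe.
# The dataset is a stand-in -- the real training data is undocumented.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")  # placeholder dataset

config = SFTConfig(
    output_dir="SmolLM3-SFT",
    max_length=32768,               # match the model's context window (assumed)
    per_device_train_batch_size=1,  # hyperparameters below are illustrative
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    num_train_epochs=1,
    bf16=True,
)

trainer = SFTTrainer(
    model="gshasiri/SmolLM3-Mid",   # the stated base model
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

With a string model id, SFTTrainer loads the model and tokenizer itself; passing a preloaded model object works equally well.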