sampluralis/llama-sft-baseline

Text generation · Model size: 1B · Quant: BF16 · Context length: 32k · Published: Mar 9, 2026 · Architecture: Transformer · Concurrency cost: 1

sampluralis/llama-sft-baseline is a 1-billion-parameter language model fine-tuned from gshasiri/SmolLM3-Mid. Developed by sampluralis, it was trained with the TRL library using supervised fine-tuning (SFT). The model is designed for general text generation, particularly conversational responses, and supports a context length of 32,768 tokens.


Model Overview

sampluralis/llama-sft-baseline was fine-tuned from the gshasiri/SmolLM3-Mid base model by sampluralis using the Hugging Face TRL (Transformer Reinforcement Learning) library's supervised fine-tuning (SFT) pipeline. Its 32,768-token context window allows it to process long inputs and generate coherent, extended responses.
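A minimal quick-start sketch using the `transformers` pipeline API. The generation parameters and prompt are illustrative assumptions, not values taken from the model card:

```python
# Quick-start sketch (assumes the `transformers` library is installed).

def build_chat(question: str) -> list[dict]:
    """Wrap a user question in the chat-message format that the
    text-generation pipeline expects for chat-tuned models."""
    return [{"role": "user", "content": question}]

if __name__ == "__main__":
    # Heavy import and model download kept behind the guard.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="sampluralis/llama-sft-baseline",
    )
    # max_new_tokens is an illustrative choice, well under the 32k context.
    out = generator(build_chat("What is supervised fine-tuning?"),
                    max_new_tokens=128)
    print(out[0]["generated_text"])
```

The helper keeps the chat formatting separate from the pipeline call, so the same messages can be reused with `apply_chat_template` if you load the tokenizer and model manually instead.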

Key Capabilities

  • General Text Generation: Capable of generating human-like text based on given prompts.
  • Conversational AI: Generates responses to interactive questions in a dialogue setting.
  • Fine-tuned Performance: Leverages SFT to adapt the base model for improved performance on specific tasks.

Training Details

The model was trained with the TRL library (v0.28.0) alongside Transformers 4.57.6, PyTorch 2.6.0+cu126, Datasets 4.6.0, and Tokenizers 0.22.2. Training runs were logged to Weights & Biases and can be visualized there.
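Assuming a CUDA 12.6 environment, the pinned versions above could be installed roughly as follows (the PyTorch wheel index URL is the standard cu126 index, not something stated in the card):

```shell
# Pinned training stack, matching the versions listed in the card.
pip install trl==0.28.0 transformers==4.57.6 datasets==4.6.0 tokenizers==0.22.2
# CUDA 12.6 build of PyTorch from the official wheel index.
pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu126
```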

Good For

  • Prototyping: Suitable for developers looking for a smaller, fine-tuned model for initial experimentation.
  • Conversational Agents: Can be integrated into applications requiring basic question-answering or dialogue generation.
  • Further Fine-tuning: Serves as a solid baseline for additional domain-specific fine-tuning due to its SFT foundation.
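Since the card positions the model as a baseline for further fine-tuning, a domain-specific continuation run could be sketched with TRL's `SFTTrainer`. The dataset file, field names, and output directory below are placeholders, and the training arguments are illustrative:

```python
# Hedged sketch: continuing SFT from this checkpoint with TRL.
# Assumes `trl` and `datasets` are installed and the JSON file exists.

def to_prompt_completion(example: dict) -> dict:
    """Map a raw {question, answer} record into the prompt/completion
    schema that SFTTrainer accepts for instruction data."""
    return {"prompt": example["question"], "completion": example["answer"]}

if __name__ == "__main__":
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    # Placeholder dataset: one JSON record per {question, answer} pair.
    dataset = load_dataset("json", data_files="my_domain_data.json",
                           split="train")
    dataset = dataset.map(to_prompt_completion)

    trainer = SFTTrainer(
        model="sampluralis/llama-sft-baseline",  # start from the baseline
        train_dataset=dataset,
        # max_length chosen well under the model's 32k context for memory.
        args=SFTConfig(output_dir="llama-sft-domain", max_length=4096),
    )
    trainer.train()
```

Passing the model as a repo-id string lets `SFTTrainer` handle loading; for custom quantization or LoRA adapters you would load the model yourself and pass the object instead.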