sampluralis/llama-sft-muon

Text Generation · Model Size: 1B · Quant: BF16 · Context Length: 32k · Architecture: Transformer · Published: Mar 5, 2026

sampluralis/llama-sft-muon is a fine-tuned language model based on gshasiri/SmolLM3-Mid, developed by sampluralis and trained with Supervised Fine-Tuning (SFT) using the TRL library. It targets general text generation tasks, with the fine-tuning step intended to improve conversational quality, and it integrates directly with the Hugging Face `pipeline` API for quick deployment.


Model Overview

sampluralis/llama-sft-muon is derived from the gshasiri/SmolLM3-Mid base model. Developed by sampluralis, it was trained with Supervised Fine-Tuning (SFT) using the Hugging Face TRL (Transformer Reinforcement Learning) library. The training aimed to enhance its text generation capabilities, making it suitable for conversational and generative AI applications.

Key Capabilities

  • General Text Generation: Capable of generating coherent and contextually relevant text based on user prompts.
  • Fine-tuned Performance: Benefits from SFT on its base model, suggesting improved response quality and adherence to instructions.
  • Ease of Use: Provides a straightforward integration with the Hugging Face pipeline for quick deployment in text generation tasks.
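The pipeline integration mentioned above can be sketched as follows. This is a minimal quick start, not an official snippet from the repository; the prompt and generation parameters are illustrative:

```python
from transformers import pipeline

# Load the fine-tuned model; recent transformers versions accept
# chat-style message lists directly in the text-generation pipeline.
generator = pipeline(
    "text-generation",
    model="sampluralis/llama-sft-muon",
)

messages = [
    {"role": "user", "content": "Explain supervised fine-tuning in one sentence."}
]

# max_new_tokens is an illustrative choice, not a card-specified value.
output = generator(messages, max_new_tokens=128)
print(output[0]["generated_text"])
```

With the 32k context length noted above, longer prompts (multi-turn histories, documents) can be passed in the same message format.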

Training Details

The model was trained using SFT, leveraging specific versions of key frameworks:

  • TRL: 0.28.0
  • Transformers: 4.57.6
  • PyTorch: 2.6.0+cu126
  • Datasets: 4.6.0
  • Tokenizers: 0.22.2

Pinning these framework versions makes the training environment reproducible. Training curves can be inspected on Weights & Biases, linked from the original repository.