Model Overview
sampluralis/llama-sft-baseline is a 1 billion parameter language model fine-tuned from the gshasiri/SmolLM3-Mid base model. It was developed by sampluralis and trained with the Hugging Face TRL (Transformer Reinforcement Learning) library using Supervised Fine-Tuning (SFT). The model supports a context length of 32768 tokens, allowing it to process long inputs and generate coherent, extended responses.
Key Capabilities
- General Text Generation: Capable of generating human-like text based on given prompts.
- Conversational AI: Can generate responses to user questions, as shown in the quick start example.
- Fine-tuned Performance: Leverages SFT to adapt the base model for improved performance on specific tasks.
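The conversational use described above can be sketched with the standard Transformers API. This is a minimal sketch, assuming the model ships a chat template; the function names and generation parameters below are illustrative, not taken from the model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "sampluralis/llama-sft-baseline"

def build_chat(question: str) -> list[dict]:
    # Wrap a single user question in the message format expected by apply_chat_template.
    return [{"role": "user", "content": question}]

def generate_reply(question: str, max_new_tokens: int = 256) -> str:
    # Load the model and generate a reply (illustrative defaults, not from the model card).
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    input_ids = tokenizer.apply_chat_template(
        build_chat(question), add_generation_prompt=True, return_tensors="pt"
    )
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
```

For example, `generate_reply("What is supervised fine-tuning?")` returns the model's answer as a plain string.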
Training Details
The model was trained with the TRL library (version 0.28.0) alongside Transformers 4.57.6, PyTorch 2.6.0+cu126, Datasets 4.6.0, and Tokenizers 0.22.2. Training runs were logged to Weights & Biases.
Good For
- Prototyping: Suitable for developers looking for a smaller, fine-tuned model for initial experimentation.
- Conversational Agents: Can be integrated into applications requiring basic question-answering or dialogue generation.
- Further Fine-tuning: Serves as a solid baseline for additional domain-specific fine-tuning due to its SFT foundation.