Haldi247/TinyLlama-SFT-Alpaca

TEXT GENERATIONConcurrency Cost:1Model Size:1.1BQuant:BF16Ctx Length:2kPublished:Jun 5, 2026Architecture:Transformer Cold

Haldi247/TinyLlama-SFT-Alpaca is a 1.1 billion parameter causal language model developed by Hadeeqa Al Islam, fine-tuned from TinyLlama/TinyLlama-1.1B-Chat-v1.0. It was trained using Supervised Fine-Tuning (SFT) with LoRA on the yahma/alpaca-cleaned dataset, featuring a 2048-token context length. This model is optimized for instruction-following tasks, demonstrating an average BLEU score of 0.4303 and BERTScore of 0.7236 on evaluation metrics.

Loading preview...

TinyLlama-SFT-Alpaca Overview

Haldi247/TinyLlama-SFT-Alpaca is a 1.1 billion parameter causal language model developed by Hadeeqa Al Islam. It is a supervised fine-tuned (SFT) version of TinyLlama/TinyLlama-1.1B-Chat-v1.0, utilizing LoRA on the yahma/alpaca-cleaned dataset, which comprises 20,000 filtered samples. The training was conducted with a learning rate of 2e-4 over 2 epochs, taking approximately 14 minutes on an NVIDIA RTX 5070 Ti.

Key Capabilities

  • Instruction Following: Fine-tuned specifically for responding to instructions based on the Alpaca dataset.
  • Compact Size: At 1.1 billion parameters, it offers a smaller footprint for deployment compared to larger models.
  • Evaluated Performance: Achieved an average BLEU score of 0.4303 and an average BERTScore of 0.7236 on evaluation metrics.

Good for

  • Lightweight Instruction-Tuning: Ideal for applications requiring a small, instruction-tuned model.
  • Educational Projects: Suitable for learning and experimenting with SFT techniques on smaller models.
  • Resource-Constrained Environments: Its compact size makes it viable for deployment where computational resources are limited.

Limitations

  • Tokenization Constraints: May exhibit formatting issues if the inference environment does not strictly adhere to the expected chat template due to sequence packing during training.
  • Data Bias: Inherits potential biases from the synthetic instruction data present in the Alpaca dataset.
  • Inference Stability: Users might observe repetition or formatting artifacts if prompt structures deviate significantly from the training format.