Haldi247/TinyLlama-SFT-Alpaca
Haldi247/TinyLlama-SFT-Alpaca is a 1.1 billion parameter causal language model developed by Hadeeqa Al Islam, fine-tuned from TinyLlama/TinyLlama-1.1B-Chat-v1.0. It was trained using Supervised Fine-Tuning (SFT) with LoRA on the yahma/alpaca-cleaned dataset, featuring a 2048-token context length. This model is optimized for instruction-following tasks, demonstrating an average BLEU score of 0.4303 and BERTScore of 0.7236 on evaluation metrics.
Loading preview...
TinyLlama-SFT-Alpaca Overview
Haldi247/TinyLlama-SFT-Alpaca is a 1.1 billion parameter causal language model developed by Hadeeqa Al Islam. It is a supervised fine-tuned (SFT) version of TinyLlama/TinyLlama-1.1B-Chat-v1.0, utilizing LoRA on the yahma/alpaca-cleaned dataset, which comprises 20,000 filtered samples. The training was conducted with a learning rate of 2e-4 over 2 epochs, taking approximately 14 minutes on an NVIDIA RTX 5070 Ti.
Key Capabilities
- Instruction Following: Fine-tuned specifically for responding to instructions based on the Alpaca dataset.
- Compact Size: At 1.1 billion parameters, it offers a smaller footprint for deployment compared to larger models.
- Evaluated Performance: Achieved an average BLEU score of 0.4303 and an average BERTScore of 0.7236 on evaluation metrics.
Good for
- Lightweight Instruction-Tuning: Ideal for applications requiring a small, instruction-tuned model.
- Educational Projects: Suitable for learning and experimenting with SFT techniques on smaller models.
- Resource-Constrained Environments: Its compact size makes it viable for deployment where computational resources are limited.
Limitations
- Tokenization Constraints: May exhibit formatting issues if the inference environment does not strictly adhere to the expected chat template due to sequence packing during training.
- Data Bias: Inherits potential biases from the synthetic instruction data present in the Alpaca dataset.
- Inference Stability: Users might observe repetition or formatting artifacts if prompt structures deviate significantly from the training format.