arif-butt/tinyllama-unsloth-merged
TEXT GENERATION · Concurrency Cost: 1 · Model Size: 1.1B · Quant: BF16 · Ctx Length: 2k · Published: Mar 25, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Warm
arif-butt/tinyllama-unsloth-merged is a 1.1 billion parameter TinyLlama model, fine-tuned and fully merged using Unsloth optimizations. This model offers 2-3x faster inference and 30-50% less memory usage compared to standard models, making it highly efficient. It is a standalone model, loadable directly without PEFT, and is optimized for general text generation tasks.
TinyLlama Unsloth Merged: Optimized 1.1B Model
This model is a fully merged version of TinyLlama (1.1B parameters), fine-tuned with LoRA adapters and optimized using Unsloth kernels. Unlike models requiring separate adapter files, this version is standalone and can be loaded directly with the transformers library or Unsloth for enhanced performance.
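As a minimal sketch, the merged checkpoint can be loaded with the standard `transformers` API; the repository id comes from this card, while the prompt and generation settings below are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id from this model card; no separate PEFT adapter is needed
# because the LoRA weights are already merged into the base model.
MODEL_ID = "arif-butt/tinyllama-unsloth-merged"

def load_merged_model(model_id: str = MODEL_ID):
    """Load the tokenizer and FP16 model weights (network access required)."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # the card states FP16 precision
        device_map="auto",          # place on GPU if one is available
    )
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_merged_model()
    prompt = "Explain LoRA merging in one sentence."  # illustrative prompt
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The `__main__` guard keeps the download and generation out of module import, so the loader can be reused in other scripts.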
Key Capabilities & Features:
- Fully Merged: No external PEFT adapters are needed, simplifying deployment.
- Unsloth Optimized: Achieves 2-3x faster inference speeds due to specialized Unsloth kernels.
- Memory Efficient: Utilizes 30-50% less memory compared to standard models, ideal for resource-constrained environments.
- Standalone Operation: Ready to use out of the box, compatible with transformers and Unsloth.
- Base Model: Built upon TinyLlama/TinyLlama-1.1B-Chat-v1.0.
- Precision: Operates in FP16 (float16) for balanced performance and memory use.
Good for:
- Developers seeking a small, efficient language model for rapid prototyping or deployment.
- Applications requiring faster inference and reduced memory footprint on consumer-grade hardware.
- General text generation tasks where the 1.1 billion parameter size is sufficient.