adithash/gemma2b-dolly-qlora-merged
The adithash/gemma2b-dolly-qlora-merged model is a 2.5 billion parameter variant of Gemma-2B-IT, fine-tuned by adithash with QLoRA on the Databricks Dolly-15k dataset. Because the QLoRA adapter weights have been merged directly into the base model, it loads in a single step with no PEFT dependency. It is optimized for instruction-following tasks and is a straightforward option for learning about and experimenting with fine-tuned LLMs.
Model Overview
This model, adithash/gemma2b-dolly-qlora-merged, is a 2.5 billion parameter instruction-tuned variant of Google's Gemma-2B-IT. It was fine-tuned using QLoRA on the databricks/databricks-dolly-15k dataset of 15,011 human-generated instruction-following records. A key differentiator is that this is a fully merged model: the QLoRA adapter weights have been fused into the base model, so no PEFT (Parameter-Efficient Fine-Tuning) library or separate adapter loading is required, simplifying deployment and usage compared to adapter-only releases.
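Because the adapter is already merged, the checkpoint loads like any standard Hugging Face model. A minimal inference sketch follows; the model id comes from this card, while the `to_gemma_prompt` helper, dtype, and generation settings are illustrative assumptions (in practice `tokenizer.apply_chat_template` produces the same turn markup):

```python
def to_gemma_prompt(instruction: str) -> str:
    """Wrap a single instruction in Gemma-IT's chat-turn format.

    Assumed to mirror what tokenizer.apply_chat_template emits for
    one user turn, per the Gemma-IT prompt convention.
    """
    return (
        "<start_of_turn>user\n"
        f"{instruction}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


if __name__ == "__main__":
    # Imported lazily so the helper above is readable/testable
    # without transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "adithash/gemma2b-dolly-qlora-merged"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # assumed dtype; float16 also works
        device_map="auto",
    )

    prompt = to_gemma_prompt("Explain QLoRA in one paragraph.")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=200)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that no `PeftModel.from_pretrained` call or base-model download appears anywhere: that is the practical payoff of the merge.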
Key Capabilities & Features
- Standalone Deployment: No PEFT or base model required at inference time; load and run directly.
- Instruction Following: Fine-tuned specifically for general instruction-following tasks based on the Dolly-15k dataset.
- Simplified Usage: Offers a single-step loading process, making it easy to integrate into projects.
- Memory-Efficient Options: Supports 4-bit quantization for reduced VRAM usage on GPUs.
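The 4-bit option above can be sketched with transformers' bitsandbytes integration. This is a configuration fragment under assumptions: the quantization parameters below are common NF4 defaults, not settings published with this model, and running it requires `bitsandbytes` and a CUDA GPU.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed NF4 settings (the scheme QLoRA itself uses); tune as needed.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,  # second quantization pass on constants
)

model_id = "adithash/gemma2b-dolly-qlora-merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```

Quantizing at load time trades a small amount of output quality for a substantially lower VRAM footprint, which is usually the right trade for experimentation on consumer GPUs.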
Intended Use Cases
- Learning & Experimentation: Ideal for understanding and experimenting with QLoRA fine-tuning and merged LLMs.
- Portfolio Demonstration: Useful for showcasing end-to-end QLoRA fine-tuning workflows.
- Starting Point: Can serve as a base for further domain-specific instruction tuning.
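The "starting point" use case amounts to attaching a fresh LoRA adapter to the merged checkpoint and training on a domain dataset. A hedged sketch with the PEFT library; the rank, alpha, dropout, and target modules are illustrative assumptions, not the hyperparameters of the original Dolly fine-tune:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the merged model as the new base.
model = AutoModelForCausalLM.from_pretrained("adithash/gemma2b-dolly-qlora-merged")

# Assumed adapter hyperparameters; target_modules names the Gemma
# attention projections, a common choice for LoRA on this architecture.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the new adapter weights train
```

The wrapped model can then be passed to a standard `Trainer` or `SFTTrainer` loop; the merged Dolly weights stay frozen while the new adapter learns the domain.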
Limitations
- Fine-tuned for only 500 steps, limiting its overall performance compared to fully trained models.
- As a 2.5B parameter model, its capacity for complex multi-step reasoning is limited.
- Training sequences were capped at 256 tokens, so quality may degrade on longer prompts.