Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct

Hugging Face
Text Generation · Model Size: 1.1B · Quant: BF16 · Ctx Length: 2k · Published: Jan 5, 2024 · Architecture: Transformer

Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct is a 1.1 billion parameter instruction-tuned causal language model based on TinyLlama-1.1B-32k. Developed by Doctor-Shotgun, this model is primarily intended for speculative decoding. It was fine-tuned on various open-source instruct datasets, making it suitable for conversational AI applications.


Overview

Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct is an instruction-tuned variant of the TinyLlama-1.1B-32k base model, developed by Doctor-Shotgun. This 1.1 billion parameter model is designed for speculative decoding, a technique that accelerates inference by letting a small draft model propose tokens that a larger target model then verifies. It was fully fine-tuned for 3 epochs on a single A100 GPU (approximately 3.5 hours) using several open-source instruct datasets.

Key Capabilities

  • Instruction Following: Fine-tuned on diverse instruct datasets to respond to user prompts effectively.
  • Speculative Decoding: Small enough to serve as a fast draft model, proposing candidate tokens for a larger target model to verify.
  • Compact Size: At 1.1 billion parameters, it offers a smaller footprint compared to larger models, potentially enabling more efficient deployment.
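To make the speculative decoding role concrete, here is a minimal sketch of the greedy variant of the technique using toy stand-in "models" (plain functions over integer tokens). The function names and toy logic are illustrative assumptions, not this model's actual API; in practice the draft would be TinyLlama-1.1B-32k-Instruct and the target a much larger LLM.

```python
# Toy "models": each maps a token sequence to its greedy next token.
# These stand in for a small draft LM and a large target LM (assumption:
# real models return logits; here we return the argmax token directly).
def draft_model(tokens):
    # Cheap heuristic draft: continue a simple counting pattern.
    return (tokens[-1] + 1) % 10

def target_model(tokens):
    # Authoritative model; occasionally disagrees with the draft.
    nxt = (tokens[-1] + 1) % 10
    return nxt if len(tokens) % 4 else (nxt + 5) % 10

def speculative_decode(prompt, n_tokens, k=4):
    """Greedy speculative decoding: the draft proposes k tokens,
    the target verifies them; the longest agreeing prefix is kept
    and the first disagreement is replaced by the target's token."""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < n_tokens:
        # 1) Draft proposes k tokens autoregressively (cheap).
        proposal, ctx = [], tokens[:]
        for _ in range(k):
            t = draft_model(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2) Target checks each proposed position (one batched
        #    forward pass in a real implementation).
        accepted, ctx = [], tokens[:]
        for t in proposal:
            expected = target_model(ctx)
            if expected == t:
                accepted.append(t)
                ctx.append(t)
            else:
                accepted.append(expected)  # target's correction
                break
        tokens.extend(accepted)
    return tokens[len(prompt):][:n_tokens]
```

Because every accepted token matches the target's own greedy choice, the output is identical to decoding with the target alone; the speedup comes from verifying the draft's k proposals in one target pass instead of k sequential ones.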

Usage and Limitations

The model uses a modified multi-turn Alpaca instruction format for prompting, making it straightforward to integrate into existing workflows. Users should be aware that the model inherits biases from its base model and has not undergone ethical alignment to prevent toxic or harmful outputs; in fact, its training data includes examples from the toxic-DPO dataset. Exercise caution and implement your own safeguards when deploying this model in sensitive applications.
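The card does not spell out how the Alpaca format was modified, so the sketch below assembles a standard multi-turn Alpaca-style prompt as an illustration; the `### Instruction:` / `### Response:` headers follow the common Alpaca convention and are an assumption, as is the helper name `build_alpaca_prompt`.

```python
def build_alpaca_prompt(system, turns):
    """Assemble a multi-turn Alpaca-style prompt. `turns` is a list of
    (user_message, model_reply) pairs; pass None as the reply for the
    final turn to leave the prompt open for the model to complete.
    The headers below are the common Alpaca convention (assumption)."""
    parts = [system.strip(), ""]
    for user_msg, model_msg in turns:
        parts.append("### Instruction:")
        parts.append(user_msg.strip())
        parts.append("")
        parts.append("### Response:")
        if model_msg is not None:
            parts.append(model_msg.strip())
            parts.append("")
    return "\n".join(parts)

# Example: two completed exchanges plus an open final turn.
prompt = build_alpaca_prompt(
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.",
    [("What is 2 + 2?", "4"), ("And doubled?", None)],
)
```

Ending the string at `### Response:` cues the model to generate the assistant's reply; check the model's own tokenizer/chat configuration for the exact headers and stop sequences before relying on this layout.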