ludis/tsukasa-13b-qlora-limarp is a 13 billion parameter language model based on the Llama-2 architecture, fine-tuned using QLoRA. This model was trained on a sequence of specialized datasets including Koishi, Pippa, Geepeetee4, and a filtered version of Limarp. It is optimized for conversational interactions using specific system, user, and model tokens, making it suitable for dialogue-based applications.
Model Overview
ludis/tsukasa-13b-qlora-limarp is a 13 billion parameter language model built upon the Llama-2-13b-hf base architecture. It has undergone a multi-stage fine-tuning process using QLoRA to enhance its conversational capabilities and response generation.
Training Details
The model's training involved sequential fine-tuning on several distinct datasets, each for one or two epochs:
- Koishi dataset (1 epoch)
- Pippa dataset (1 epoch)
- Geepeetee4 dataset (1 epoch)
- Limarp dataset (2 epochs): a filtered version from 2023-09-14 that excludes the ponyville, lolicit, and all fallen subsets.
Prompting and Usage
This model is designed for conversational interactions using a specific token-based prompting structure. It utilizes three distinct roles:
- <|system|>: For injecting background or out-of-channel information.
- <|user|>: To denote user input.
- <|model|>: To indicate where the model should generate its response.
These tokens can be chained to form complex conversation histories, enabling structured dialogue generation. Recommended prompting guidelines and generation settings are available here.
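As a minimal sketch of how these role tokens might be chained into a conversation history, the following Python helper assembles a prompt string. The helper name `build_prompt`, the `ROLE_TOKENS` mapping, and the choice to join tokens and text directly with no whitespace or newlines between turns are assumptions made for illustration; the model card does not specify separators, so check the recommended prompting guidelines before relying on this layout.

```python
# Role tokens as named in the model card.
ROLE_TOKENS = {
    "system": "<|system|>",
    "user": "<|user|>",
    "model": "<|model|>",
}


def build_prompt(turns):
    """Join (role, text) turns into one prompt string.

    Assumption: each token is followed directly by its text, with no
    separator between turns. A trailing <|model|> token is appended to
    mark where the model should generate its response.
    """
    parts = []
    for role, text in turns:
        if role not in ROLE_TOKENS:
            raise ValueError(f"unknown role: {role!r}")
        parts.append(ROLE_TOKENS[role] + text)
    # End with the model token so generation starts at the model's turn.
    parts.append(ROLE_TOKENS["model"])
    return "".join(parts)


# Hypothetical conversation history, for illustration only.
prompt = build_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "Hello!"),
    ("model", "Hi there."),
    ("user", "How are you?"),
])
print(prompt)
```

The resulting string can be passed to whatever inference stack serves the model; earlier `<|model|>` turns carry prior responses, and the final bare `<|model|>` token is where generation would continue.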