TheBloke/Kimiko-v2-13B-fp16
TheBloke/Kimiko-v2-13B-fp16 is a 13 billion parameter large language model, created by nRuaif and converted to float16 by TheBloke. This model is fine-tuned from Llama-13B and is specifically optimized for normal and erotic roleplay. It uses the Vicuna prompt template and has a context length of 4096 tokens; the fp16 weights are suitable for GPU inference and for further conversions.
Kimiko v2 13B - FP16 Overview
This model is a 13 billion parameter large language model, originally developed by nRuaif and provided by TheBloke in fp16 (float16) format for GPU inference. It is fine-tuned from the Llama-13B architecture and utilizes the Vicuna prompt template.
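Since the model expects the Vicuna prompt template, prompts should follow that format. The sketch below assembles a Vicuna-style prompt; the system line, spacing, and the `build_prompt` helper are assumptions based on the standard Vicuna v1.1 template, not something taken from this model card.

```python
# Sketch: assembling a Vicuna-style prompt for this model.
# The system line and exact spacing follow the common Vicuna v1.1
# template and are an assumption; verify against the model card.

SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def build_prompt(turns: list[tuple[str, str]], next_user_msg: str) -> str:
    """Render prior (user, assistant) turns plus a new user message."""
    parts = [SYSTEM]
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg} ASSISTANT: {assistant_msg}")
    # Leave the final ASSISTANT: slot open for the model to complete.
    parts.append(f"USER: {next_user_msg} ASSISTANT:")
    return " ".join(parts)

print(build_prompt([], "Describe your character."))
```

The generated text after the trailing `ASSISTANT:` marker is the model's reply; append it as a completed turn before building the next prompt.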
Key Capabilities
- Specialized Roleplay: Primarily fine-tuned for normal and erotic roleplay scenarios.
- Assistant Capabilities: While optimized for roleplay, it can still function as an assistant, though it might not always provide the most helpful responses.
- Fastchat/ShareGPT Format: Uses the Fastchat/ShareGPT format for conversations.
- FP16 Format: Provided in fp16 for efficient GPU inference and as a base for further model conversions.
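To see why the fp16 release matters for deployment, a back-of-envelope calculation of weight memory at different precisions (these figures are rough estimates and exclude activations, KV cache, and framework overhead):

```python
# Back-of-envelope weight memory for a ~13B-parameter model.
# Ignores activations, KV cache, and framework overhead.

PARAMS = 13_000_000_000  # approximate parameter count

def weight_gb(bytes_per_param: float) -> float:
    """Weight storage in GiB at a given precision."""
    return PARAMS * bytes_per_param / 1024**3

print(f"fp32: {weight_gb(4):.1f} GiB")   # ~48.4 GiB
print(f"fp16: {weight_gb(2):.1f} GiB")   # ~24.2 GiB
print(f"int4: {weight_gb(0.5):.1f} GiB") # ~6.1 GiB
```

At fp16 the weights alone take roughly half the fp32 footprint, which is what makes single-GPU or dual-GPU inference practical and why further quantization (e.g. to 4-bit) is a common follow-on conversion.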
Training Details
The model was trained on 3000 conversations with a 4090 token cutoff length, using QLoRA and BF16 mixed precision on a single A100 GPU for 2 hours. Due to a significant percentage of NSFW data in its training set, the model may exhibit a bias towards NSFW content.
Good For
- Roleplay Applications: Ideal for applications requiring specialized roleplay interactions, both normal and erotic.
- GPU Inference: Suitable for deployment on GPUs where fp16 precision is desired.
- Further Conversions: Can serve as a base model for additional quantization or fine-tuning efforts.
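A minimal loading sketch with Hugging Face transformers, keeping the released fp16 precision. The repository id comes from this card; everything else (the `load_kimiko` helper, the `device_map` choice) is an illustrative assumption, and running it downloads the full ~25 GiB checkpoint, so the heavy imports stay inside the function.

```python
# Sketch: loading the fp16 weights for GPU inference with Hugging Face
# transformers. The helper name and defaults are illustrative; the actual
# load requires the full ~25 GiB checkpoint and a suitable GPU.

MODEL_ID = "TheBloke/Kimiko-v2-13B-fp16"

def load_kimiko(device_map: str = "auto"):
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # keep the released fp16 precision
        device_map=device_map,      # place/shard across available GPUs
    )
    return tokenizer, model
```

Typical usage is `tokenizer, model = load_kimiko()`, then tokenizing a Vicuna-formatted prompt and calling `model.generate` on it.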