Kimiko v2 13B - FP16 Overview
This model is a 13-billion-parameter large language model, originally developed by nRuaif and provided by TheBloke in fp16 (float16) format for GPU inference. It is fine-tuned from the Llama 13B architecture and uses the Vicuna prompt template.
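Since the card names the Vicuna prompt template, a small sketch of how a single-turn prompt is typically assembled may help. The system preamble below is the common Vicuna v1.1 wording, not something stated in this card, so treat it as an assumption and check it against the template published alongside the model.

```python
# Sketch of a Vicuna-style prompt, as referenced by this model card.
# ASSUMPTION: the system preamble follows the common Vicuna v1.1 format;
# verify the exact wording against the model's published template.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions."
)

def build_vicuna_prompt(user_message: str, system: str = SYSTEM) -> str:
    """Wrap one user turn in the USER:/ASSISTANT: template."""
    return f"{system} USER: {user_message} ASSISTANT:"

prompt = build_vicuna_prompt("Introduce yourself in one sentence.")
```

The trailing `ASSISTANT:` is left open so the model completes the assistant turn.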
Key Capabilities
- Specialized Roleplay: Primarily fine-tuned for normal and erotic roleplay scenarios.
- Assistant Capabilities: While optimized for roleplay, it can still function as an assistant, though it might not always provide the most helpful responses.
- Fastchat/ShareGPT Format: Uses the Fastchat/ShareGPT format for conversations.
- FP16 Format: Provided in fp16 for efficient GPU inference and as a base for further model conversions.
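To make the fp16 point concrete, here is a back-of-the-envelope estimate of raw weight memory: fp16 stores each parameter in 2 bytes versus fp32's 4, which is why the half-precision release is the practical choice for GPU inference. The 13B parameter count is nominal, and the figure excludes KV cache and activation memory.

```python
# Rough weight-memory estimate for a 13B model in fp16 vs fp32.
# Excludes KV cache, activations, and framework overhead.
PARAMS = 13_000_000_000  # nominal 13B parameter count

def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Raw weight storage in GiB for a given per-parameter width."""
    return num_params * bytes_per_param / (1024 ** 3)

fp16_gib = weight_memory_gib(PARAMS, 2)  # roughly 24 GiB
fp32_gib = weight_memory_gib(PARAMS, 4)  # roughly 48 GiB
```

In practice this is why the fp16 weights fit on a single 48 GB card (or two 24 GB cards) while fp32 would not.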
Training Details
The model was trained on 3000 conversations with a 4090-token cutoff length, using QLoRA with BF16 mixed precision on a single A100 GPU for 2 hours. Because a significant share of the training set is NSFW, the model may exhibit a bias towards NSFW content.
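The training conversations use the Fastchat/ShareGPT schema mentioned above. A minimal example record is sketched below; the field names (`conversations`, `from`, `value`) follow the widely used ShareGPT convention, but the exact keys are an assumption to verify against your own pipeline, and the message text is invented for illustration.

```python
# Minimal ShareGPT-style conversation record (hypothetical content).
# ASSUMPTION: keys follow the common ShareGPT convention; confirm
# against the actual training data format before reuse.
example_record = {
    "conversations": [
        {"from": "human", "value": "Describe the tavern we just entered."},
        {"from": "gpt", "value": "The common room glows with firelight..."},
    ]
}

def turn_count(record: dict) -> int:
    """Number of human/assistant turns in one conversation record."""
    return len(record["conversations"])
```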
Good For
- Roleplay Applications: Ideal for applications requiring specialized roleplay interactions, both normal and erotic.
- GPU Inference: Suitable for deployment on GPUs where fp16 precision is desired.
- Further Conversions: Can serve as a base model for additional quantization or fine-tuning efforts.