TheBloke/Kimiko-v2-13B-fp16

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4K · License: llama2 · Architecture: Transformer · Open Weights

TheBloke/Kimiko-v2-13B-fp16 is a 13-billion-parameter large language model created by nRuaif and converted to float16 by TheBloke. It is fine-tuned from Llama 2 13B and optimized for normal and erotic roleplay. The model uses the Vicuna prompt template, has a 4096-token context length, and is distributed in fp16, making it well suited to GPU inference and as a base for further conversions.


Kimiko v2 13B - FP16 Overview

This model is a 13-billion-parameter large language model, originally developed by nRuaif and provided by TheBloke in fp16 (float16) format for GPU inference. It is fine-tuned from Llama 2 13B and uses the Vicuna prompt template.
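The snippet below is a minimal inference sketch using the Hugging Face transformers library, loading the checkpoint in fp16 and wrapping the input in a Vicuna-style prompt. The exact system line and the sampling settings are illustrative assumptions; check the model card for the canonical template.

```python
# Minimal fp16 inference sketch with transformers; the prompt follows the
# common Vicuna 1.1 layout (system line plus USER:/ASSISTANT: turns), which
# is an assumption here rather than a published specification.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Kimiko-v2-13B-fp16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # load weights in fp16 for GPU inference
    device_map="auto",          # place layers across available GPUs
)

prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions.\n\n"
    "USER: Describe a rainy evening in a small coastal town.\n"
    "ASSISTANT:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```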

Key Capabilities

  • Specialized Roleplay: Primarily fine-tuned for normal and erotic roleplay scenarios.
  • Assistant Capabilities: While optimized for roleplay, it can still function as an assistant, though it might not always provide the most helpful responses.
  • Fastchat/ShareGPT Format: Conversations follow the Fastchat/ShareGPT format (see the sketch after this list).
  • FP16 Format: Provided in fp16 for efficient GPU inference and as a base for further model conversions.
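As a rough illustration of the Fastchat/ShareGPT conversation format mentioned above, the sketch below shows one plausible record layout and how it could be flattened into a Vicuna-style prompt. The field names ("conversations", "from", "value") follow the common ShareGPT convention and are an assumption, not a published schema for this model's training data.

```python
# Illustrative ShareGPT-style record and a helper that flattens it into
# USER:/ASSISTANT: turns ending on ASSISTANT, ready for generation.
sharegpt_record = {
    "conversations": [
        {"from": "human", "value": "You are a knight guarding a castle gate. I approach at dusk."},
        {"from": "gpt", "value": "*lowers halberd* Halt, traveler. State your business."},
        {"from": "human", "value": "I carry a message for the queen."},
    ]
}

def to_vicuna_prompt(record: dict) -> str:
    """Flatten a ShareGPT record into a Vicuna-style prompt string."""
    role_map = {"human": "USER", "gpt": "ASSISTANT"}
    lines = [f"{role_map[turn['from']]}: {turn['value']}" for turn in record["conversations"]]
    return "\n".join(lines) + "\nASSISTANT:"

print(to_vicuna_prompt(sharegpt_record))
```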

Training Details

The model was trained on 3,000 conversations with a cutoff length of 4,090 tokens, using QLoRA with BF16 mixed precision on a single A100 GPU for 2 hours. Because a significant share of the training data is NSFW, the model may exhibit a bias toward NSFW content.
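For context, a comparable QLoRA setup with transformers, peft, and bitsandbytes might look like the sketch below. The base model identifier, LoRA rank, and target modules are illustrative assumptions; the original Kimiko v2 training configuration is not reproduced here.

```python
# Sketch of a QLoRA fine-tuning setup: 4-bit base weights (NF4) with BF16
# compute and LoRA adapters on the attention projections. Hyperparameters
# are placeholders, not the author's actual configuration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "meta-llama/Llama-2-13b-hf"  # assumed base model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: quantize base weights to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # BF16 mixed-precision compute
)

model = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb_config, device_map="auto")
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```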

Good For

  • Roleplay Applications: Ideal for applications requiring specialized roleplay interactions, both normal and erotic.
  • GPU Inference: Suitable for deployment on GPUs where fp16 precision is desired.
  • Further Conversions: Can serve as a base model for additional quantization or fine-tuning efforts.
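As one example of building on the fp16 checkpoint, the sketch below loads it with on-the-fly 4-bit quantization via bitsandbytes, a common way to fit a 13B model on a smaller GPU. Offline conversions such as GGUF or GPTQ exports use their own tooling and are not shown here.

```python
# Sketch: load the fp16 checkpoint with on-the-fly 4-bit quantization.
# This quantizes at load time; it does not write a converted model to disk.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "TheBloke/Kimiko-v2-13B-fp16"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
print(f"Loaded {model_id} in 4-bit; footprint ≈ {model.get_memory_footprint() / 1e9:.1f} GB")
```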