TheBloke/Kimiko-7B-fp16 is an unquantized, 7-billion-parameter LLaMA2-based model, created by nRuaif and provided in FP16 PyTorch format by TheBloke. The model is fine-tuned for high-quality roleplay and instruction following, trained on 3,000 examples from the LIMAERP, LIMA, and Airboro datasets. It offers a 4096-token context length and is suitable for GPU inference and further conversions.
Kimiko 7B - FP16 Overview
Kimiko 7B is a 7 billion parameter model developed by nRuaif, based on the LLaMA2 architecture, and made available in an unquantized FP16 PyTorch format by TheBloke. This version is optimized for GPU inference and serves as a base for further conversions.
Key Capabilities
- Instruction Following: Trained on 3,000 examples from diverse instruction datasets (LIMAERP, LIMA, Airboro).
- High-Quality Roleplay: Specifically fine-tuned for engaging in detailed roleplay scenarios.
- Standard Context Window: Supports a context length of 4096 tokens.
- FP16 Format: Provided in float16 PyTorch format, suitable for direct GPU use or as a source for other quantizations (GPTQ, GGML).
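Because FP16 stores each parameter in two bytes, the raw weight footprint of a 7B model is roughly 14 GB, which is why this format targets GPU inference. A quick back-of-the-envelope estimate (the helper name is my own, and real VRAM use will be higher once activations and the KV cache are included):

```python
def weight_memory_gb(n_params: int, bytes_per_param: int = 2) -> float:
    """Raw memory needed to hold model weights, in decimal gigabytes.

    bytes_per_param: 2 for FP16/BF16, 4 for FP32, 1 for 8-bit quantized weights.
    """
    return n_params * bytes_per_param / 1e9

# A 7B-parameter model in FP16: 7e9 params * 2 bytes ≈ 14 GB of weights
fp16_gb = weight_memory_gb(7_000_000_000)     # 14.0
fp32_gb = weight_memory_gb(7_000_000_000, 4)  # 28.0
```

This is also why the FP16 release is a convenient source for GPTQ or GGML conversions: quantizing to 4 bits cuts the weight footprint to roughly a quarter of the FP16 size.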
Training Details
The model was trained for 3 epochs with a learning rate of 0.0002, using the full 4096-token context window and LoRA fine-tuning. Training ran on a single L4 GPU on GCP for approximately 8 hours, with an estimated carbon emission of 0.2 kg.
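A setup like the one described could be expressed with the peft and transformers libraries along the following lines. This is a hedged sketch, not the author's actual training script: the learning rate and epoch count come from the card, but the LoRA rank, alpha, dropout, target modules, and batch size are my assumptions.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter settings -- rank/alpha/dropout/target modules are assumptions,
# not documented for Kimiko.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# Hyperparameters stated in the card: learning rate 0.0002, 3 epochs.
training_args = TrainingArguments(
    output_dir="kimiko-7b-lora",
    learning_rate=2e-4,
    num_train_epochs=3,
    per_device_train_batch_size=1,  # assumption: sized to fit a single L4 GPU
    fp16=True,
)
```

After training, the LoRA adapter would typically be merged back into the base weights to produce a standalone FP16 checkpoint like this one.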
Prompt Format
Users should follow a specific prompt template for optimal performance:
<<HUMAN>>
{prompt}
<<AIBOT>>

For roleplay, a system prompt can be used:
<<SYSTEM>>
A's Persona:
B's Persona:
Scenario:
Add some instruction here on how you want your RP to go.

Limitations
The model inherits the biases of its LLaMA2 base; the original model card explicitly notes an exception for NSFW bias.
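For reference, the two templates from the Prompt Format section can be assembled programmatically. This is a minimal sketch with helper names of my own choosing; how the roleplay system block combines with the instruction template is my assumption, since the card shows the blocks separately.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in Kimiko's <<HUMAN>>/<<AIBOT>> template."""
    return f"<<HUMAN>>\n{user_message}\n<<AIBOT>>\n"

def build_rp_prompt(persona_a: str, persona_b: str,
                    scenario: str, instructions: str,
                    user_message: str) -> str:
    """Prepend the <<SYSTEM>> roleplay block before the instruction template."""
    system = (
        "<<SYSTEM>>\n"
        f"A's Persona: {persona_a}\n"
        f"B's Persona: {persona_b}\n"
        f"Scenario: {scenario}\n"
        f"{instructions}\n"
    )
    return system + build_prompt(user_message)

# Plain instruction-following turn
print(build_prompt("Summarize the plot of Hamlet in two sentences."))
```

The resulting string is what you would pass to the tokenizer; generation should then continue from the `<<AIBOT>>` marker.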