TheBloke/Kimiko-7B-fp16 is an unquantized, 7-billion-parameter LLaMA2-based model, created by nRuaif and provided in FP16 PyTorch format by TheBloke. The model is fine-tuned for high-quality roleplay and instruction following, trained on 3,000 examples from the LIMAERP, LIMA, and Airboro datasets. It offers a 4096-token context length and is suitable for GPU inference and further conversions.
Kimiko 7B - FP16 Overview
Kimiko 7B is a 7 billion parameter model developed by nRuaif, based on the LLaMA2 architecture, and made available in an unquantized FP16 PyTorch format by TheBloke. This version is optimized for GPU inference and serves as a base for further conversions.
Key Capabilities
- Instruction Following: Trained on 3,000 examples from diverse instruction datasets (LIMAERP, LIMA, Airboro).
- High-Quality Roleplay: Specifically fine-tuned for engaging in detailed roleplay scenarios.
- Standard Context Window: Supports a context length of 4096 tokens.
- FP16 Format: Provided in float16 PyTorch format, suitable for direct GPU use or as a source for other quantizations (GPTQ, GGML).
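Because FP16 stores each parameter in two bytes, the raw weight footprint of a 7B model is roughly 14 GB, which is why this format targets GPU inference. A quick back-of-the-envelope estimate (the helper name is my own, and real VRAM use will be higher once activations and the KV cache are included):

```python
def weight_memory_gb(n_params: int, bytes_per_param: int = 2) -> float:
    """Raw memory needed to hold model weights, in decimal gigabytes.

    bytes_per_param: 2 for FP16/BF16, 4 for FP32, 1 for 8-bit quantized weights.
    """
    return n_params * bytes_per_param / 1e9

# A 7B-parameter model in FP16: 7e9 params * 2 bytes ≈ 14 GB of weights
fp16_gb = weight_memory_gb(7_000_000_000)     # 14.0
fp32_gb = weight_memory_gb(7_000_000_000, 4)  # 28.0
```

This is also why the FP16 release is a convenient source for GPTQ or GGML conversions: quantizing to 4 bits cuts the weight footprint to roughly a quarter of the FP16 size.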
Training Details
The model was trained for 3 epochs with a learning rate of 0.0002, using the full 4096-token context window and LoRA fine-tuning. Training ran on a single L4 GPU on GCP for approximately 8 hours, with an estimated carbon emission of 0.2 kg.
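A setup like the one described could be expressed with the peft and transformers libraries along the following lines. This is a hedged sketch, not the author's actual training script: the learning rate and epoch count come from the card, but the LoRA rank, alpha, dropout, target modules, and batch size are my assumptions.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter settings -- rank/alpha/dropout/target modules are assumptions,
# not documented for Kimiko.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# Hyperparameters stated in the card: learning rate 0.0002, 3 epochs.
training_args = TrainingArguments(
    output_dir="kimiko-7b-lora",
    learning_rate=2e-4,
    num_train_epochs=3,
    per_device_train_batch_size=1,  # assumption: sized to fit a single L4 GPU
    fp16=True,
)
```

After training, the LoRA adapter would typically be merged back into the base weights to produce a standalone FP16 checkpoint like this one.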
Prompt Format
Users should follow a specific prompt template for optimal performance:
<<HUMAN>>
{prompt}
<<AIBOT>>

For roleplay, a system prompt can be used:
<<SYSTEM>>
A's Persona:
B's Persona:
Scenario:
Add some instruction here on how you want your RP to go.

Limitations
The model inherits the biases of its LLaMA2 base; the original model card explicitly notes an exception for NSFW bias.
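For reference, the two templates from the Prompt Format section can be assembled programmatically. This is a minimal sketch with helper names of my own choosing; how the roleplay system block combines with the instruction template is my assumption, since the card shows the blocks separately.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in Kimiko's <<HUMAN>>/<<AIBOT>> template."""
    return f"<<HUMAN>>\n{user_message}\n<<AIBOT>>\n"

def build_rp_prompt(persona_a: str, persona_b: str,
                    scenario: str, instructions: str,
                    user_message: str) -> str:
    """Prepend the <<SYSTEM>> roleplay block before the instruction template."""
    system = (
        "<<SYSTEM>>\n"
        f"A's Persona: {persona_a}\n"
        f"B's Persona: {persona_b}\n"
        f"Scenario: {scenario}\n"
        f"{instructions}\n"
    )
    return system + build_prompt(user_message)

# Plain instruction-following turn
print(build_prompt("Summarize the plot of Hamlet in two sentences."))
```

The resulting string is what you would pass to the tokenizer; generation should then continue from the `<<AIBOT>>` marker.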