Name: KaeriJenti/kaori-34b-v4 API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: KaeriJenti

KaeriJenti/kaori-34b-v4 Overview

KaeriJenti/kaori-34b-v4 is a 34-billion-parameter language model developed by Kaeri and Jenti through supervised fine-tuning (SFT). It was fine-tuned with LoRA for 3 epochs on A100 GPUs.

Training Details

The model's training regimen focused on a specific blend of datasets:

Open-Platypus: 100% of this dataset was utilized.
Dolphin: 5% of this dataset was incorporated.
OpenOrca: 10% of this dataset was used.
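
The blend above can be illustrated with a minimal sketch. It assumes each dataset is already loaded as a list of records and that the partial datasets are subsampled uniformly at random; the source does not say how the 5% and 10% subsets were chosen, so the helper names (`subsample`, `build_mixture`) and the seed are hypothetical.

```python
import random

# Fractions taken from the dataset blend listed above.
MIX_FRACTIONS = {"Open-Platypus": 1.00, "Dolphin": 0.05, "OpenOrca": 0.10}


def subsample(records, fraction, seed=0):
    """Return a uniform random subset of round(fraction * len(records)) items.

    Uniform sampling is an assumption; the model card only gives percentages.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    k = round(len(records) * fraction)
    return random.Random(seed).sample(list(records), k)


def build_mixture(datasets, fractions=MIX_FRACTIONS, seed=0):
    """Combine the subsampled datasets into one shuffled training list."""
    mixture = []
    for name, records in datasets.items():
        mixture.extend(subsample(records, fractions[name], seed))
    random.Random(seed).shuffle(mixture)
    return mixture


if __name__ == "__main__":
    # Toy stand-ins for the real datasets, sized for illustration only.
    toy = {
        "Open-Platypus": [f"p{i}" for i in range(20)],
        "Dolphin": [f"d{i}" for i in range(100)],
        "OpenOrca": [f"o{i}" for i in range(50)],
    }
    print(len(build_mixture(toy)))
```

With the toy sizes shown, the mixture keeps all 20 Open-Platypus records plus 5 each from Dolphin and OpenOrca.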

Data Contamination Filtering

A key aspect of kaori-34b-v4's development involved rigorous data contamination filtering. The creators explicitly excluded GSM8k samples and applied similarity filtering against a list of common benchmark tasks, including cot_gsm8k, drop, winogrande, ai2_arc, and hellaswag. This careful approach aims to ensure the model's performance is not artificially inflated by exposure to benchmark data during training.
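
As a rough illustration of similarity filtering, the sketch below drops any training sample whose word 3-gram overlap with a benchmark sample reaches a threshold. The source does not state which similarity measure or cutoff the creators used, so the Jaccard metric, the 0.5 threshold, and the function names here are all assumptions.

```python
import re


def word_ngrams(text, n=3):
    """Lowercase the text, keep alphanumeric words, and return its word n-grams."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < n:
        # Short texts are represented by a single tuple of all their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_contaminated(train_samples, benchmark_samples, threshold=0.5, n=3):
    """Keep only training samples that stay below the threshold for every benchmark sample.

    The metric and threshold are illustrative assumptions, not the authors' settings.
    """
    bench = [word_ngrams(text, n) for text in benchmark_samples]
    kept = []
    for sample in train_samples:
        grams = word_ngrams(sample, n)
        if all(jaccard(grams, b) < threshold for b in bench):
            kept.append(sample)
    return kept


if __name__ == "__main__":
    # Made-up examples; real filtering would load the benchmark test sets.
    benchmark = ["A farmer has 12 apples and gives away 5. How many apples remain?"]
    train = [
        "A farmer has 12 apples and gives away 5, how many apples remain?",
        "Explain how photosynthesis works in plants.",
    ]
    print(filter_contaminated(train, benchmark))
```

A pairwise comparison like this is quadratic in the number of samples, so at real dataset scale an indexed or hashed approach would likely be needed.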

Use Cases

This model suits applications that need a general-purpose instruction-following language model. It is particularly relevant where fair evaluation against standard benchmarks matters, because common benchmark datasets were deliberately filtered out of its training data.
