KaeriJenti/kaori-34b-v3

Text Generation

  • Concurrency Cost: 2
  • Model Size: 34B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: Dec 22, 2023
  • License: llama2
  • Architecture: Transformer
  • Tags: Open Weights, Cold

KaeriJenti/kaori-34b-v3 is a 34 billion parameter language model fine-tuned by Kaeri and Jenti. It was developed using a Supervised Fine-Tuning (SFT) strategy on a combination of Open-Platypus and Dolphin datasets. This model is optimized for general language tasks, with specific attention to avoiding data contamination from common benchmark datasets like GSM8k and ARC.

kaori-34b-v3 Overview

KaeriJenti/kaori-34b-v3 is a 34 billion parameter language model developed jointly by Kaeri and Jenti. It was fine-tuned with Supervised Fine-Tuning (SFT) on a mixture of two datasets: Open-Platypus (100%) and Dolphin (5%). GSM8k samples were excluded from the training data, and similarity filtering was applied to prevent contamination from benchmark tasks, including cot_gsm8k, drop, winogrande, ai2_arc, and hellaswag.
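The data preparation described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the source does not state which similarity metric or threshold was used, so word-trigram Jaccard similarity, the `threshold=0.8` cutoff, and the helper names `filter_contaminated` and `mix_datasets` are all assumptions. Reading "Dolphin (5%)" as a 5% random sample is also an assumption.

```python
import random


def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams in a string."""
    tokens = text.lower().split()
    if len(tokens) < n:
        # Short texts are represented by a single tuple of all their tokens.
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_contaminated(train_samples, benchmark_samples, threshold=0.8, n=3):
    """Keep training samples whose similarity to every benchmark sample
    stays below the (assumed) threshold."""
    bench_grams = [word_ngrams(s, n) for s in benchmark_samples]
    kept = []
    for sample in train_samples:
        grams = word_ngrams(sample, n)
        if all(jaccard(grams, b) < threshold for b in bench_grams):
            kept.append(sample)
    return kept


def mix_datasets(primary, secondary, secondary_fraction=0.05, seed=0):
    """Combine all of `primary` with a random fraction of `secondary`."""
    rng = random.Random(seed)
    k = round(len(secondary) * secondary_fraction)
    return list(primary) + rng.sample(list(secondary), k)
```

In this sketch, each benchmark listed above (for example ai2_arc or hellaswag) would be passed as `benchmark_samples`, and the filtered output of both datasets would then be combined with `mix_datasets`. A pairwise comparison like this scales poorly, so a real pipeline over large datasets would likely use an index or hashing scheme instead.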

Key Capabilities

  • General Language Understanding: Designed for a broad range of language-based tasks.
  • Contamination-Aware Training: Trained with explicit measures to keep common academic benchmark data out of the training set, aiming for more robust generalization.

Training Details

The model was fine-tuned using the LLaMA-Factory framework with a LoRA (Low-Rank Adaptation) strategy. Training ran for 3 epochs with a batch size of 8 on four A100 GPUs (80GB each).
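To give a sense of what these settings imply, the arithmetic can be sketched as below. The LoRA rank, the layer dimensions, and the dataset size are not given in the source and are purely hypothetical placeholders. The sketch also assumes that the batch size of 8 is the global batch size, which the source does not specify.

```python
import math


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two low-rank factors, A (rank x d_in) and B (d_out x rank),
    so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


def optimizer_steps(num_examples, batch_size, epochs):
    """Total optimizer steps, assuming one step per batch and a final
    partial batch that is kept rather than dropped."""
    return math.ceil(num_examples / batch_size) * epochs


# Hypothetical values for illustration only (not from the model card).
adapter_params = lora_param_count(d_in=4096, d_out=4096, rank=8)
steps = optimizer_steps(num_examples=1000, batch_size=8, epochs=3)
```

Because LoRA trains only the low-rank factors while the base weights stay frozen, the trainable parameter count per adapted layer is a small fraction of the full `d_in * d_out` matrix. That is the property that makes fine-tuning a 34B model feasible on a small number of GPUs.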