KaeriJenti/Kaori-34b-v2

Text Generation | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: Dec 21, 2023 | License: llama2 | Architecture: Transformer | Open Weights | Cold

Kaori-34b-v2 is a 34 billion parameter language model developed by Kaeri and Jenti, fine-tuned using the LoRA method on a mix of datasets including Open-Platypus, Dolphin, and OpenOrca. The training data was filtered to remove contamination from common benchmark tasks such as GSM8k and ARC. The model is intended for general language generation tasks where data integrity and avoidance of benchmark contamination are priorities.

Kaori-34b-v2: A Contamination-Filtered 34B Language Model

Kaori-34b-v2 is a 34 billion parameter language model developed by Kaeri and Jenti, distinguished by its rigorous approach to data cleanliness during fine-tuning. The model was trained using the LoRA method over 3 epochs on A100 GPUs.
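As a rough illustration of what the LoRA method does, the sketch below adds a low-rank update B·A, scaled by alpha / rank, to a frozen weight matrix W. The matrix sizes, rank, and alpha are toy assumptions chosen for readability; the model card does not state the LoRA hyperparameters used for Kaori-34b-v2.

```python
import random


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A). W stays frozen; only A and B are trained."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Hypothetical toy sizes; real transformer layers are thousands of units wide.
d_out, d_in, rank, alpha = 8, 8, 2, 4.0
rng = random.Random(0)

w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen weights
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # A: rank x d_in
b = [[0.0] * rank for _ in range(d_out)]                              # B: d_out x rank, zero-initialised

w_eff = lora_effective_weight(w, a, b, alpha, rank)

full_params = d_out * d_in           # parameters updated by full fine-tuning
lora_params = rank * (d_in + d_out)  # parameters updated by LoRA
```

Because B starts at zero, the adapted layer initially behaves exactly like the base layer, and training only has to update the two small matrices rather than the full weight.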

Key Characteristics

  • Fine-tuning Datasets: Supervised Fine-Tuning (SFT) on a blend of the Open-Platypus (100%), Dolphin (5%), and OpenOrca (10%) datasets.
  • Contamination Filtering: The training data was similarity-filtered against common benchmark tasks such as GSM8k, ARC, Winogrande, and HellaSwag, so that benchmark items do not leak into fine-tuning and scores on those benchmarks remain meaningful.
  • Training Framework: Fine-tuned using the LLaMA-Factory framework.
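The model card does not say which similarity measure was used for contamination filtering. As one possible approach, the sketch below drops any training sample whose word trigrams overlap too heavily with a benchmark item, measured by Jaccard similarity. The function names, the threshold of 0.5, and the example strings are all hypothetical and serve only to illustrate the idea.

```python
def ngrams(text, n=3):
    """Return the set of lowercase word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_contaminated(train_samples, benchmark_items, threshold=0.5, n=3):
    """Keep only training samples whose similarity to every benchmark item is below threshold."""
    bench_grams = [ngrams(item, n) for item in benchmark_items]
    kept = []
    for sample in train_samples:
        grams = ngrams(sample, n)
        if all(jaccard(grams, bg) < threshold for bg in bench_grams):
            kept.append(sample)
    return kept


# Made-up strings for illustration; these are not taken from any real benchmark.
benchmark = ["A farmer has 12 apples and gives away 5 apples how many remain"]
train = [
    "A farmer has 12 apples and gives away 5 apples how many remain in total",
    "Explain how photosynthesis converts light into chemical energy",
]

kept = filter_contaminated(train, benchmark)
```

Here the first training sample is a near-copy of the benchmark item and is removed, while the unrelated second sample is kept. A real pipeline would likely use a more robust measure (for example embedding similarity) and tune the threshold per benchmark.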

Good For

  • Applications requiring a large language model with a strong emphasis on clean training data, free from common benchmark contamination.
  • General language generation and understanding tasks where the integrity of evaluation against standard benchmarks is crucial.