KaeriJenti/Kaori-34B-v1

TEXT GENERATIONConcurrency Cost:2Model Size:34BQuant:FP8Ctx Length:32kPublished:Dec 19, 2023License:llama2Architecture:Transformer Open Weights Cold

Kaori-34B-v1 is a 34 billion parameter language model developed by Kaeri and Jenti, fine-tuned using LoRA on a combination of Open-Platypus and Dolphin datasets. This model is specifically optimized for general instruction following, with a focus on avoiding data contamination from common benchmark tasks. It offers a 32768 token context length, making it suitable for applications requiring extensive conversational memory or document processing.

Loading preview...

Kaori-34B-v1: An Instruction-Tuned Language Model

Kaori-34B-v1 is a 34 billion parameter language model developed by Kaeri and Jenti, fine-tuned using the LoRA method. It leverages a strategic combination of the Open-Platypus and Dolphin datasets, with a primary focus on Open-Platypus data (100%) supplemented by 5% Dolphin data, to enhance its instruction-following capabilities.

Key Characteristics

  • Fine-tuning Strategy: Utilizes Supervised Fine-Tuning (SFT) with LoRA for efficient adaptation.
  • Data Contamination Prevention: The training process meticulously filtered out samples corresponding to common benchmark tasks like GSM8k, DROP, WinoGrande, ARC, and HellaSwag to ensure robust and unbiased performance evaluation.
  • Training Environment: Trained over 3 epochs on A100x4 (80GB) GPUs with a batch size of 8, using the LLaMA-Factory framework.
  • Context Length: Supports a substantial context window of 32768 tokens, enabling it to handle longer inputs and maintain coherence over extended interactions.

Ideal Use Cases

  • General Instruction Following: Excels in tasks requiring precise adherence to given instructions.
  • Conversational AI: Its large context window makes it suitable for maintaining long-form conversations.
  • Applications Requiring Benchmark Robustness: Designed to perform well on tasks without being overfit to common benchmarks, offering a more generalized understanding.