CHIH-HUNG/llama-2-13b-FINETUNE5_4w-r4-q_k_v_o

Task: Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quantization: FP8 · Context Length: 4k · Published: Oct 1, 2023 · License: llama2 · Architecture: Transformer · Open Weights

CHIH-HUNG/llama-2-13b-FINETUNE5_4w-r4-q_k_v_o is a 13 billion parameter Llama-2-based causal language model developed by CHIH-HUNG. It was fine-tuned using LoRA on the huangyt/FINETUNE5 dataset, comprising approximately 40,000 data points. This model demonstrates competitive performance across benchmarks like ARC, HellaSwag, MMLU, and TruthfulQA, making it suitable for general language understanding and generation tasks.


Overview

This model, CHIH-HUNG/llama-2-13b-FINETUNE5_4w-r4-q_k_v_o, is a 13-billion-parameter variant of the Llama-2 architecture, fine-tuned by CHIH-HUNG using LoRA (Low-Rank Adaptation) on the huangyt/FINETUNE5 dataset of approximately 40,000 training examples. Training ran on an RTX 4090 GPU with a LoRA rank of 16 applied to the q_proj, k_proj, v_proj, and o_proj projection layers, a learning rate of 4e-4, and a single epoch in bf16 precision with 4-bit quantization.
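A quick back-of-the-envelope calculation shows why this setup fits on a single consumer GPU. The sketch below is illustrative: the hidden size of 5120 and layer count of 40 are the standard Llama-2-13b values (not stated in this card), while the rank of 16 and the four target projections come from the description above. A LoRA adapter replaces a full d×d weight update with two low-rank factors B (d×r) and A (r×d), so only 2·d·r parameters are trained per target matrix.

```python
# Rough LoRA parameter-count sketch for Llama-2-13b.
# Assumed architecture constants (standard Llama-2-13b, not from the card):
HIDDEN = 5120          # hidden size
LAYERS = 40            # transformer layers
# Values from the card:
RANK = 16              # LoRA rank
TARGETS = 4            # q_proj, k_proj, v_proj, o_proj

# Per target matrix: full update is HIDDEN*HIDDEN params,
# the LoRA factors B (HIDDEN x RANK) and A (RANK x HIDDEN) are 2*HIDDEN*RANK.
full_per_matrix = HIDDEN * HIDDEN
lora_per_matrix = 2 * HIDDEN * RANK

trainable = lora_per_matrix * TARGETS * LAYERS
frozen_equiv = full_per_matrix * TARGETS * LAYERS

print(f"trainable LoRA params: {trainable:,}")       # ~26M
print(f"full-matrix params:    {frozen_equiv:,}")    # ~4.2B
print(f"fraction trained:      {trainable / frozen_equiv:.3%}")
```

At rank 16, the adapter trains well under 1% of the parameters in the targeted projections, which, combined with 4-bit quantization of the frozen base weights, is what makes single-RTX 4090 fine-tuning feasible.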

Performance Benchmarks

The model's performance was evaluated against the base Llama-2-13b across four standard benchmarks, with scores measured locally using 8-bit quantization:

  • Average Score: 56.09
  • ARC: 54.35
  • HellaSwag: 79.24
  • MMLU: 54.01
  • TruthfulQA: 36.75
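The reported average is simply the mean of the four benchmark scores, which can be checked directly:

```python
# Verify the reported Average Score from the four benchmark results above.
scores = {
    "ARC": 54.35,
    "HellaSwag": 79.24,
    "MMLU": 54.01,
    "TruthfulQA": 36.75,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # the reported 56.09 is the rounded mean (56.0875)
```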

These results indicate solid capabilities in reasoning (ARC), commonsense inference (HellaSwag), broad knowledge (MMLU), and factual truthfulness (TruthfulQA). The fine-tuning run reached a training loss of 0.579 in approximately 4 hours and 6 minutes using DeepSpeed.
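The card does not include the actual DeepSpeed configuration, so the fragment below is only a minimal sketch consistent with the described setup (bf16 training on a single GPU); the batch sizes, ZeRO stage, and other settings used for this model are unknown and assumed here.

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "bf16": { "enabled": true },
  "zero_optimization": { "stage": 2 }
}
```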

Key Differentiators

  • Targeted Fine-tuning: Specialized LoRA fine-tuning on a custom dataset (huangyt/FINETUNE5) for potentially improved performance on tasks aligned with the dataset's content.
  • Efficient Training: Utilizes LoRA with 4-bit quantization and bf16 precision, making it efficient for deployment and further fine-tuning on consumer-grade hardware.

Recommended Use Cases

This model is suitable for applications that want a Llama-2-13b base with enhanced performance on tasks similar to those in the huangyt/FINETUNE5 dataset. It can be used for general text generation, question answering, and language-understanding tasks where a balance between model size and performance is desired.