CHIH-HUNG/llama-2-13b-FINETUNE2_3w-gate_up_down_proj

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Published: Sep 1, 2023 · License: llama2 · Architecture: Transformer · Open Weights

CHIH-HUNG/llama-2-13b-FINETUNE2_3w-gate_up_down_proj is a 13-billion-parameter Llama-2-based language model fine-tuned by CHIH-HUNG on the huangyt/FINETUNE2 dataset of approximately 30,000 examples. The LoRA fine-tuning specifically targets the gate_proj, up_proj, and down_proj MLP (feed-forward) projection layers. It outperforms the base Llama-2-13b model on the MMLU and TruthfulQA benchmarks, making it suitable for tasks that benefit from improved reasoning and factual accuracy.


Model Overview

CHIH-HUNG/llama-2-13b-FINETUNE2_3w-gate_up_down_proj is a 13 billion parameter language model built upon the meta-llama/Llama-2-13b-hf architecture. It has been fine-tuned by CHIH-HUNG using the huangyt/FINETUNE2 dataset, which consists of approximately 30,000 training examples.
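Loading the model should follow the standard transformers workflow for Llama-2 checkpoints. The snippet below is a minimal sketch, assuming the repository ships a merged checkpoint in the usual Llama-2 layout; the prompt and generation settings are illustrative.

```python
# Minimal loading sketch via the Hugging Face transformers API; assumes the
# repo contains a merged checkpoint in the standard Llama-2 layout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CHIH-HUNG/llama-2-13b-FINETUNE2_3w-gate_up_down_proj"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 matches the training precision noted below
    device_map="auto",           # requires accelerate; shards across available GPUs
)

prompt = "Briefly explain what LoRA fine-tuning changes in a transformer."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```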

Fine-Tuning Details

The fine-tuning process used LoRA (Low-Rank Adaptation) with a rank of 8, targeting the gate_proj, up_proj, and down_proj MLP projection layers (the feed-forward block, not the attention projections). Training ran for one epoch on a single RTX 4090 GPU, reaching a train_loss of 0.614. The base model was loaded with load_in_4bit quantization and trained in bf16 precision.
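For readers who want to replicate the setup, the configuration below is a hedged sketch of an equivalent QLoRA-style run using peft and bitsandbytes. The rank, target modules, 4-bit loading, and bf16 compute dtype come from the details above; lora_alpha and lora_dropout are common defaults assumed for illustration, not the author's published values.

```python
# Sketch of a comparable LoRA setup with peft + bitsandbytes 4-bit loading.
# Rank and target modules match the model card; lora_alpha, lora_dropout, and
# everything else here are illustrative assumptions, not the exact config.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Llama-2-13b-hf"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # matches load_in_4bit in the card
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the bf16 training precision
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)

lora_config = LoraConfig(
    r=8,                                                   # LoRA rank from the card
    target_modules=["gate_proj", "up_proj", "down_proj"],  # MLP projections only
    lora_alpha=16,       # assumption: common default, not stated in the card
    lora_dropout=0.05,   # assumption
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```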

Performance Highlights

On the HuggingFaceH4/open_llm_leaderboard benchmark suite, the fine-tuned model achieves an average score of 58.65. Notably, it surpasses the base meta-llama/Llama-2-13b-hf on MMLU (55.57 vs. 54.34) and TruthfulQA (39.19 vs. 34.17), indicating gains in multi-task language understanding and factual consistency.
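These scores can in principle be checked locally with EleutherAI's lm-evaluation-harness, which underlies the Open LLM Leaderboard. The sketch below is illustrative: the task names and few-shot settings follow the harness's current defaults and may not match the leaderboard's exact configuration.

```python
# Hedged evaluation sketch with lm-evaluation-harness (pip install lm-eval).
# Task names and settings here are assumptions; the leaderboard's harness
# version and few-shot configuration may differ.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=CHIH-HUNG/llama-2-13b-FINETUNE2_3w-gate_up_down_proj,"
        "dtype=bfloat16"
    ),
    tasks=["mmlu", "truthfulqa_mc2"],  # MMLU and TruthfulQA (MC2 variant)
    batch_size=4,
)
print(results["results"])
```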

Good for

  • Applications requiring enhanced reasoning and factual recall over the base Llama-2-13b model.
  • Tasks where the specific fine-tuning on the huangyt/FINETUNE2 dataset aligns with the domain or style of desired outputs.
  • Developers looking for a Llama-2 variant with targeted improvements in specific benchmark categories.