CHIH-HUNG/llama-2-13b-FINETUNE3_3.3w-r4-q_k_v_o
CHIH-HUNG/llama-2-13b-FINETUNE3_3.3w-r4-q_k_v_o is a 13 billion parameter Llama-2-based language model fine-tuned by CHIH-HUNG using the huangyt/FINETUNE3 dataset, comprising approximately 33,000 data points. This model leverages LoRA (rank 16) for efficient fine-tuning on q_proj, k_proj, v_proj, and o_proj layers, and is optimized for general language understanding tasks. It demonstrates competitive performance across benchmarks like ARC, HellaSwag, MMLU, and TruthfulQA, making it suitable for applications requiring robust reasoning and knowledge recall.
Model Overview
This model, CHIH-HUNG/llama-2-13b-FINETUNE3_3.3w-r4-q_k_v_o, is a 13 billion parameter language model built upon the Llama-2 architecture. It was fine-tuned by CHIH-HUNG using the huangyt/FINETUNE3 dataset, which contains approximately 33,000 training examples. The fine-tuning process utilized LoRA (Low-Rank Adaptation) with a rank of 16, targeting the q_proj, k_proj, v_proj, and o_proj attention layers.
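As a quick usage sketch (assuming the merged weights are hosted under this repo ID and that your environment has transformers, accelerate, and a GPU with enough memory), the model loads like any other Llama-2 checkpoint:

```python
# Minimal inference sketch; adjust dtype/device settings for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CHIH-HUNG/llama-2-13b-FINETUNE3_3.3w-r4-q_k_v_o"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 matches the training precision noted below
    device_map="auto",
)

prompt = "Explain the difference between supervised and unsupervised learning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```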
Fine-Tuning Details
- Base Model: meta-llama/Llama-2-13b-hf
- Dataset: huangyt/FINETUNE3 (approx. 33,000 entries)
- PEFT Type: LoRA (rank 16)
- Target Layers: q_proj, k_proj, v_proj, o_proj
- Training: 1 epoch, bf16 precision, load_in_4bit quantization, 4e-4 learning rate
- Training Loss: 0.579
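The snippet below is a hedged sketch of a LoRA setup matching the hyperparameters listed above, not the author's original training script; the LoRA alpha, dropout, and the training loop itself are assumptions.

```python
# Sketch: reproduce the listed fine-tuning configuration with peft + bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Llama-2-13b-hf"

# load_in_4bit quantization, as listed in the training details
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA rank 16 on the four attention projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,            # assumption: alpha is not stated in the card
    lora_dropout=0.05,        # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Training: 1 epoch, bf16, learning rate 4e-4 -- pass these to your Trainer
# (e.g. trl's SFTTrainer) together with the huangyt/FINETUNE3 dataset.
```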
Performance Benchmarks
The model was evaluated on four key benchmarks: ARC, HellaSwag, MMLU, and TruthfulQA. Local evaluations (run with load_in_8bit) give an average score of 56.29, with 54.27 on ARC, 79.42 on HellaSwag, 51.90 on MMLU, and 39.58 on TruthfulQA. On the HuggingFaceH4/open_llm_leaderboard, this configuration achieved an average score of 58.34.
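For reference, an 8-bit load matching the local evaluation setting can be configured as below; this is only a sketch, and the exact evaluation harness and task configuration used for these scores are not specified in the card.

```python
# Sketch: load the model in 8-bit, mirroring the load_in_8bit setting used
# for the local benchmark runs. The evaluation harness itself is not shown.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "CHIH-HUNG/llama-2-13b-FINETUNE3_3.3w-r4-q_k_v_o"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```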
Use Cases
This model is suitable for general-purpose language tasks where a Llama-2 13B base model fine-tuned on a diverse dataset is beneficial. Its performance across multiple benchmarks suggests capabilities in reasoning, common sense, and factual recall.