CHIH-HUNG/llama-2-13b-FINETUNE3_3.3w-r4-q_k_v_o
CHIH-HUNG/llama-2-13b-FINETUNE3_3.3w-r4-q_k_v_o is a 13 billion parameter Llama-2-based language model fine-tuned by CHIH-HUNG using the huangyt/FINETUNE3 dataset, comprising approximately 33,000 data points. This model leverages LoRA (rank 16) for efficient fine-tuning on q_proj, k_proj, v_proj, and o_proj layers, and is optimized for general language understanding tasks. It demonstrates competitive performance across benchmarks like ARC, HellaSwag, MMLU, and TruthfulQA, making it suitable for applications requiring robust reasoning and knowledge recall.
Model Overview
This model, CHIH-HUNG/llama-2-13b-FINETUNE3_3.3w-r4-q_k_v_o, is a 13 billion parameter language model built upon the Llama-2 architecture. It was fine-tuned by CHIH-HUNG using the huangyt/FINETUNE3 dataset, which contains approximately 33,000 training examples. The fine-tuning process utilized LoRA (Low-Rank Adaptation) with a rank of 16, targeting the q_proj, k_proj, v_proj, and o_proj attention layers.
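As a quick usage sketch (assuming the merged weights are hosted under this repo ID and that your environment has transformers, accelerate, and a GPU with enough memory), the model loads like any other Llama-2 checkpoint:

```python
# Minimal inference sketch; adjust dtype/device settings for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CHIH-HUNG/llama-2-13b-FINETUNE3_3.3w-r4-q_k_v_o"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 matches the training precision noted below
    device_map="auto",
)

prompt = "Explain the difference between supervised and unsupervised learning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```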
Fine-Tuning Details
- Base Model: meta-llama/Llama-2-13b-hf
- Dataset: huangyt/FINETUNE3 (approx. 33,000 entries)
- PEFT Type: LoRA (rank 16)
- Target Layers: q_proj, k_proj, v_proj, o_proj
- Training: 1 epoch, bf16 precision, load_in_4bit quantization, 4e-4 learning rate
- Training Loss: 0.579
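The snippet below is a hedged sketch of a LoRA setup matching the hyperparameters listed above, not the author's original training script; the LoRA alpha, dropout, and the training loop itself are assumptions.

```python
# Sketch: reproduce the listed fine-tuning configuration with peft + bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Llama-2-13b-hf"

# load_in_4bit quantization, as listed in the training details
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA rank 16 on the four attention projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,            # assumption: alpha is not stated in the card
    lora_dropout=0.05,        # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Training: 1 epoch, bf16, learning rate 4e-4 -- pass these to your Trainer
# (e.g. trl's SFTTrainer) together with the huangyt/FINETUNE3 dataset.
```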
Performance Benchmarks
The model was evaluated on four key benchmarks: ARC, HellaSwag, MMLU, and TruthfulQA. Local evaluations (run with load_in_8bit) give an average score of 56.29, with 54.27 on ARC, 79.42 on HellaSwag, 51.90 on MMLU, and 39.58 on TruthfulQA. On the HuggingFaceH4/open_llm_leaderboard, this configuration achieved an average score of 58.34.
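For reference, an 8-bit load matching the local evaluation setting can be configured as below; this is only a sketch, and the exact evaluation harness and task configuration used for these scores are not specified in the card.

```python
# Sketch: load the model in 8-bit, mirroring the load_in_8bit setting used
# for the local benchmark runs. The evaluation harness itself is not shown.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "CHIH-HUNG/llama-2-13b-FINETUNE3_3.3w-r4-q_k_v_o"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```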
Use Cases
This model is suitable for general-purpose language tasks where a Llama-2 13B base model fine-tuned on a diverse dataset is beneficial. Its performance across multiple benchmarks suggests capabilities in reasoning, common sense, and factual recall.