CHIH-HUNG/llama-2-13b-FINETUNE2_3w-gate_up_down_proj

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Published: Sep 1, 2023 · License: llama2 · Architecture: Transformer · Open Weights

CHIH-HUNG/llama-2-13b-FINETUNE2_3w-gate_up_down_proj is a 13-billion-parameter Llama-2-based language model fine-tuned by CHIH-HUNG on the huangyt/FINETUNE2 dataset of approximately 30,000 examples. The LoRA fine-tuning specifically targets the gate_proj, up_proj, and down_proj MLP (feed-forward) projection layers. It outperforms the base Llama-2-13b model on the MMLU and TruthfulQA benchmarks, making it suitable for tasks that benefit from improved reasoning and factual accuracy.


Model Overview

CHIH-HUNG/llama-2-13b-FINETUNE2_3w-gate_up_down_proj is a 13 billion parameter language model built upon the meta-llama/Llama-2-13b-hf architecture. It has been fine-tuned by CHIH-HUNG using the huangyt/FINETUNE2 dataset, which consists of approximately 30,000 training examples.
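Loading the model should follow the standard transformers workflow for Llama-2 checkpoints. The snippet below is a minimal sketch, assuming the repository ships a merged checkpoint in the usual Llama-2 layout; the prompt and generation settings are illustrative.

```python
# Minimal loading sketch via the Hugging Face transformers API; assumes the
# repo contains a merged checkpoint in the standard Llama-2 layout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CHIH-HUNG/llama-2-13b-FINETUNE2_3w-gate_up_down_proj"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 matches the training precision noted below
    device_map="auto",           # requires accelerate; shards across available GPUs
)

prompt = "Briefly explain what LoRA fine-tuning changes in a transformer."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```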

Fine-Tuning Details

The fine-tuning process used LoRA (Low-Rank Adaptation) with a rank of 8, targeting the gate_proj, up_proj, and down_proj MLP projection layers (the feed-forward block, not the attention projections). Training ran for one epoch on a single RTX 4090 GPU, reaching a train_loss of 0.614. The base model was loaded with load_in_4bit quantization and trained in bf16 precision.
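For readers who want to replicate the setup, the configuration below is a hedged sketch of an equivalent QLoRA-style run using peft and bitsandbytes. The rank, target modules, 4-bit loading, and bf16 compute dtype come from the details above; lora_alpha and lora_dropout are common defaults assumed for illustration, not the author's published values.

```python
# Sketch of a comparable LoRA setup with peft + bitsandbytes 4-bit loading.
# Rank and target modules match the model card; lora_alpha, lora_dropout, and
# everything else here are illustrative assumptions, not the exact config.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Llama-2-13b-hf"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # matches load_in_4bit in the card
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the bf16 training precision
)

base_model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)

lora_config = LoraConfig(
    r=8,                                                   # LoRA rank from the card
    target_modules=["gate_proj", "up_proj", "down_proj"],  # MLP projections only
    lora_alpha=16,       # assumption: common default, not stated in the card
    lora_dropout=0.05,   # assumption
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```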

Performance Highlights

On the HuggingFaceH4/open_llm_leaderboard benchmark suite, the fine-tuned model achieves an average score of 58.65. Notably, it surpasses the base meta-llama/Llama-2-13b-hf on MMLU (55.57 vs. 54.34) and TruthfulQA (39.19 vs. 34.17), indicating gains in multi-task language understanding and factual consistency.
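These scores can in principle be checked locally with EleutherAI's lm-evaluation-harness, which underlies the Open LLM Leaderboard. The sketch below is illustrative: the task names and few-shot settings follow the harness's current defaults and may not match the leaderboard's exact configuration.

```python
# Hedged evaluation sketch with lm-evaluation-harness (pip install lm-eval).
# Task names and settings here are assumptions; the leaderboard's harness
# version and few-shot configuration may differ.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=CHIH-HUNG/llama-2-13b-FINETUNE2_3w-gate_up_down_proj,"
        "dtype=bfloat16"
    ),
    tasks=["mmlu", "truthfulqa_mc2"],  # MMLU and TruthfulQA (MC2 variant)
    batch_size=4,
)
print(results["results"])
```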

Good for

  • Applications requiring enhanced reasoning and factual recall over the base Llama-2-13b model.
  • Tasks where the specific fine-tuning on the huangyt/FINETUNE2 dataset aligns with the domain or style of desired outputs.
  • Developers looking for a Llama-2 variant with targeted improvements in specific benchmark categories.