CHIH-HUNG/llama-2-13b-FINETUNE1_17w-r4

TEXT GENERATION

  • Concurrency Cost: 1
  • Model Size: 13B
  • Quant: FP8
  • Ctx Length: 4k
  • Published: Sep 7, 2023
  • License: llama2
  • Architecture: Transformer
  • Open Weights

CHIH-HUNG/llama-2-13b-FINETUNE1_17w-r4 is a 13 billion parameter Llama-2-based language model fine-tuned by CHIH-HUNG using the huangyt/FINETUNE1 dataset, comprising approximately 170,000 data points. This model was fine-tuned with LoRA (rank 4) targeting gate_proj, up_proj, and down_proj layers. It demonstrates improved performance on benchmarks like HellaSwag, MMLU, and TruthfulQA compared to the base Llama-2-13b model, making it suitable for general language understanding and generation tasks.


Model Overview

CHIH-HUNG/llama-2-13b-FINETUNE1_17w-r4 is a 13 billion parameter language model built upon the Llama-2-13b architecture. It has been fine-tuned by CHIH-HUNG using the huangyt/FINETUNE1 dataset, which contains approximately 170,000 training examples. The fine-tuning process utilized LoRA (Low-Rank Adaptation) with a rank of 4, specifically targeting the gate_proj, up_proj, and down_proj feed-forward (MLP) projection layers.
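To see why rank-4 LoRA on these three projections is so lightweight, the adapter parameter counts can be sketched as below. The hidden size of 5120, intermediate size of 13824, and 40 decoder layers are taken from the standard Llama-2-13b configuration; they are assumptions, not figures stated in this card.

```python
# Sketch: trainable-parameter count for rank-4 LoRA adapters on
# gate_proj, up_proj, and down_proj, using assumed Llama-2-13b
# dimensions: hidden_size=5120, intermediate_size=13824, 40 layers.

HIDDEN, INTERMEDIATE, LAYERS, RANK = 5120, 13824, 40, 4

# (in_features, out_features) for each targeted projection per layer.
targets = {
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(in_f, out_f, r):
    """A rank-r LoRA pair replaces a frozen (in_f x out_f) weight
    update with A (in_f x r) and B (r x out_f): r * (in_f + out_f)
    trainable parameters."""
    return r * (in_f + out_f)

full = sum(i * o for i, o in targets.values()) * LAYERS
lora = sum(lora_params(i, o, RANK) for i, o in targets.values()) * LAYERS

print(f"full fine-tune params for these layers: {full:,}")
print(f"LoRA (r=4) trainable params:            {lora:,}")
print(f"trainable fraction: {lora / full:.4%}")
```

Under these assumed dimensions, the rank-4 adapters train roughly 9 million parameters instead of the ~8.5 billion in the targeted projection matrices, around 0.1% of them, which is what makes single-GPU fine-tuning feasible.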

Fine-Tuning Details

The model was trained for one epoch on a single RTX 4090 GPU, with a per_device_train_batch_size of 8 and gradient_accumulation_steps of 8 (an effective batch size of 64). A learning rate of 5e-5 was used, with training conducted in bf16 precision and the base model loaded in 4-bit quantization (load_in_4bit). The final training loss was 0.66 over a runtime of approximately 16 hours and 22 minutes.
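A minimal sketch of how this setup might be reproduced with the Hugging Face transformers and peft libraries. The hyperparameters (rank 4, target modules, learning rate 5e-5, batch size 8 with gradient accumulation 8, bf16, 4-bit loading, one epoch) come from the card; the output directory and all other choices are assumptions, and dataset preprocessing is omitted.

```python
# Sketch only, assuming transformers + peft; not the author's exact script.
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",
    load_in_4bit=True,           # 4-bit base weights, as reported
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=4,                         # rank 4, as reported
    target_modules=["gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

args = TrainingArguments(
    output_dir="llama-2-13b-FINETUNE1_17w-r4",  # assumed name
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,  # effective batch size 64
    learning_rate=5e-5,
    bf16=True,
)
# A Trainer (or trl's SFTTrainer) over the huangyt/FINETUNE1 dataset
# (~170,000 examples) would then run the single-epoch fine-tune.
```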

Performance Evaluation

Evaluation against the HuggingFaceH4/open_llm_leaderboard benchmarks shows that this fine-tuned model generally outperforms the base meta-llama/Llama-2-13b-hf model across several metrics. Specifically, it achieved an average score of 58.71, with notable scores in:

  • HellaSwag: 82.27
  • MMLU: 56.18
  • TruthfulQA: 39.65

These results indicate an improvement in common sense reasoning, multi-task language understanding, and truthfulness compared to the original Llama-2-13b, making it a robust option for various language-based applications.