CHIH-HUNG/llama-2-13b-FINETUNE2_3w-q_k_v_o_proj

Text generation · Model size: 13B · Quantization: FP8 · Context length: 4k · Published: Sep 2, 2023 · License: llama2 · Architecture: Transformer · Open weights

CHIH-HUNG/llama-2-13b-FINETUNE2_3w-q_k_v_o_proj is a 13 billion parameter Llama-2-based language model fine-tuned by CHIH-HUNG using the huangyt/FINETUNE2 dataset, comprising approximately 30,000 data entries. This model was fine-tuned with LoRA targeting the q_proj, k_proj, v_proj, and o_proj layers. It demonstrates competitive performance across benchmarks like ARC, HellaSwag, MMLU, and TruthfulQA compared to its base Llama-2-13b counterpart, with a context length of 4096 tokens.


Model Overview

CHIH-HUNG/llama-2-13b-FINETUNE2_3w-q_k_v_o_proj is a 13 billion parameter language model built upon the meta-llama/Llama-2-13b-hf architecture. It was fine-tuned by CHIH-HUNG using the huangyt/FINETUNE2 dataset, which contains approximately 30,000 training examples.

Fine-Tuning Details

The fine-tuning process utilized LoRA (Low-Rank Adaptation) with a rank of 8, specifically targeting the q_proj, k_proj, v_proj, and o_proj attention projection layers. Training was conducted for 1 epoch with a learning rate of 5e-5, using bf16 precision and load_in_4bit quantization. The training loss achieved was 0.65 over a runtime of approximately 3 hours and 33 minutes.
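As a rough sanity check, the number of trainable parameters that rank-8 LoRA adds over these four projections can be computed from the base model's dimensions. The per-layer figures below are assumptions drawn from the standard Llama-2-13b architecture (hidden size 5120, 40 decoder layers), not from the fine-tuning repository itself:

```python
# Estimate trainable parameters for rank-8 LoRA on q/k/v/o projections.
# Assumes Llama-2-13b dimensions: hidden_size=5120, 40 layers, and that
# all four attention projections map 5120 -> 5120 (true for the 13B base).
HIDDEN = 5120
LAYERS = 40
RANK = 8
TARGETS = 4  # q_proj, k_proj, v_proj, o_proj

# Each adapted matrix gains two low-rank factors: A (r x d_in) and B (d_out x r).
params_per_matrix = RANK * (HIDDEN + HIDDEN)
trainable = params_per_matrix * TARGETS * LAYERS
print(f"{trainable:,} trainable LoRA parameters")  # 13,107,200 (~0.1% of 13B)
```

Only this small fraction of weights is updated during training, which is what makes the combination of LoRA with load_in_4bit quantization feasible on modest hardware.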

Performance Benchmarks

Evaluation against the HuggingFaceH4/open_llm_leaderboard benchmarks shows the model's performance relative to the base Llama-2-13b model. Its average score is slightly higher than the base model's, with clear gains on HellaSwag and TruthfulQA: it achieved 82.47 on HellaSwag and 37.92 on TruthfulQA, versus 80.97 and 34.17 respectively for the base Llama-2-13b-hf.
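The gains over the base checkpoint quoted above work out as follows (only the two benchmark pairs stated on this card are reproduced; the remaining leaderboard scores are omitted here):

```python
# Score deltas vs. meta-llama/Llama-2-13b-hf, using the figures quoted above.
scores = {
    "HellaSwag":  {"finetuned": 82.47, "base": 80.97},
    "TruthfulQA": {"finetuned": 37.92, "base": 34.17},
}

for bench, s in scores.items():
    delta = round(s["finetuned"] - s["base"], 2)
    print(f"{bench}: +{delta}")  # HellaSwag: +1.5, TruthfulQA: +3.75
```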

Use Cases

This model is suitable for tasks requiring a Llama-2-13b variant that has undergone specific fine-tuning on a custom dataset, potentially offering specialized knowledge or improved performance in areas covered by the huangyt/FINETUNE2 dataset. Developers can leverage its fine-tuned capabilities for applications where a 13B parameter model with a 4096-token context window is appropriate.
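A minimal loading sketch using the Hugging Face transformers stack is shown below. Only the model id and the 4096-token context window come from this card; the 4-bit loading, dtype, and generation settings are illustrative assumptions:

```python
MODEL_ID = "CHIH-HUNG/llama-2-13b-FINETUNE2_3w-q_k_v_o_proj"
MAX_CTX = 4096  # context window stated on this card


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model in 4-bit and generate a completion (downloads 13B weights)."""
    # Imports kept local so the sketch can be read without the heavy deps installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        load_in_4bit=True,           # mirrors the quantization used during fine-tuning
        torch_dtype=torch.bfloat16,  # matches the bf16 training precision
        device_map="auto",
    )
    # Truncate the prompt so prompt + new tokens stay inside the 4k window.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_CTX - max_new_tokens,
    ).to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Since the LoRA weights were merged into a full checkpoint, no separate PEFT adapter loading step is needed; the model loads like any other Llama-2 variant.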