CHIH-HUNG/llama-2-13b-FINETUNE1_17w-q_k_v_o_proj

Text generation · 13B parameters · FP8 quantization · 4k context · Published: Sep 3, 2023 · License: llama2 · Architecture: Transformer · Open weights

CHIH-HUNG/llama-2-13b-FINETUNE1_17w-q_k_v_o_proj is a 13-billion-parameter language model based on Llama-2, fine-tuned by CHIH-HUNG. It was trained on the huangyt/FINETUNE1 dataset of approximately 170,000 examples using LoRA targeting the q_proj, k_proj, v_proj, and o_proj attention projection layers. The model improves on the base Llama-2-13b across benchmarks such as ARC, HellaSwag, MMLU, and TruthfulQA, making it suitable for general language understanding and generation tasks.


Overview

This model, CHIH-HUNG/llama-2-13b-FINETUNE1_17w-q_k_v_o_proj, is a fine-tuned variant of the 13 billion parameter Llama-2 base model. Developed by CHIH-HUNG, it leverages the huangyt/FINETUNE1 dataset, which consists of approximately 170,000 data entries, to enhance its capabilities.

Fine-Tuning Details

Fine-tuning was performed on a single RTX 4090 GPU using LoRA (Low-Rank Adaptation) with rank 8, targeting the q_proj, k_proj, v_proj, and o_proj attention projection layers. Training ran for 1 epoch with a learning rate of 5e-5, using bf16 precision and 4-bit quantization of the base weights for memory efficiency. The final training loss was 0.688 after a runtime of 15 hours and 44 minutes.
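To illustrate what a rank-8 LoRA update does to one of these projection layers, the sketch below merges a low-rank delta into a frozen weight matrix in plain Python. Shapes are toy-sized (the real q/k/v/o projections in Llama-2-13b are 5120×5120), and the `alpha` value is an assumption for illustration only, since the card does not state it:

```python
import random

def matmul(A, B):
    """Naive matrix multiply, for illustration only."""
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

def merge_lora(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged LoRA weight.

    W: (d_out, d_in) frozen base weight
    B: (d_out, r) adapter, initialized to zero
    A: (r, d_in) adapter, randomly initialized
    """
    scale = alpha / r
    delta = matmul(B, A)
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Toy dimensions; rank 8 matches the card, alpha=16 is assumed.
d, r, alpha = 16, 8, 16
random.seed(0)
W = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(d)]
A = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(r)]
B = [[0.0] * r for _ in range(d)]  # zero init, so the delta starts at zero

W_merged = merge_lora(W, A, B, alpha, r)
# With B = 0 the merge is a no-op: the adapted model starts out
# identical to the base model, and only A and B are trained.
```

Because only the two small adapter matrices (d×r and r×d per layer) are trained, this is what makes fine-tuning a 13B model feasible on a single RTX 4090.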

Performance Benchmarks

Evaluations against the HuggingFaceH4/open_llm_leaderboard benchmarks show that this fine-tuned model generally outperforms the base meta-llama/Llama-2-13b-hf model across several key metrics:

  • Average Score: 58.49 (compared to 56.9 for base Llama-2-13b)
  • ARC: 59.73
  • HellaSwag: 81.06
  • MMLU: 54.53
  • TruthfulQA: 38.64
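The reported average is simply the arithmetic mean of the four benchmark scores, which can be checked directly:

```python
# Scores from the Open LLM Leaderboard evaluation above
scores = {"ARC": 59.73, "HellaSwag": 81.06, "MMLU": 54.53, "TruthfulQA": 38.64}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 58.49, matching the reported average
```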

These results indicate improvements over the original Llama-2-13b in reasoning (ARC), commonsense inference (HellaSwag), broad knowledge (MMLU), and truthfulness (TruthfulQA).

Recommended Use Cases

This model is well suited to applications that need stronger general language understanding and generation than the base Llama-2-13b provides. Its fine-tuning on a diverse dataset suggests applicability to conversational AI, text summarization, and question-answering scenarios.