Overview
This model, CHIH-HUNG/llama-2-13b-FINETUNE5_4w-r4-q_k_v_o, is a 13-billion-parameter variant of the Llama-2 architecture. It was fine-tuned by CHIH-HUNG with LoRA (Low-Rank Adaptation) on the huangyt/FINETUNE5 dataset, which contains approximately 40,000 training examples. Training ran on an RTX 4090 GPU with a LoRA rank of 4 (the r4 in the model name) applied to the q_proj, k_proj, v_proj, and o_proj projection layers, a learning rate of 4e-4, and one epoch in bf16 precision with 4-bit quantization.
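A minimal sketch of an equivalent setup using the Hugging Face peft library's LoraConfig. The rank and target modules come from the card above; lora_alpha and lora_dropout are illustrative assumptions, since the card does not state them:

```python
from peft import LoraConfig, TaskType

# LoRA configuration mirroring the card's stated hyperparameters.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=4,               # low-rank dimension (the "r4" suffix in the model name)
    lora_alpha=16,     # assumed scaling factor, not confirmed by the card
    lora_dropout=0.05, # assumed dropout, not confirmed by the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
# Per the card, training then used lr=4e-4 for one epoch, bf16 compute,
# and 4-bit quantized base weights (QLoRA-style).
```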
Performance Benchmarks
The model's performance was evaluated against the base Llama-2-13b across four standard benchmarks, with scores measured locally using 8-bit quantization:
- Average Score: 56.09
- ARC: 54.35
- HellaSwag: 79.24
- MMLU: 54.01
- TruthfulQA: 36.75
These results reflect its capabilities in reasoning (ARC), commonsense inference (HellaSwag), broad knowledge (MMLU), and truthfulness (TruthfulQA). The fine-tuning run, accelerated with DeepSpeed, reached a train_loss of 0.579 in roughly 4 hours and 6 minutes.
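The reported average is simply the arithmetic mean of the four benchmark scores; a quick check:

```python
# Benchmark scores from the card (measured locally with 8-bit quantization).
scores = {
    "ARC": 54.35,
    "HellaSwag": 79.24,
    "MMLU": 54.01,
    "TruthfulQA": 36.75,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 56.09, matching the reported average
```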
Key Differentiators
- Targeted Fine-tuning: Specialized LoRA fine-tuning on a custom dataset (huangyt/FINETUNE5) for potentially improved performance on tasks aligned with the dataset's content.
- Efficient Training: Utilizes LoRA with 4-bit quantization and bf16 precision, making it efficient for deployment and further fine-tuning on consumer-grade hardware.
Recommended Use Cases
This model is suitable for applications requiring a Llama-2-13b base with enhanced performance on tasks similar to those found in the huangyt/FINETUNE5 dataset. It can be used for general text generation, question answering, and language-understanding tasks where a balance between model size and quality is desired.
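A minimal inference sketch with transformers, assuming the repository hosts merged full weights loadable via the standard AutoModelForCausalLM API (if it instead publishes only the LoRA adapter, it would be loaded with peft.PeftModel on top of the Llama-2-13b base). Note that a 13B model requires substantial disk space and GPU memory:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the repo contains merged weights, not just an adapter.
model_id = "CHIH-HUNG/llama-2-13b-FINETUNE5_4w-r4-q_k_v_o"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bf16 training precision
    device_map="auto",
)

prompt = "Question: What is low-rank adaptation?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```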