wei123602/llama2-13b-fintune2

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4K · Published: Sep 4, 2023 · License: llama2 · Architecture: Transformer · Open Weights · Warm

The wei123602/llama2-13b-fintune2 model is a 13-billion-parameter, Llama-2-based language model fine-tuned by wei123602. It was trained on the huangyt/FINETUNE2 dataset of approximately 30,000 samples, using a methodology similar to Platypus. The model targets general language understanding and generation tasks, building on the Llama-2 architecture.


Model Overview

This model, wei123602/llama2-13b-fintune2, is a 13-billion-parameter language model built on the meta-llama/Llama-2-13b-hf base. It was fine-tuned by wei123602 on the huangyt/FINETUNE2 dataset, which contains approximately 30,000 training entries. Fine-tuning used LoRA (Low-Rank Adaptation) with a rank of 16, an alpha of 8, and a dropout of 0.05, targeting the gate_proj, up_proj, and down_proj modules.
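For reference, the LoRA setup described above maps onto a PEFT configuration roughly as follows. This is a hedged sketch reconstructed from the parameters listed in this card, not the author's actual training script.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# LoRA hyperparameters as reported in this card: rank 16, alpha 8, dropout 0.05,
# applied to the MLP projections of each transformer block.
lora_config = LoraConfig(
    r=16,
    lora_alpha=8,
    lora_dropout=0.05,
    target_modules=["gate_proj", "up_proj", "down_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction of the 13B weights train
```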

Training Details

The model was trained for 1 epoch on a single RTX 4090 GPU, reaching a train_loss of 0.0823 over a train_runtime of 2 hours and 40 minutes. Key training parameters included a per_device_train_batch_size of 8, gradient_accumulation_steps of 8 (an effective batch size of 64), and a learning_rate of 4e-4. The cutoff_length for input sequences was 2048 tokens, and training was performed in bf16 precision.
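Expressed as transformers TrainingArguments, these hyperparameters would look roughly like the sketch below; the output directory and logging cadence are illustrative placeholders, not values from the original run.

```python
from transformers import TrainingArguments

# Hyperparameters as reported in this card. The 2048-token cutoff_length is
# applied at tokenization time, not here.
training_args = TrainingArguments(
    output_dir="./llama2-13b-fintune2",  # placeholder path
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,       # effective batch size of 8 * 8 = 64
    learning_rate=4e-4,
    bf16=True,
    logging_steps=10,                    # illustrative, not reported in the card
)
```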

Evaluation

Evaluation results for this specific model are currently pending on the Open LLM Leaderboard (HuggingFaceH4/open_llm_leaderboard). However, the README provides comparative benchmarks for other Llama-2-13b variants on ARC, HellaSwag, MMLU, and TruthfulQA, indicating the general performance range for models in this family. The fine-tuning aims to improve performance on general language tasks through training on the targeted FINETUNE2 dataset.

Popular Sampler Settings

Featherless tracks the top 3 parameter combinations its users run with this model. The specific values vary per configuration, but they draw on the following sampler parameters (see the example request after the list):

- temperature
- top_p
- top_k
- frequency_penalty
- presence_penalty
- repetition_penalty
- min_p
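To show how these parameters map onto an actual request, here is a hedged sketch using the OpenAI-compatible Python client against a Featherless-style endpoint. The base URL, API key, and every sampler value below are placeholders for illustration, not the actual top user configurations.

```python
from openai import OpenAI

# Assumed OpenAI-compatible endpoint; base_url and api_key are placeholders.
client = OpenAI(base_url="https://api.featherless.ai/v1", api_key="YOUR_API_KEY")

response = client.completions.create(
    model="wei123602/llama2-13b-fintune2",
    prompt="Explain LoRA fine-tuning in one paragraph.",
    max_tokens=256,
    temperature=0.7,          # illustrative values throughout
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # top_k, min_p, and repetition_penalty are not part of the core OpenAI
    # schema, so OpenAI-compatible servers typically accept them via extra_body.
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.1},
)
print(response.choices[0].text)
```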