CHIH-HUNG/llama-2-13b-dolphin_5w
CHIH-HUNG/llama-2-13b-dolphin_5w is a 13 billion parameter language model fine-tuned by CHIH-HUNG on the Meta Llama 2 architecture. It was trained using the first 50,000 entries of the ehartford/dolphin dataset, focusing on instruction-following tasks. This model demonstrates improved performance across benchmarks like ARC, HellaSwag, MMLU, and TruthfulQA compared to its base Llama-2-13b counterpart, making it suitable for general conversational and question-answering applications.
Model Overview
CHIH-HUNG/llama-2-13b-dolphin_5w is a 13 billion parameter language model built upon the Meta Llama 2 architecture. It has been fine-tuned by CHIH-HUNG using a subset of the ehartford/dolphin dataset, specifically the first 50,000 entries, to enhance its instruction-following capabilities.
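As a quick-start sketch (not part of the original card), the checkpoint can be loaded with Hugging Face transformers. Only the repository id comes from this card; the loading arguments are common defaults and should be adjusted to your hardware:

```python
# Minimal loading sketch for the model card's checkpoint.
# Assumes `transformers` and `accelerate` are installed; downloads ~26 GB of weights.
MODEL_ID = "CHIH-HUNG/llama-2-13b-dolphin_5w"

def load_model(model_id: str = MODEL_ID):
    """Return (tokenizer, model); import is deferred so the constant is usable standalone."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the checkpoint's native precision
        device_map="auto",    # spread layers across available devices
    )
    return tokenizer, model
```

`device_map="auto"` lets accelerate place the 13B weights across GPU/CPU memory automatically, which is usually necessary for a model of this size on a single consumer GPU.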
Fine-Tuning Details
The model was fine-tuned with LoRA (rank 8) applied to the q_proj and v_proj attention projections, using a learning rate of 5e-5 for 1 epoch. Training ran on a single NVIDIA RTX 4090 GPU with bf16 precision and a 4-bit quantized base model, reaching a train_loss of 0.8799 over a runtime of 7 hours and 11 minutes.
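The reported hyperparameters can be gathered into plain mappings, e.g. to feed into a `peft.LoraConfig` and trainer of your choice. The values below are taken from this card; the key names mirror common peft/transformers argument names and are otherwise assumptions:

```python
# Hyperparameters reported on the card, keyed by the usual
# peft.LoraConfig / transformers.TrainingArguments argument names (assumed).
lora_hparams = {
    "r": 8,                                  # LoRA rank
    "target_modules": ["q_proj", "v_proj"],  # attention projections adapted
}
training_hparams = {
    "learning_rate": 5e-5,
    "num_train_epochs": 1,
    "bf16": True,          # bf16 precision, per the card
    "load_in_4bit": True,  # 4-bit quantized base model during training
}
```

Values not listed on the card (e.g. lora_alpha, dropout, batch size) are unknown and intentionally omitted.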
Performance Benchmarks
Evaluated with the HuggingFaceH4/open_llm_leaderboard suite, CHIH-HUNG/llama-2-13b-dolphin_5w performs competitively across several benchmarks relative to the base Llama-2-13b and other dolphin-tuned models:
- Average Score: 61.0 (highest among compared models)
- ARC: 60.67
- HellaSwag: 82.69
- MMLU: 56.23
- TruthfulQA: 44.41
The model notably surpasses both meta-llama/Llama-2-13b-hf and meta-llama/Llama-2-13b-chat-hf in average score and on individual benchmarks, indicating improved general reasoning and instruction adherence.
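As a quick sanity check, the reported average is simply the arithmetic mean of the four benchmark scores listed above:

```python
# Recompute the leaderboard average from the four reported scores.
scores = {"ARC": 60.67, "HellaSwag": 82.69, "MMLU": 56.23, "TruthfulQA": 44.41}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 61.0
```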
Recommended Use Cases
This model is well-suited for applications requiring robust instruction following and general conversational AI, particularly where a balance between performance and computational efficiency for a 13B parameter model is desired. Its strong benchmark results suggest proficiency in common language understanding and generation tasks.
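The card does not document a prompt template. Dolphin-style instruction data is commonly formatted Alpaca-style, so the helper below is an assumption for illustration, not the model's documented format:

```python
# Hypothetical Alpaca-style prompt builder; the actual template used during
# fine-tuning is not stated on the model card.
def build_prompt(instruction: str, user_input: str = "") -> str:
    """Format an instruction (and optional input context) for generation."""
    if user_input:
        return (
            "### Instruction:\n" + instruction +
            "\n\n### Input:\n" + user_input +
            "\n\n### Response:\n"
        )
    return "### Instruction:\n" + instruction + "\n\n### Response:\n"
```

If outputs look degraded, try the base Llama 2 chat template or raw completion prompting instead, since the true training format is unconfirmed.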