Model Overview
CHIH-HUNG/llama-2-13b-dolphin_20w is a 13-billion-parameter language model developed by CHIH-HUNG. It is a fine-tuned version of the meta-llama/Llama-2-13b-hf base model, trained on a subset of the ehartford/dolphin dataset. Training used the first 200,000 entries of that dataset to strengthen the model's instruction-following capabilities.
Fine-Tuning Details
The model was fine-tuned using LoRA (Low-Rank Adaptation) with a rank of 8, targeting the q_proj and v_proj layers. Training ran for a single epoch with a learning rate of 5e-5, using bf16 precision and 4-bit quantization for efficiency. The run took approximately 28 hours and reached a final training loss of 0.8354.
Performance Benchmarks
Evaluations were conducted against the HuggingFaceH4/open_llm_leaderboard, comparing the model's performance with the base Llama-2-13b and other Dolphin-trained models across four key benchmarks:
- ARC
- HellaSwag
- MMLU
- TruthfulQA
The CHIH-HUNG/llama-2-13b-dolphin_20w model achieved an average score of 60.17, with 59.56 on ARC, 82.55 on HellaSwag, 55.89 on MMLU, and 42.67 on TruthfulQA. The HellaSwag and ARC results in particular indicate strong common-sense reasoning and language understanding.
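As a quick sanity check, the reported average is the unweighted mean of the four benchmark scores:

```python
# Verify that the reported average matches the mean of the four scores.
scores = {
    "ARC": 59.56,
    "HellaSwag": 82.55,
    "MMLU": 55.89,
    "TruthfulQA": 42.67,
}

average = sum(scores.values()) / len(scores)
print(f"{average:.4f}")  # approximately 60.1675, i.e. 60.17 as reported
```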
Potential Use Cases
This model is well-suited for applications requiring:
- Instruction-following and conversational AI: Its training on the Dolphin dataset emphasizes responding to instructions.
- General text generation: Capable of producing coherent and contextually relevant text.
- Question answering: Performance on MMLU and TruthfulQA suggests good knowledge recall and reasoning abilities.