Xinging/llama2-13b_sft_0.1_ratio_alpaca_gpt4_proj_by_human_eval_ntrain_378
The Xinging/llama2-13b_sft_0.1_ratio_alpaca_gpt4_proj_by_human_eval_ntrain_378 model is a 13-billion-parameter language model fine-tuned from Meta's Llama-2-13b-hf. It was fine-tuned on the 0.1_ratio_alpaca_gpt4_proj_by_human_eval_ntrain_378 dataset, whose name suggests a focus on instruction following and possibly code-related tasks. The model is intended for applications that need a Llama 2-based instruction-tuned model with a 4096-token context length.
Model Overview
Xinging/llama2-13b_sft_0.1_ratio_alpaca_gpt4_proj_by_human_eval_ntrain_378 is a 13-billion-parameter language model derived from the meta-llama/Llama-2-13b-hf base model. It was fine-tuned with supervised fine-tuning (SFT) on the 0.1_ratio_alpaca_gpt4_proj_by_human_eval_ntrain_378 dataset, indicating an instruction-following specialization; the human_eval component of the dataset name hints at a connection to HumanEval-style code tasks, though the README does not confirm this.
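For context, a minimal sketch of loading the checkpoint with the Hugging Face transformers library. The repo id is taken from the model name; `device_map="auto"` assumes the accelerate package is installed and that enough GPU memory is available for a 13B model.

```python
def load_model(model_id: str):
    """Load tokenizer and model weights from the Hugging Face Hub.

    Imports are deferred so this sketch stays lightweight; actually
    calling it requires the transformers, torch, and accelerate
    packages, plus roughly 26 GB of memory for 13B parameters in
    16-bit precision.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # keep the checkpoint's dtype (fp16/bf16)
        device_map="auto",    # shard across available GPUs
    )
    return tokenizer, model


MODEL_ID = "Xinging/llama2-13b_sft_0.1_ratio_alpaca_gpt4_proj_by_human_eval_ntrain_378"
```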
Training Details
The model was trained with the following key hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 16 (train), 8 (eval)
- Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- Scheduler: Cosine learning rate scheduler with a 0.03 warmup ratio
- Epochs: 1.0
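The scheduler settings above can be made concrete with a small sketch. Assuming the standard linear-warmup-then-cosine-decay shape (as in Hugging Face's get_cosine_schedule_with_warmup), the learning rate ramps linearly to 2e-05 over the first 3% of steps and then decays to zero. The step counts below are illustrative, not taken from the README.

```python
import math


def lr_at_step(step: int, total_steps: int, base_lr: float = 2e-5,
               warmup_ratio: float = 0.03) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over warmup_steps.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Illustrative: one epoch over a hypothetical 1000 optimizer steps.
total = 1000
print(lr_at_step(0, total))     # 0.0 (start of warmup)
print(lr_at_step(30, total))    # 2e-05 (warmup complete, peak LR)
print(lr_at_step(total, total)) # ~0.0 (end of cosine decay)
```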
Intended Use Cases
This model is suitable for applications that benefit from a Llama 2-based instruction-tuned model, particularly those requiring 13 billion parameters and a 4096-token context window. Its fine-tuning dataset suggests potential strengths in instruction following and possibly code-centric prompts, though the README provides no benchmark results to confirm this.
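Since the fine-tuning data derives from Alpaca-GPT4, prompts in the standard Alpaca instruction format are a reasonable starting point. The template below is the common Alpaca format, assumed here rather than confirmed by the README.

```python
# Assumed Alpaca-style prompt template (not confirmed by the model README).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Format a user instruction with the (assumed) Alpaca template."""
    return ALPACA_TEMPLATE.format(instruction=instruction)


prompt = build_prompt("Write a Python function that reverses a string.")
```

The formatted prompt would then be tokenized and passed to the model's generate method, keeping the total prompt plus generated length within the 4096-token context window.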