CharlesLi/llama_2_llama_2_alpaca_1_full
The CharlesLi/llama_2_llama_2_alpaca_1_full model is a 7 billion parameter language model fine-tuned from Meta's meta-llama/Llama-2-7b-chat-hf. It was adapted on a dataset identified only as "generator" and reached a loss of 1.3132 on its evaluation set. The model is intended for tasks that benefit from this fine-tuning, though its specific use cases and limitations are not yet documented.
Model Overview
CharlesLi/llama_2_llama_2_alpaca_1_full is a 7 billion parameter language model derived from the meta-llama/Llama-2-7b-chat-hf architecture. It was fine-tuned on a dataset identified in the training configuration only as "generator". During training, it achieved a loss of 1.3132 on the evaluation set, suggesting a degree of proficiency on the tasks it was fine-tuned for.
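Because the model is derived from the Llama-2 chat checkpoint, it can presumably be loaded with the standard transformers API. The sketch below assumes the fine-tune kept the base model's `[INST] ... [/INST]` chat prompt convention; the model card does not confirm the expected prompt format, so treat the template and generation settings as assumptions.

```python
def build_llama2_prompt(user_message: str, system_message: str = "") -> str:
    """Format a prompt in the standard Llama-2 chat style.

    Assumption: the fine-tune kept the base model's chat template;
    the model card does not state the expected prompt format.
    """
    if system_message:
        return f"[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n{user_message} [/INST]"
    return f"[INST] {user_message} [/INST]"


def generate(prompt: str) -> str:
    # Heavyweight part: requires the transformers and torch packages
    # plus access to the model weights on the Hugging Face Hub.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "CharlesLi/llama_2_llama_2_alpaca_1_full"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    inputs = tokenizer(build_llama2_prompt(prompt), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Summarize the Alpaca instruction-tuning approach."))
```

The prompt builder is kept separate from the model call so the formatting convention can be checked or swapped without loading the 7B weights.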
Training Details
The model was trained using the following key hyperparameters:
- Learning Rate: 2e-05
- Batch Size: A total training batch size of 32 (`train_batch_size: 4` and `gradient_accumulation_steps: 2` across 4 GPUs).
- Optimizer: Adam with standard betas and epsilon.
- Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
- Epochs: Trained for 1 epoch.
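The hyperparameters above map naturally onto Hugging Face `TrainingArguments` field names. The sketch below records them that way and checks that the per-device batch size, gradient accumulation steps, and GPU count multiply out to the reported total of 32; the field names and `adamw_torch` optimizer choice are assumptions about how the original run was configured, not documented facts.

```python
# Hyperparameters reported on the model card, expressed with the
# corresponding Hugging Face TrainingArguments field names.
# (The exact launch configuration of the original run is an assumption.)
training_config = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 4,   # train_batch_size: 4
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 1,
    "optim": "adamw_torch",             # Adam with default betas/epsilon
}

NUM_GPUS = 4  # reported multi-GPU setup


def effective_batch_size(cfg: dict, num_gpus: int) -> int:
    """Total examples consumed per optimizer step."""
    return (cfg["per_device_train_batch_size"]
            * cfg["gradient_accumulation_steps"]
            * num_gpus)


print(effective_batch_size(training_config, NUM_GPUS))  # → 32
```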
Current Status
Specific details regarding the model's intended uses, limitations, and the exact nature of its training and evaluation data are not yet fully documented. Developers should consider this when evaluating its suitability for particular applications.