CharlesLi/llama_3_alpaca_llama_2
Model Overview
CharlesLi/llama_3_alpaca_llama_2 is an 8-billion-parameter language model fine-tuned from the meta-llama/Llama-3.1-8B-Instruct base model. Fine-tuning ran for 30 steps with a learning rate of 0.0002 and a total batch size of 16 split across two GPUs, reaching a final validation loss of 0.8428.
Training Details
- Base Model: meta-llama/Llama-3.1-8B-Instruct
- Parameters: 8 billion
- Learning Rate: 0.0002
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Scheduler: Cosine with 0.1 warmup ratio
- Training Steps: 30
- Final Validation Loss: 0.8428
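The training script itself is not published. As a rough guide, the hyperparameters above map onto Hugging Face `TrainingArguments` roughly as in the sketch below; the output directory and the per-device batch split are assumptions, not reported values.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters. The actual
# training script is not published, so output_dir and the per-device batch
# split (8 per device x 2 GPUs = total batch size 16) are assumptions.
training_args = TrainingArguments(
    output_dir="llama_3_alpaca_llama_2",  # assumed
    max_steps=30,                         # reported: 30 training steps
    learning_rate=2e-4,                   # reported: 0.0002
    per_device_train_batch_size=8,        # assumed split; 8 x 2 GPUs = 16 total
    lr_scheduler_type="cosine",           # reported scheduler
    warmup_ratio=0.1,                     # reported warmup ratio
    adam_beta1=0.9,                       # reported Adam betas
    adam_beta2=0.999,
    adam_epsilon=1e-8,                    # reported epsilon
)
```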
Intended Use
This model is suited to general instruction-following tasks, building on the capabilities inherited from the Llama-3.1-8B-Instruct base. No task-specific evaluations, use cases, or limitations are documented, so it should be treated as a general-purpose instruction model and validated against your own tasks before deployment.
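A minimal inference sketch using the transformers pipeline API is shown below. It assumes the fine-tune preserves the Llama-3.1-Instruct chat template, which the card does not confirm, and that your hardware supports bfloat16.

```python
import torch
from transformers import pipeline

# Minimal inference sketch; assumes the fine-tune keeps the standard
# Llama-3.1-Instruct chat template (not confirmed by the model card).
generator = pipeline(
    "text-generation",
    model="CharlesLi/llama_3_alpaca_llama_2",
    torch_dtype=torch.bfloat16,  # assumed; use float16/float32 as needed
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Summarize the Alpaca instruction-tuning approach in two sentences."},
]
output = generator(messages, max_new_tokens=128)
# With chat-style input, generated_text holds the full conversation;
# the last message is the model's reply.
print(output[0]["generated_text"][-1]["content"])
```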