CharlesLi/llama_2_llama_2_code_math_5_full
The CharlesLi/llama_2_llama_2_code_math_5_full model is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. As its name suggests, the fine-tuning appears to target code and math tasks; the training data is identified only as a "generator" dataset, and the run reached a validation loss of 0.5808. The model is intended for applications that benefit from specialized performance on top of its Llama 2 base.
Model Overview
This model, llama_2_llama_2_code_math_5_full, is a fine-tuned variant of Meta's Llama-2-7b-chat-hf. It has 7 billion parameters and was adapted on a dataset identified in the training metadata only as "generator". The fine-tuning run achieved a validation loss of 0.5808 on the evaluation set.
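As a rough illustration, the checkpoint should load like any other Hugging Face causal LM. This is a minimal sketch, assuming the repository ships standard transformers weights and a tokenizer; the dtype and device settings are illustrative choices, not part of the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CharlesLi/llama_2_llama_2_code_math_5_full"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: half precision to fit a 7B model on one GPU
    device_map="auto",          # assumption: requires the accelerate package
)
```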
Training Details
The model was trained using the following key hyperparameters (a configuration sketch reproducing them follows the list):
- Learning Rate: 2e-05
- Batch Size: 32 (total train batch size)
- Optimizer: Adam with default betas and epsilon
- LR Scheduler: Cosine with 0.1 warmup ratio
- Epochs: 1
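For reference, these hyperparameters map onto a transformers TrainingArguments configuration roughly as follows. This is a hedged sketch: the per-device batch size and gradient accumulation split, the exact optimizer variant, and the output directory are assumptions; only the aggregate values above come from the model card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_2_llama_2_code_math_5_full",  # assumption: illustrative path
    learning_rate=2e-5,                 # reported learning rate
    per_device_train_batch_size=4,      # assumption: 4 per device
    gradient_accumulation_steps=8,      # assumption: 4 x 8 = total train batch size 32
    lr_scheduler_type="cosine",         # reported cosine schedule
    warmup_ratio=0.1,                   # reported warmup ratio
    num_train_epochs=1,                 # reported single epoch
    optim="adamw_torch",                # Adam-family optimizer with default betas/epsilon
)
```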
Intended Use
The model card does not spell out intended uses or limitations. Because the base model, Llama-2-7b-chat-hf, is chat-optimized, and the repository name references code and math, plausible applications include conversational assistance on coding and mathematical tasks (a usage sketch follows). Users should evaluate the model on their own workloads before relying on it.
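Building on the loading snippet above, a chat-style prompt can be formatted with the tokenizer's chat template. This assumes the tokenizer ships Llama 2's standard chat template; the prompt and generation settings are illustrative:

```python
# Reuses `model` and `tokenizer` from the loading sketch above.
messages = [
    {"role": "user", "content": "Write a Python function that returns n factorial."}
]
# assumption: the tokenizer includes Llama 2's chat template
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```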