CharlesLi/llama_2_llama_2_code_math_5_full

Text Generation

  • Model Size: 7B
  • Quantization: FP8
  • Context Length: 4k
  • Published: Jan 19, 2025
  • License: llama2
  • Architecture: Transformer (open weights)

CharlesLi/llama_2_llama_2_code_math_5_full is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was trained on a dataset identified in the training logs only as the "generator" dataset and reached a validation loss of 0.5808 on the evaluation set. The model is intended for applications that benefit from specialized fine-tuning on top of a Llama 2 chat base.


Model Overview

This model, llama_2_llama_2_code_math_5_full, is a fine-tuned variant of Meta's Llama-2-7b-chat-hf. It has 7 billion parameters and was adapted on the "generator" dataset; no further description of that dataset is published. Fine-tuning achieved a validation loss of 0.5808 on the evaluation set.
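As a quick orientation, the sketch below shows one way to load and query the checkpoint with Hugging Face transformers. The repository id comes from this card; the dtype and device settings are assumptions chosen to fit a 7B model on a single GPU, not published defaults.

```python
# Minimal loading sketch using Hugging Face transformers.
# Repo id is from this card; dtype/device choices are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "CharlesLi/llama_2_llama_2_code_math_5_full"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # assumption: half precision to fit a ~16 GB GPU
    device_map="auto",
)

prompt = "Write a Python function that returns the nth Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```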

Training Details

The model was trained using the following key hyperparameters:

  • Learning Rate: 2e-05
  • Batch Size: 32 (total train batch size)
  • Optimizer: Adam with default betas and epsilon
  • LR Scheduler: Cosine with 0.1 warmup ratio
  • Epochs: 1
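These settings map directly onto Hugging Face TrainingArguments. The sketch below is a hypothetical reconstruction of the reported configuration, not the author's actual training script; the per-device batch size, output directory, and logging settings are placeholders.

```python
# Hypothetical reconstruction of the reported hyperparameters with
# Hugging Face TrainingArguments; the actual training script is unpublished.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_2_code_math_5_full",  # placeholder path
    learning_rate=2e-5,              # reported learning rate
    per_device_train_batch_size=4,   # assumption: 4 per device x 8 GPUs = 32 total
    num_train_epochs=1,              # reported epoch count
    lr_scheduler_type="cosine",      # reported scheduler
    warmup_ratio=0.1,                # reported warmup ratio
    optim="adamw_torch",             # Adam variant with default betas/epsilon
    logging_steps=10,                # placeholder logging cadence
    save_strategy="epoch",
)
```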

Intended Use

Specific intended uses and limitations are not documented. Its chat-optimized Llama 2 base and the "code_math" naming suggest conversational use on code- and math-related tasks, but users should evaluate the model themselves before relying on it for any particular use case.
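Because the base model, Llama-2-7b-chat-hf, expects the Llama 2 chat prompt format, that format is a reasonable starting point when evaluating this fine-tune. Whether the fine-tuning preserved the format is an assumption to verify empirically; the helper below is purely illustrative.

```python
# Sketch of the Llama 2 chat prompt format used by the base model.
# Whether this fine-tune still expects it is an assumption to test.
def llama2_chat_prompt(system: str, user: str) -> str:
    # Note: most tokenizers prepend the <s> BOS token automatically,
    # in which case it should be omitted from the string below.
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = llama2_chat_prompt(
    system="You are a helpful assistant for code and math questions.",
    user="What is the sum of the first 100 positive integers?",
)
```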