CharlesLi/llama_2_cot_simplest_alpaca_4_3_epoch_full

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Context Length: 4k | Published: Jan 21, 2025 | License: llama2 | Architecture: Transformer | Open Weights

CharlesLi/llama_2_cot_simplest_alpaca_4_3_epoch_full is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It was trained for 3 epochs on a dataset identified only as "generator", reaching a validation loss of 1.0590. The model is intended for general conversational tasks and retains the Llama 2 architecture's 4096-token context length.


Model Overview

This model, llama_2_cot_simplest_alpaca_4_3_epoch_full, is a fine-tuned variant of the meta-llama/Llama-2-7b-chat-hf base model. It has 7 billion parameters and was trained for 3 epochs on the "generator" dataset. Training used a learning rate of 2e-05, an effective batch size of 32 (reached via gradient accumulation), and the Adam optimizer.

Training Details

The model was trained on a multi-GPU setup (4 devices) with a cosine learning-rate scheduler and a warmup ratio of 0.1. The final validation loss on the evaluation set was 1.0590. Training used Transformers 4.44.2, PyTorch 2.4.1+cu121, Datasets 3.0.0, and Tokenizers 0.19.1.

Key Characteristics

  • Base Model: Fine-tuned from meta-llama/Llama-2-7b-chat-hf.
  • Parameter Count: 7 billion parameters.
  • Training Epochs: 3 epochs on the "generator" dataset.
  • Validation Loss: 1.0590 on the evaluation set.

Intended Use Cases

Given its fine-tuning on the "generator" dataset, this model is best suited to text generation and conversational AI tasks, building on the instruction-following capabilities of its Llama-2-7b-chat-hf base. A short inference sketch follows.