CharlesLi/llama_2_llama_2_code_math_1_full

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4K · Published: Jan 19, 2025 · License: llama2 · Architecture: Transformer · Open weights

The CharlesLi/llama_2_llama_2_code_math_1_full model is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. The fine-tuning targets the task distribution of its training dataset (the repository name suggests a code and math focus), and the model reaches a loss of 0.8356 on its evaluation set. It is intended for applications that want a Llama 2-based model with this particular fine-tuning focus.


Model Overview

CharlesLi/llama_2_llama_2_code_math_1_full is a 7-billion-parameter language model derived from the meta-llama/Llama-2-7b-chat-hf base model. It was fine-tuned on a dataset referred to as "generator" and reaches a reported loss of 0.8356 on its evaluation set.
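
Since the checkpoint is published as open weights, it should load like any other Llama 2 derivative through Hugging Face Transformers. The sketch below is illustrative rather than taken from the card: the repository id is the model's, but the dtype and device placement (which also assumes the accelerate package is installed) are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CharlesLi/llama_2_llama_2_code_math_1_full"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: fp16 keeps the 7B weights around 14 GB
    device_map="auto",          # assumption: let accelerate place layers on available devices
)
```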

Training Details

The model was trained with the following key hyperparameters; a TrainingArguments sketch reproducing them follows the list:

  • Learning Rate: 2e-05
  • Batch Sizes: train_batch_size of 4, eval_batch_size of 4
  • Gradient Accumulation: 2 steps, giving a total_train_batch_size of 32 (4 per device × 4 GPUs × 2 accumulation steps)
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • LR Scheduler: cosine, with a warmup ratio of 0.1
  • Epochs: 1
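
For reference, the values above map onto a Hugging Face TrainingArguments configuration roughly as follows. This is a hedged reconstruction: only the listed hyperparameters come from the card, while the output directory and the surrounding training script are hypothetical.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_2_code_math_1_full",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,  # 4 per device x 4 GPUs x 2 steps = 32 effective batch
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```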

Training used a multi-GPU setup with 4 devices. Framework versions: Transformers 4.44.2, PyTorch 2.4.1+cu121, Datasets 3.0.0, and Tokenizers 0.19.1.
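
When reproducing or debugging training behaviour, it can help to confirm that the local environment roughly matches the reported stack. A minimal check, assuming the four packages are installed:

```python
import datasets
import tokenizers
import torch
import transformers

# Compare against the versions reported in the model card.
print("transformers:", transformers.__version__)  # card: 4.44.2
print("pytorch:    ", torch.__version__)          # card: 2.4.1+cu121
print("datasets:   ", datasets.__version__)       # card: 3.0.0
print("tokenizers: ", tokenizers.__version__)     # card: 0.19.1
```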

Intended Use

The model card does not document specific intended uses or limitations. In general, the model is suited to tasks that align with its Llama 2 chat foundation and the focus of its fine-tuning data. Developers should weigh the base architecture and the training details above when judging whether the model fits a particular use case.
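
As a usage illustration, the snippet below continues from the loading sketch in the Model Overview. Because the base model is Llama-2-7b-chat-hf, it assumes the standard Llama 2 [INST] ... [/INST] chat format; the fine-tuning data may expect a different prompt template, so treat this as a starting point rather than the documented interface.

```python
# Continues from the tokenizer/model objects created in the loading sketch above.
prompt = "[INST] Write a Python function that returns the n-th Fibonacci number. [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```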