CharlesLi/llama_2_llama_2_code_math_4_full

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 7B
  • Quantization: FP8
  • Context Length: 4k
  • Published: Jan 19, 2025
  • License: llama2
  • Architecture: Transformer
  • Weights: Open

The CharlesLi/llama_2_llama_2_code_math_4_full model is a 7-billion-parameter variant of Llama-2-7b-chat-hf, fine-tuned by CharlesLi for tasks that require code and mathematical reasoning. Building on the Llama 2 architecture, it targets specialized domains where precise logical and computational understanding is critical.


Model Overview

CharlesLi/llama_2_llama_2_code_math_4_full is a 7-billion-parameter language model fine-tuned from the meta-llama/Llama-2-7b-chat-hf base model. Developed by CharlesLi, this iteration focuses on improving performance in specific domains, particularly code and mathematical reasoning. The model was trained on a dataset identified only as "generator" and achieved a reported loss of 0.6615 on its evaluation set.
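If the checkpoint follows standard Hugging Face conventions for Llama 2 fine-tunes, it should load with the stock transformers API. A minimal sketch, assuming the repository id above is available on the Hub and that fp16 weights are acceptable (the FP8 figure above likely describes the hosted deployment rather than the checkpoint itself):

```python
# Minimal loading sketch; assumes the checkpoint is hosted on the
# Hugging Face Hub under the repository id shown on this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CharlesLi/llama_2_llama_2_code_math_4_full"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: fp16 is sufficient for the 7B footprint
    device_map="auto",          # place layers across available GPUs/CPU
)
```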

Key Training Details

The fine-tuning run used the following hyperparameters, mirrored in the configuration sketch after this list:

  • Learning Rate: 2e-05
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Batch Sizes: train_batch_size of 4, eval_batch_size of 4, leading to a total_train_batch_size of 32 and total_eval_batch_size of 16 (with 4 devices and 2 gradient accumulation steps).
  • Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1.
  • Epochs: Trained for 1 epoch.
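For readers who want to reproduce or adapt the run, the settings above map directly onto a transformers TrainingArguments object. A minimal sketch, assuming a Hugging Face Trainer-based setup launched across 4 devices; the output_dir is a hypothetical placeholder:

```python
# Sketch of a TrainingArguments configuration mirroring the reported
# hyperparameters; the 4-device launch (e.g. via torchrun or accelerate)
# is assumed, not stated on the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_2_code_math_4_full",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=4,  # x 4 devices x 2 accumulation steps = 32 total
    per_device_eval_batch_size=4,   # x 4 devices = 16 total
    gradient_accumulation_steps=2,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```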

Intended Use Cases

The developer has not published detailed intended uses or limitations. However, the fine-tuning on a "generator" dataset and the model's name suggest an orientation toward tasks that benefit from enhanced generation capabilities in technical or analytical contexts, such as code generation and mathematical problem-solving. The model is worth considering for applications where a Llama 2 base with specialized reasoning improvements is beneficial.
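Because the base model is Llama-2-7b-chat-hf, prompts presumably follow the Llama 2 chat convention. A generation sketch, reusing the model and tokenizer from the loading example above; the [INST] wrapper and the sample math prompt are illustrative assumptions:

```python
# Generation sketch; assumes the model/tokenizer objects from the loading
# example and Llama 2's [INST] chat prompt convention.
prompt = "[INST] Solve for x: 3x + 7 = 22. Show your steps. [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```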