CharlesLi/llama_2_cot_simplest_code_math_4_full

Text Generation · Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Jan 20, 2025 · License: llama2 · Architecture: Transformer · Open Weights

CharlesLi/llama_2_cot_simplest_code_math_4_full is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. The model is optimized for reasoning and mathematical tasks, reaching a loss of 0.6062 on its evaluation set, and is designed for applications that require robust logical inference and numerical problem-solving within a 4096-token context window.


Model Overview

CharlesLi/llama_2_cot_simplest_code_math_4_full is a 7-billion-parameter language model derived from Meta's Llama-2-7b-chat-hf. It has been fine-tuned to improve performance on reasoning and mathematical tasks, as indicated by a loss of 0.6062 on its evaluation set. A minimal loading example follows the characteristics list below.

Key Characteristics

  • Base Model: Fine-tuned from meta-llama/Llama-2-7b-chat-hf.
  • Parameter Count: 7 billion parameters.
  • Context Length: Supports a context window of 4096 tokens.
  • Training Objective: Optimized for tasks requiring reasoning and mathematical problem-solving.
  • Evaluation Loss: 0.6062 on the held-out evaluation set.
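
The model can be loaded with the standard Hugging Face transformers API. The snippet below is a minimal sketch, assuming the checkpoint is hosted on the Hub under the ID above; the half-precision dtype and device placement are illustrative choices, not part of the published card.

```python
# Minimal loading sketch (assumes transformers, torch, and accelerate are installed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CharlesLi/llama_2_cot_simplest_code_math_4_full"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision; a 7B model fits on a single ~16 GB GPU
    device_map="auto",          # requires the accelerate package
)
```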

Training Details

The model was trained with the following hyperparameters (an equivalent TrainingArguments sketch follows the list):

  • Learning Rate: 2e-05
  • Batch Size: 4 per device (train), 4 per device (eval)
  • Gradient Accumulation Steps: 2; the reported total train batch size of 32 implies distributed training across 4 devices (4 × 2 × 4 = 32).
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08.
  • LR Scheduler: Cosine schedule with a warmup ratio of 0.1; training ran for 1 epoch.
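
For reference, the hyperparameters above map onto the standard Hugging Face TrainingArguments as sketched below. This is a hypothetical reconstruction rather than the author's published training script; the argument names are the library's own, and the output directory is a placeholder.

```python
# Hypothetical reconstruction of the reported hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_2_cot_out",   # placeholder path, not from the original card
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,  # 4 x 2 x 4 devices = total train batch of 32
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```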

Intended Use Cases

This model is suitable for applications where robust logical reasoning and accurate mathematical computation are critical. Its fine-tuning suggests potential strengths in areas such as the following (an illustrative prompt example appears after the list):

  • Solving mathematical word problems.
  • Executing multi-step reasoning tasks.
  • Code-related logical inference.
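
As an illustration of these use cases, the sketch below prompts the model with a simple math word problem. It assumes model and tokenizer are loaded as in the earlier example and that the tokenizer inherits the Llama-2 chat template from the base model; the prompt itself is invented for demonstration.

```python
# Illustrative chain-of-thought style prompt for a math word problem.
prompt = (
    "A train travels 60 km in 45 minutes. "
    "What is its average speed in km/h? Think step by step."
)
messages = [{"role": "user", "content": prompt}]

# Uses the chat template shipped with the Llama-2 tokenizer.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```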