CharlesLi/llama_2_cot_simplest_code_math_0_full

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jan 20, 2025 · License: llama2 · Architecture: Transformer · Open weights

The CharlesLi/llama_2_cot_simplest_code_math_0_full model is a 7-billion-parameter variant of Llama-2-7b-chat-hf, fine-tuned by CharlesLi. As its name and its fine-tuning on a generator dataset suggest, it is adapted from the base Llama 2 architecture for code and mathematics tasks, offering a focused alternative to general-purpose large language models in these specialized domains.


Overview

This model, llama_2_cot_simplest_code_math_0_full, is a fine-tune of the meta-llama/Llama-2-7b-chat-hf base model, developed by CharlesLi. It has 7 billion parameters and was trained with a context length of 4096 tokens. Fine-tuning used a "generator dataset," suggesting the model is optimized for content generation, most likely in the code and mathematics domains implied by its name.

Training Details

The model was trained for 1 epoch with a learning rate of 2e-05 and a per-device batch size of 4 (total effective batch size of 32 via gradient accumulation), using the Adam optimizer and a cosine learning-rate scheduler with a warmup ratio of 0.1. Training ran on a multi-GPU setup with 4 devices. The reported loss on the evaluation set was 0.8119.
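The reported hyperparameters can be collected into a single configuration sketch. This is a reconstruction from the card, not the author's actual training script: the `gradient_accumulation_steps` value is an assumption chosen so that 4 (per device) × 4 (GPUs) × 2 (accumulation) reproduces the reported effective batch size of 32.

```python
# Hypothetical reconstruction of the reported training configuration.
# Only the values stated on the model card are grounded; the accumulation
# split is inferred, and the dict keys follow common Trainer naming.
hyperparams = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 4,
    "num_devices": 4,
    "gradient_accumulation_steps": 2,  # assumed: card only gives the effective total of 32
    "num_train_epochs": 1,
    "optimizer": "adam",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
}

# Effective batch size = per-device batch × devices × accumulation steps.
effective_batch = (
    hyperparams["per_device_train_batch_size"]
    * hyperparams["num_devices"]
    * hyperparams["gradient_accumulation_steps"]
)
print(effective_batch)  # 32, matching the card's reported effective batch size
```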

Key Characteristics

  • Base Model: Llama-2-7b-chat-hf
  • Parameter Count: 7 billion
  • Context Length: 4096 tokens
  • Fine-tuning Focus: Generator dataset, likely for code and mathematical tasks.

Intended Use

Specific intended uses and limitations are not detailed in the provided README. However, the generator-dataset fine-tuning and the model's name suggest it is designed for generation in technical or analytical contexts, such as code snippets, mathematical problem-solving, and logical reasoning. Developers should weigh this specialized fine-tuning when choosing it for applications where performance in these areas is critical.
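Since this is a fine-tune of Llama-2-7b-chat-hf, prompts presumably follow the base model's `[INST]`/`<<SYS>>` chat template. The sketch below builds such a prompt for a math question; it is an assumption that this fine-tune keeps the template unchanged, and the actual model call is omitted since it requires downloading the weights.

```python
# Minimal sketch of the Llama 2 chat prompt format this fine-tune likely
# inherits from its base model. build_prompt is a hypothetical helper.
def build_prompt(system: str, user: str) -> str:
    """Wrap a system message and user message in the Llama 2 chat template."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = build_prompt(
    "You are a helpful assistant that reasons step by step.",
    "What is 12 * 7? Show your work.",
)
print(prompt)
```

The resulting string would then be tokenized and passed to the model for generation (e.g. via `transformers`), with the answer appearing after the closing `[/INST]` tag.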