DeeWoo/Llama-2-7b-chat_FFT_GSM8K

Text Generation · Model Size: 7B · Quantization: FP8 · Context Length: 4K · Published: Dec 30, 2024 · License: other · Architecture: Transformer

DeeWoo/Llama-2-7b-chat_FFT_GSM8K is a 7-billion-parameter Llama-2-chat model fine-tuned by DeeWoo on the GSM8K dataset for mathematical reasoning and problem-solving. It supports a 4096-token context length, and its primary strength is accurately solving grade-school-level arithmetic and word problems.


Overview

DeeWoo/Llama-2-7b-chat_FFT_GSM8K is a 7-billion-parameter language model fine-tuned from Meta's Llama-2-7b-chat-hf. It received specialized training on the GSM8K dataset of grade-school mathematical word problems, with the aim of strengthening its numerical reasoning and problem-solving abilities.
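
For context, GSM8K items pair a word problem with a step-by-step solution whose final line marks the answer with `####`. The snippet below shows an illustrative item in the dataset's style (the specific record is shown for illustration, not quoted from this model's training data):

```python
# Illustrative item in GSM8K style: the answer field contains worked
# reasoning and ends with "#### <final number>".
example = {
    "question": (
        "Natalia sold clips to 48 of her friends in April, and then she "
        "sold half as many clips in May. How many clips did Natalia sell "
        "altogether in April and May?"
    ),
    "answer": (
        "Natalia sold 48 / 2 = 24 clips in May.\n"
        "Natalia sold 48 + 24 = 72 clips altogether in April and May.\n"
        "#### 72"
    ),
}
```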

Key Capabilities

  • Mathematical Reasoning: Optimized for solving arithmetic and word problems, particularly those found in the GSM8K dataset (a minimal inference sketch follows this list).
  • Llama-2 Foundation: Benefits from the robust architecture and general language understanding of the base Llama-2-7b-chat model.
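
As a minimal sketch of how the model might be queried, the snippet below uses the Hugging Face transformers library. It assumes the repository follows the standard Llama-2-chat prompt wrapper (`[INST] ... [/INST]`); the exact prompt template used during fine-tuning is not documented here, so treat that as an assumption.

```python
# Minimal inference sketch. Assumes the standard Llama-2-chat prompt
# format; the fine-tuning prompt template is not documented here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DeeWoo/Llama-2-7b-chat_FFT_GSM8K"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

question = (
    "A baker makes 24 muffins and sells them in boxes of 4. "
    "How many boxes does she fill?"
)
prompt = f"[INST] {question} [/INST]"  # Llama-2-chat instruction wrapper

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
))
```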

Training Details

The model was trained with the following key hyperparameters (a hedged configuration sketch follows the list):

  • Learning Rate: 1e-05
  • Batch Size: A total training batch size of 64 (16 per GPU across 4 devices).
  • Epochs: Trained for 3.0 epochs.
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08.
  • Mixed Precision: Utilized Native AMP for efficient training.
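
As a rough reconstruction, the reported values map onto a transformers `TrainingArguments` configuration roughly as follows. This is a sketch under the hyperparameters listed above, not the author's actual training script, and the output directory name is hypothetical.

```python
# Hedged reconstruction of the reported hyperparameters; not the
# author's actual training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-2-7b-chat_fft_gsm8k",  # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=16,  # 16 per GPU x 4 GPUs = total batch 64
    num_train_epochs=3.0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,  # Native AMP mixed precision
)
```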

Good For

  • Applications requiring accurate solutions to mathematical problems.
  • Research into fine-tuning large language models for specific reasoning tasks.
  • Educational tools focused on math assistance.