loafeihong/llama-2-7B-factory-MetaMathQA-Muon-stage2
loafeihong/llama-2-7B-factory-MetaMathQA-Muon-stage2 is a 7-billion-parameter model fine-tuned from meta-llama/Llama-2-7b-chat-hf on the MetaMathQA dataset. It is optimized for mathematical reasoning and problem-solving tasks, building on the Llama-2 chat base to improve performance in quantitative domains. Its primary use case is answering complex mathematical queries and generating accurate, step-by-step solutions.
Model Overview
This model, loafeihong/llama-2-7B-factory-MetaMathQA-Muon-stage2, is a specialized fine-tuned version of the Meta Llama 2 7B Chat model. It has been specifically adapted for mathematical reasoning tasks through training on the MetaMath dataset.
Key Characteristics
- Base Model: Fine-tuned from `meta-llama/Llama-2-7b-chat-hf`.
- Parameter Count: 7 billion parameters.
- Optimization Focus: Enhanced for mathematical problem-solving and quantitative reasoning.
Training Details
The model was trained with the following key hyperparameters:
- Learning Rate: 1e-05
- Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08
- LR Scheduler: Cosine type with a warmup ratio of 0.1
- Epochs: 2.0
- Total Batch Size: 16 (achieved with `train_batch_size=1`, `gradient_accumulation_steps=2`, and `num_devices=8`)
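The hyperparameters above can be sketched in plain Python to show how the effective batch size is derived and what the cosine-with-warmup schedule looks like. The values come from this card; the schedule function is a generic reconstruction for illustration, not the exact trainer code.

```python
import math

# Values from the training details above.
train_batch_size = 1              # per-device micro-batch
gradient_accumulation_steps = 2   # optimizer step every 2 micro-batches
num_devices = 8                   # data-parallel workers

# Effective (total) batch size per optimizer step: 1 * 2 * 8 = 16.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

def lr_at(step: int, total_steps: int,
          peak_lr: float = 1e-5, warmup_ratio: float = 0.1) -> float:
    """Cosine schedule with linear warmup, matching the hyperparameters above.

    Generic reconstruction: rises linearly to peak_lr over the first
    warmup_ratio of training, then decays to 0 along a cosine curve.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```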
Intended Use Cases
This model is particularly well-suited for applications requiring:
- Solving mathematical problems.
- Generating explanations for mathematical concepts.
- Assisting in quantitative analysis tasks.
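A minimal usage sketch for these tasks, assuming the standard `transformers` API and the standard Llama-2 chat prompt template. The model id comes from this card; the system prompt and generation settings are illustrative assumptions, not values documented here.

```python
MODEL_ID = "loafeihong/llama-2-7B-factory-MetaMathQA-Muon-stage2"

def build_prompt(question: str,
                 system: str = "You are a helpful assistant that solves "
                               "math problems step by step.") -> str:
    """Wrap a user question in the standard Llama-2 chat template."""
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{question} [/INST]"

def solve(question: str) -> str:
    """Generate a solution; requires `transformers` and `torch`,
    and downloads the full 7B checkpoint on first use."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    # Greedy decoding here; sampling settings are an assumption, tune as needed.
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

# Example call (commented out to avoid a large download):
# print(solve("If 3x + 5 = 20, what is x?"))
```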
Limitations
The original model card does not document specific limitations or broader intended uses. Users should therefore evaluate the model's performance thoroughly on their own mathematical applications before relying on its outputs.