fzzhang/Marcoroni-neural-chat-7B-v2_gsm8k_merged

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Feb 16, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

fzzhang/Marcoroni-neural-chat-7B-v2_gsm8k_merged is a 7-billion-parameter language model fine-tuned from Toten5/Marcoroni-neural-chat-7B-v2 and optimized for mathematical reasoning. It offers a 4,096-token context window and was fine-tuned on the GSM8K dataset of grade-school math word problems. Its primary differentiator is stronger arithmetic and common-sense reasoning, making it suitable for applications that require robust numerical problem-solving.


Model Overview

fzzhang/Marcoroni-neural-chat-7B-v2_gsm8k_merged builds on the Toten5/Marcoroni-neural-chat-7B-v2 base. This iteration was fine-tuned on the GSM8K dataset, reflecting a strong focus on mathematical reasoning and problem solving. Training used a learning rate of 1e-05 over 5 epochs with the Adam optimizer and a linear learning-rate scheduler.
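To try the model on a GSM8K-style word problem, a minimal inference sketch with the Transformers library might look like the following. This assumes the model ID above resolves on the Hugging Face Hub and uses a plain-text prompt; the card does not document a chat template, so the ideal prompt format may follow the base model's instead.

```python
# Minimal inference sketch; model ID and plain-text prompting are assumptions
# based on this card, not a documented usage recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fzzhang/Marcoroni-neural-chat-7B-v2_gsm8k_merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision fits a 7B model on a single 24 GB GPU
    device_map="auto",
)

# A GSM8K-style grade-school math word problem
prompt = (
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether "
    "in April and May?"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```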

Key Capabilities

  • Mathematical Reasoning: Specialized in arithmetic and common sense reasoning, as evidenced by its fine-tuning on the GSM8K dataset.
  • Base Model Enhancement: Extends Marcoroni-neural-chat-7B-v2 with domain-specific expertise in math word problems.

Training Details

The fine-tuning process used the following hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 1e-05
  • Batch Sizes: train_batch_size of 4, eval_batch_size of 8
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Epochs: 5
  • Frameworks: PEFT 0.7.2.dev0, Transformers 4.36.2, PyTorch 2.1.2, Datasets 2.16.1, Tokenizers 0.15.1
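The card lists these hyperparameters but not the full training script. As a rough sketch, the same settings could be expressed with Hugging Face TrainingArguments plus a PEFT adapter config; the LoRA values below are illustrative assumptions, since the card only states that PEFT was used, not how it was configured.

```python
# Sketch of the listed hyperparameters as TrainingArguments; the LoraConfig
# values are illustrative assumptions, not documented on this card.
from peft import LoraConfig
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="marcoroni-gsm8k",       # hypothetical output path
    learning_rate=1e-5,                 # as listed on the card
    per_device_train_batch_size=4,      # train_batch_size
    per_device_eval_batch_size=8,       # eval_batch_size
    num_train_epochs=5,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                     # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# Illustrative LoRA adapter config; rank, alpha, and dropout are assumptions.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```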

Good For

  • Applications requiring strong mathematical problem-solving.
  • Tasks involving common sense reasoning with numerical data.
  • Developers looking for a 7B model with enhanced arithmetic capabilities.