mlfoundations-dev/seed_math_college_math

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Jan 28, 2025 | License: llama3.1 | Architecture: Transformer | Warm

The mlfoundations-dev/seed_math_college_math model is a fine-tuned version of Meta-Llama-3.1-8B, developed by mlfoundations-dev. It was trained on the mlfoundations-dev/seed_math_college_math dataset and targets college-level mathematical reasoning. It reached a validation loss of 0.9592 during training.

Model Overview

mlfoundations-dev/seed_math_college_math is a language model fine-tuned from Meta-Llama-3.1-8B by mlfoundations-dev. Its focus is mathematical reasoning at the college level.

Key Capabilities

  • Mathematical Problem Solving: Optimized for understanding and generating responses related to college-level math problems.
  • Fine-tuned Performance: Reached a validation loss of 0.9592 during training. This is a training-time metric rather than a benchmark score, so it does not by itself measure problem-solving accuracy.
  • Llama 3.1 Base: Benefits from the robust architecture and general language understanding capabilities of the Meta-Llama-3.1-8B foundation model.
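
To make the reported validation loss easier to interpret, it can be converted to a perplexity. The sketch below assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in the card.

```python
import math

# Validation loss reported in the model card.
validation_loss = 0.9592

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(validation_loss)

print(f"validation perplexity is roughly {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 2.61. Like the loss itself, this describes fit to the validation split, not accuracy on math problems.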

Training Details

The model was trained using the mlfoundations-dev/seed_math_college_math dataset with the following key hyperparameters:

  • Learning Rate: 5e-06
  • Batch Size: 8 for both training and evaluation; gradient accumulation raises the total effective train batch size to 512.
  • Epochs: 3.0
  • Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08.
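
The relationship between the batch size of 8 and the effective batch size of 512 can be sketched as follows. The card does not say how many devices were used, so the device counts below are hypothetical, and the code assumes the batch size of 8 is a per-device value. `TrainConfig` and `accumulation_steps` are illustrative names, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters listed in the model card (illustrative container)."""
    learning_rate: float = 5e-6
    batch_size: int = 8                # assumed to be per device
    effective_batch_size: int = 512    # total train batch size
    num_epochs: float = 3.0
    adam_betas: tuple = (0.9, 0.999)
    adam_epsilon: float = 1e-8


def accumulation_steps(cfg: TrainConfig, num_devices: int) -> int:
    """Gradient accumulation steps needed to reach the effective batch size."""
    samples_per_pass = cfg.batch_size * num_devices
    if cfg.effective_batch_size % samples_per_pass != 0:
        raise ValueError("effective batch size is not a multiple of batch_size * num_devices")
    return cfg.effective_batch_size // samples_per_pass


cfg = TrainConfig()
# Hypothetical device counts; the actual setup is not given in the card.
for num_devices in (1, 8, 16):
    print(num_devices, "device(s):", accumulation_steps(cfg, num_devices), "accumulation steps")
```

On a single device this gives 64 accumulation steps; with 8 devices it gives 8. Either combination yields 512 samples per optimizer update.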

Intended Use Cases

This model is particularly well-suited for applications requiring:

  • Assistance with college mathematics problems.
  • Educational tools focused on math concepts.
  • Research into mathematical reasoning capabilities of LLMs.
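
For the use cases above, a minimal inference sketch with the Hugging Face `transformers` library might look like the following. It assumes the weights are published on the Hugging Face Hub under the model's name and that `torch` and `transformers` are installed. The card does not document a prompt format, so `build_prompt` is a hypothetical plain-text template, and `solve` is an illustrative helper rather than an official API.

```python
MODEL_ID = "mlfoundations-dev/seed_math_college_math"


def build_prompt(question: str) -> str:
    """Hypothetical plain-text prompt; the card does not specify a format."""
    return f"Problem: {question.strip()}\nSolution:"


def solve(question: str, max_new_tokens: int = 256) -> str:
    """Generate a solution for one question (downloads weights on first use)."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call, left commented out because it loads an 8B-parameter model:
# print(solve("Compute the derivative of x^2 * sin(x)."))
```

Generated solutions should be checked independently before being used in educational settings, since the card reports only a validation loss and no accuracy figures.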