HuggingFaceTB/FineMath-Llama-3B
  • Task: Text generation
  • Model size: 3.2B parameters
  • Quantization: BF16
  • Context length: 32k
  • Published: Jan 6, 2025
  • License: apache-2.0
  • Architecture: Transformer

HuggingFaceTB/FineMath-Llama-3B is a 3-billion-parameter language model based on Llama 3.2, continually pre-trained by HuggingFaceTB on 160 billion tokens from the FineMath and FineWeb-Edu datasets. It shows markedly better mathematical performance than its base model while maintaining strong general reasoning and common-sense capabilities, and it is intended for English text-completion tasks requiring advanced mathematical understanding.

Overview

HuggingFaceTB/FineMath-Llama-3B is a 3-billion-parameter model built on the Llama 3.2 architecture. HuggingFaceTB continually pre-trained it on a 160-billion-token dataset mixed as 40% FineWeb-Edu and 60% FineMath (a high-quality math dataset). This specialized training significantly enhances its mathematical proficiency.
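
The 40/60 split implies roughly 64 billion FineWeb-Edu tokens and 96 billion FineMath tokens. For readers who want to experiment with a similar mixture, below is a minimal sketch using the datasets library; the repository names and config strings ("HuggingFaceFW/fineweb-edu" with the "sample-10BT" subset, "HuggingFaceTB/finemath" with "finemath-4plus") are assumptions based on the public dataset releases, and the actual training data was prepared with datatrove, not this code.

```python
# Hedged sketch: approximating the reported 40/60 FineWeb-Edu/FineMath
# mixture with the `datasets` library. Repo names and configs are
# assumptions; the real pipeline used datatrove and nanotron.
from datasets import load_dataset, interleave_datasets

fineweb_edu = load_dataset(
    "HuggingFaceFW/fineweb-edu", "sample-10BT", split="train", streaming=True
)
finemath = load_dataset(
    "HuggingFaceTB/finemath", "finemath-4plus", split="train", streaming=True
)

# interleave_datasets samples documents (not tokens) with the given
# probabilities, so this only approximates the 40%/60% token-level split.
mixture = interleave_datasets(
    [fineweb_edu, finemath], probabilities=[0.4, 0.6], seed=42
)

for example in mixture.take(3):
    print(example["text"][:200])
```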

Key Capabilities

  • Enhanced Mathematical Performance: Achieves superior results on math-related tasks compared to the base Llama 3.2 3B model.
  • Maintained General Intelligence: Preserves strong performance across knowledge, reasoning, and common sense benchmarks.
  • English Text Completion: Primarily intended for generating English text, particularly in contexts requiring mathematical understanding (see the usage sketch below).
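
Since the model is a plain causal LM, it can be loaded with the standard transformers API. A minimal text-completion sketch follows; the prompt and generation settings are illustrative, not recommendations from the model card.

```python
# Minimal text-completion sketch with transformers; the model is not
# instruction-tuned, so prompts should read like text to be continued.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/FineMath-Llama-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "To find the greatest common divisor of 48 and 180, we"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```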

Training Details

The model was trained using nanotron on 64 H100 GPUs, leveraging datatrove for tokenization and lighteval for evaluation. It is part of a series of ablation models developed for the FineMath project.
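
lighteval's command-line interface changes between versions, so rather than guess at flags, here is a hedged manual spot-check that reuses the model and tokenizer from the earlier sketch: it pulls one GSM8K test item with the datasets library and prints the model's completion next to the reference answer. This substitutes a quick manual probe for a proper lighteval run.

```python
# Manual spot-check in place of a full lighteval run: sample a GSM8K test
# question, let the model complete it, and compare against the reference.
# Assumes `model` and `tokenizer` from the previous sketch are in scope.
from datasets import load_dataset

gsm8k = load_dataset("openai/gsm8k", "main", split="test")
item = gsm8k[0]

# Zero-shot completion; a real evaluation would use lighteval's
# prompt formats and answer extraction.
inputs = tokenizer(item["question"], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

print("Model completion:\n", tokenizer.decode(outputs[0], skip_special_tokens=True))
print("\nReference answer:\n", item["answer"])
```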

Limitations

Because it was trained predominantly on English math and educational data, its performance in other languages may be limited. The model is not instruction-tuned: it is intended for text completion rather than conversational or instruction-following use. Users should also be aware that it may reproduce biases or harmful content present in its training data.