donoway/GSM8K-Binary_Llama-3.2-1B-g9v65nkk

Task: Text Generation | Concurrency Cost: 1 | Model Size: 1B | Quant: BF16 | Ctx Length: 32k | Published: Aug 18, 2025 | License: llama3.2 | Architecture: Transformer

donoway/GSM8K-Binary_Llama-3.2-1B-g9v65nkk is a 1-billion-parameter language model fine-tuned from Meta's Llama-3.2-1B. The "GSM8K-Binary" in its name suggests a focus on mathematical reasoning. It reports an overall accuracy of 69.90% on its evaluation set, with 71.15% and 69.38% on two separate label sets, and is aimed at tasks that require numerical problem-solving.


Overview

This model, donoway/GSM8K-Binary_Llama-3.2-1B-g9v65nkk, is a 1-billion-parameter language model fine-tuned from the meta-llama/Llama-3.2-1B base model. The README reports evaluation metrics for the fine-tuned model but does not specify the exact dataset used for fine-tuning.
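
As a starting point for trying the model, the sketch below loads it with the Hugging Face `transformers` library. It assumes the repository is publicly available on the Hugging Face Hub under the ID shown in the title, that it loads as a standard causal language model, and that `torch` and `transformers` are installed. The helper names `load_model` and `generate` are illustrative, and the README does not document a prompt format, so any prompt passed in is the caller's own guess.

```python
MODEL_ID = "donoway/GSM8K-Binary_Llama-3.2-1B-g9v65nkk"


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model in BF16 (the quantization listed for this model)."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Return the prompt plus the model's continuation as plain text."""
    import torch

    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate(...)` downloads the weights on first use. Because the fine-tuning data and prompt template are not documented, outputs should be checked against known answers before relying on them.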

Key Capabilities

  • Mathematical Reasoning Focus: The "GSM8K-Binary" in the model's name, together with its reported evaluation metrics, suggests optimization for mathematical problem-solving or similar reasoning tasks.
  • Performance Metrics: Reports an overall accuracy of 69.90% and a validation loss of 0.7582 on its evaluation set. Accuracy differs by label set: 71.15% on one and 69.38% on the other.
  • Training Details: Trained for 100 epochs with a learning rate of 2e-05, a train batch size of 32, and the AdamW optimizer.
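
The three reported accuracies can be related with a little arithmetic. The sketch below assumes, which the README does not confirm, that the two label sets partition the evaluation set and that the overall accuracy is their sample-weighted average. Under that assumption it solves for the share of evaluation examples in each set. The function name `implied_share` is illustrative.

```python
def implied_share(overall: float, acc_a: float, acc_b: float) -> float:
    """Solve overall = w * acc_a + (1 - w) * acc_b for w.

    w is the fraction of evaluation examples in label set A, ASSUMING the
    overall accuracy is a sample-weighted mean of the two per-set accuracies.
    """
    if acc_a == acc_b:
        raise ValueError("per-set accuracies are equal; the share is not identifiable")
    share_a = (overall - acc_b) / (acc_a - acc_b)
    if not 0.0 <= share_a <= 1.0:
        raise ValueError("overall accuracy is not between the two per-set accuracies")
    return share_a


# Reported figures, in percent.
OVERALL, ACC_A, ACC_B = 69.90, 71.15, 69.38

share_a = implied_share(OVERALL, ACC_A, ACC_B)
print(f"share of examples in the 71.15% set: {share_a:.1%}")
print(f"share of examples in the 69.38% set: {1 - share_a:.1%}")
```

If the assumption holds, roughly 29% of the evaluation examples fall in the higher-accuracy set and roughly 71% in the other. If the label sets overlap or the overall figure is computed differently, this split does not apply.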

Good for

  • Numerical Problem Solving: Given its name and evaluation metrics, this model is likely suitable for tasks involving arithmetic, quantitative reasoning, or other mathematical challenges.
  • Resource-Constrained Environments: At 1 billion parameters, it has a smaller footprint than larger LLMs, which may make it practical to deploy where compute or memory is limited.
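
To make the footprint claim concrete, the following back-of-the-envelope sketch estimates the memory needed for the weights alone. It assumes the nominal 1B parameter count from the listing (the exact count may differ) and counts only the weights, not activations, the KV cache, or framework overhead. The name `weight_memory_gib` and the dtype table are illustrative.

```python
# Bytes needed to store one parameter in common numeric formats.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str = "bf16") -> float:
    """Estimate the memory for model weights only, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


NOMINAL_PARAMS = 1_000_000_000  # "1B" as listed; an approximation

for dtype in ("fp32", "bf16", "int8"):
    print(f"{dtype}: {weight_memory_gib(NOMINAL_PARAMS, dtype):.2f} GiB")
```

At the listed BF16 precision this comes to a little under 2 GiB for the weights, which is the main reason a model of this size can fit on modest GPUs or even CPU-only machines.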