fblgit/UNA-34BeagleSimpleMath-32K-v1

Text Generation | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: Jan 25, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

fblgit/UNA-34BeagleSimpleMath-32K-v1 is a 34-billion-parameter language model fine-tuned from fblgit/UNA-34Beagles-32K-v1, which in turn builds on The Bagel v0.3 and Yi-34B. It is tuned for mathematical reasoning and scores slightly higher than its base model on the GSM8K benchmark (0.6505 vs. 0.6399 exact match). With a 32K-token context length, it targets tasks that call for numerical and logical problem solving.


Model Overview

fblgit/UNA-34BeagleSimpleMath-32K-v1 is a 34-billion-parameter language model fine-tuned from the fblgit/UNA-34Beagles-32K-v1 base model. It builds on The Bagel v0.3 and Yi-34B and was trained with the Axolotl framework. The goal of the fine-tuning was to improve performance on mathematical reasoning tasks, using the fblgit/simple-math dataset.
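As a rough illustration of how a checkpoint like this is typically loaded, the sketch below uses the Hugging Face `transformers` library. It assumes the weights are available on the Hugging Face Hub under the same identifier, that `torch`, `transformers`, and `accelerate` are installed, and that enough GPU memory is available for a 34B model. The `generate_answer` helper and the plain-text prompt are hypothetical, since this page does not specify a prompt template.

```python
MODEL_ID = "fblgit/UNA-34BeagleSimpleMath-32K-v1"


def generate_answer(question: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a completion for a single question.

    Assumption: the model is hosted on the Hugging Face Hub under MODEL_ID.
    The heavy imports live inside the function so this file can be imported
    without them installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16-capable hardware
        device_map="auto",           # requires the `accelerate` package
    )

    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the echoed prompt.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("What is 17 * 23?")` would download the weights on first use. Reloading the model on every call is wasteful, so a real application would load it once and reuse it.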

Key Capabilities and Performance

The model shows a small gain in mathematical reasoning: on the GSM8K benchmark it reaches an exact-match score of 0.6505, versus 0.6399 for the base model (a gain of about 1.1 percentage points). It also retains general reasoning ability, with an MMLU score of 0.7524 and an ARC-Challenge accuracy of 0.7090. The model supports a context length of 32,768 tokens, which makes it suitable for longer inputs.
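To put the GSM8K figures quoted above in perspective, the short calculation below works out the absolute and relative gain over the base model. The two scores come from the text; the variable names are only illustrative.

```python
base_gsm8k = 0.6399   # base model, exact match (from the text above)
tuned_gsm8k = 0.6505  # this model, exact match (from the text above)

absolute_gain = tuned_gsm8k - base_gsm8k
relative_gain = absolute_gain / base_gsm8k

print(f"absolute gain: {absolute_gain * 100:.2f} percentage points")  # 1.06
print(f"relative gain: {relative_gain * 100:.2f}%")                   # 1.66%
```

The gain is roughly one percentage point in absolute terms, which matches the description of a slight increase.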

Use Cases

  • Mathematical Problem Solving: Suited to applications that involve arithmetic and multi-step logical reasoning.
  • General Knowledge and Reasoning: Suitable for tasks benefiting from strong performance across various academic and common-sense benchmarks.
  • Long Context Applications: Its 32K context window allows for processing and understanding extensive documents or conversations.
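For long-context use, it can help to check the token budget before sending a request. The sketch below assumes that prompt tokens and generated tokens share the same 32,768-token window, as is usual for decoder-only models. The helper names are hypothetical, and counting the prompt tokens is left to whichever tokenizer is used.

```python
CONTEXT_LENGTH = 32_768  # tokens, as stated for this model


def max_new_tokens_allowed(prompt_tokens: int,
                           context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Check that the prompt plus the requested generation fits the window."""
    return prompt_tokens + max_new_tokens <= context_length
```

For example, a 30,000-token document leaves `max_new_tokens_allowed(30_000)`, that is 2,768 tokens, for the response.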