fblgit/UNA-POLAR-10.7B-InstructMath-v2

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:10.7BQuant:FP8Ctx Length:4kPublished:Jan 2, 2024License:cc-by-nc-nd-4.0Architecture:Transformer0.0K Open Weights Warm

The fblgit/UNA-POLAR-10.7B-InstructMath-v2 is a 10.7 billion parameter instruction-tuned language model, built upon the UNA-SOLAR-10.7B-Instruct-1.0 architecture. This model has undergone DPO (Direct Preference Optimization) specifically using the MathPILE Books dataset, enhancing its mathematical reasoning capabilities. It is optimized for tasks requiring strong mathematical understanding and problem-solving. With a context length of 4096 tokens, it is suitable for processing moderately long mathematical queries and instructions.

Loading preview...

UNA-POLAR-10.7B-InstructMath-v2 Overview

The UNA-POLAR-10.7B-InstructMath-v2 is a 10.7 billion parameter instruction-tuned model developed by fblgit. It is a specialized version derived from the UNA-SOLAR-10.7B-Instruct-1.0 base model, distinguished by its targeted optimization for mathematical tasks.

Key Capabilities & Differentiators

  • Mathematical Reasoning: The model has been fine-tuned using Direct Preference Optimization (DPO) over the extensive MathPILE Books dataset, specifically designed to improve its performance in mathematical contexts.
  • Instruction Following: As an instruct-tuned model, it is designed to follow user instructions effectively, particularly for math-related queries.
  • Performance Benchmarks: Evaluation on the Open LLM Leaderboard shows a competitive average score of 74.07. Notable scores include 70.73 on AI2 Reasoning Challenge, 66.03 on MMLU, and 64.75 on GSM8k, indicating its proficiency in reasoning and mathematical problem-solving.

Ideal Use Cases

  • Mathematical Problem Solving: Excellent for applications requiring the model to understand and solve mathematical problems.
  • Educational Tools: Can be integrated into tools for learning or tutoring in mathematics.
  • Research & Development: Suitable for researchers exploring advanced mathematical reasoning in LLMs, especially those interested in the impact of DPO with specialized datasets like MathPILE.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p