nvidia/AceInstruct-1.5B

Warm
Public
1.5B
BF16
131072
License: cc-by-nc-4.0
Hugging Face
Overview

Overview

AceInstruct-1.5B is part of the AceInstruct family of advanced instruction-tuned models developed by Nvidia, built upon the Qwen2.5-Base architecture. This 1.5 billion parameter model is fine-tuned using general Supervised Fine-Tuning (SFT) datasets, which are also utilized in the training of the specialized AceMath-Instruct models. Unlike AceMath, AceInstruct is designed for broad applicability across various domains, including coding, mathematics, and general knowledge tasks.

Key Capabilities and Performance

AceInstruct-1.5B demonstrates strong performance across multiple benchmarks, often surpassing its base model counterpart, Qwen2.5-1.5B-Instruct. Key performance highlights include:

  • Coding: Achieves 73.17 on HumanEval and 65.76 on MBPP, outperforming Qwen2.5-1.5B-Instruct's 61.60 and 63.20 respectively.
  • Mathematics: Scores 80.44 on GSM8K and 60.34 on MATH, compared to Qwen2.5-1.5B-Instruct's 73.20 and 55.20.
  • General Knowledge: Maintains competitive performance on MMLU (58.17) and MMLU Pro (33.78).

Use Cases

AceInstruct-1.5B is suitable for a wide range of applications requiring instruction-following capabilities in:

  • Code generation and understanding
  • Mathematical problem-solving
  • General-purpose conversational AI and text generation

Licensing

This model is intended for non-commercial use only, adhering to the Creative Commons Attribution: Non-Commercial 4.0 International license, and is subject to the Terms of Use of data generated by OpenAI.