lkevinzc/Llama-3.2-3B-NuminaQA

Hugging Face
Text generation · Model size: 3.2B · Quantization: BF16 · Context length: 32k · Published: Mar 6, 2025 · License: apache-2.0 · Architecture: Transformer

lkevinzc/Llama-3.2-3B-NuminaQA is a 3-billion-parameter language model fine-tuned by lkevinzc from the FineMath-Llama-3B base model. It is optimized for question-answering tasks, trained on the numia-1.5-qa-concatenated dataset, and serves as a foundational component in a minimalist R1-Zero recipe aimed at understanding R1-Zero-like training.


Overview

lkevinzc/Llama-3.2-3B-NuminaQA is derived from the FineMath-Llama-3B base and fine-tuned for 2 epochs with a learning rate of 1e-5 on the lkevinzc/numia-1.5-qa-concatenated dataset. It is presented as a core component of a minimalist R1-Zero recipe, developed as part of research into understanding R1-Zero-like training methodologies.
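As a rough illustration, the model can be loaded like any causal LM on the Hugging Face Hub via the transformers library. This is a minimal sketch, not from the model card itself: the `build_prompt` helper and its "Question:/Answer:" template are assumptions, and the actual prompt format used in training may differ (consult the repository and paper).

```python
# Hypothetical usage sketch for lkevinzc/Llama-3.2-3B-NuminaQA.
# The prompt template is an assumption, not the documented training format.
MODEL_ID = "lkevinzc/Llama-3.2-3B-NuminaQA"


def build_prompt(question: str) -> str:
    """Wrap a raw question in a simple QA prompt (assumed format)."""
    return f"Question: {question}\nAnswer:"


def answer(question: str, max_new_tokens: int = 256) -> str:
    """Generate an answer with greedy decoding; downloads the model on first call."""
    # Deferred imports so build_prompt stays usable without torch/transformers.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the quantization listed in the model metadata.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keeping only the generated continuation.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Usage would look like `answer("What is 17 * 24?")`; generation settings such as sampling temperature are left at library defaults here.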

Key Capabilities

  • Question Answering (QA): Fine-tuned on a QA dataset to produce direct, accurate answers to questions.
  • R1-Zero Training Research: Serves as a base model for exploring and understanding R1-Zero-like training paradigms.

Good For

  • Research in R1-Zero Training: Ideal for researchers investigating the dynamics and effectiveness of R1-Zero-like training methods.
  • Question Answering Applications: Suitable for systems that need a compact QA model, particularly for questions within the domain of the numia-1.5-qa-concatenated dataset.

For more technical details, refer to the associated paper and the GitHub repository.