nbeerbower/mistral-nemo-wissenschaft-12B

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 12B · Quant: FP8 · Context length: 32k · Published: Aug 12, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

nbeerbower/mistral-nemo-wissenschaft-12B is a 12 billion parameter language model fine-tuned from Mistral-Nemo-Instruct-2407. It specializes in scientific question answering, having been optimized on the ScienceQA_text_only dataset. It supports a 32,768-token context length and is designed for tasks requiring scientific knowledge and reasoning.


Model Overview

nbeerbower/mistral-nemo-wissenschaft-12B is a 12 billion parameter language model derived from the Mistral-Nemo-Instruct-2407 base model. It has been fine-tuned on the tasksource/ScienceQA_text_only dataset to improve its performance on scientific question-answering tasks. Fine-tuning ran for one epoch on an A100 GPU on Google Colab, using a preference-pair setup in which each question's correct answer was marked as 'chosen' and a randomly selected wrong answer as 'rejected'.
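The chosen/rejected pairing described above can be sketched as follows. This is an illustrative helper, not the author's actual training script, and the field names (`question`, `choices`, `answer`) are assumed to follow the ScienceQA_text_only layout:

```python
import random

def build_preference_pair(example, rng=None):
    """Build a (chosen, rejected) pair from one ScienceQA-style example.

    The correct choice becomes 'chosen'; a randomly picked wrong choice
    becomes 'rejected', mirroring the recipe described in the model card.
    """
    rng = rng or random.Random(0)
    correct_idx = example["answer"]
    wrong_idxs = [i for i in range(len(example["choices"])) if i != correct_idx]
    return {
        "prompt": example["question"],
        "chosen": example["choices"][correct_idx],
        "rejected": example["choices"][rng.choice(wrong_idxs)],
    }

# Toy example in the assumed ScienceQA_text_only layout
# (question text, answer choices, index of the correct choice).
sample = {
    "question": "Which of these is a conductor of electricity?",
    "choices": ["rubber", "copper", "wood"],
    "answer": 1,
}
pair = build_preference_pair(sample)
print(pair["chosen"])  # the correct answer, "copper"
```

Pairs in this shape are what preference-optimization trainers (e.g. DPO- or ORPO-style) typically consume.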

Key Characteristics

  • Base Model: Mistral-Nemo-Instruct-2407
  • Parameter Count: 12 Billion
  • Context Length: 32,768 tokens
  • Specialization: Optimized for scientific question answering and reasoning.
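Since the model inherits Mistral-Nemo-Instruct's chat format, prompts are wrapped in `[INST] ... [/INST]` markers. A minimal sketch of that formatting is below; in practice you would let `tokenizer.apply_chat_template()` handle this, and the exact whitespace conventions here are an assumption:

```python
def format_prompt(question, system=None):
    """Wrap a user question in a Mistral-style instruct prompt.

    Mistral-family instruct models expect user turns inside
    [INST] ... [/INST] markers; a system instruction is commonly
    prepended to the first user turn.
    """
    body = f"{system}\n\n{question}" if system else question
    return f"<s>[INST] {body} [/INST]"

prompt = format_prompt(
    "Why does ice float on liquid water?",
    system="You are a scientific assistant. Answer concisely.",
)
print(prompt)
```

The returned string is what you would pass to the model (or to an OpenAI-compatible completions endpoint) for a single-turn scientific question.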

Performance Insights

On the Open LLM Leaderboard the model averages 24.58 across benchmarks, with a notably strong IFEval (0-shot) score of 65.20. Other results include BBH (3-shot) at 29.57, MATH Lvl 5 (4-shot) at 6.57, GPQA (0-shot) at 5.70, MuSR (0-shot) at 12.29, and MMLU-PRO (5-shot) at 28.14. Detailed evaluation results are available on the Open LLM Leaderboard.

Ideal Use Cases

This model is particularly well-suited for applications requiring:

  • Scientific Q&A systems: Answering questions based on scientific texts or concepts.
  • Educational tools: Assisting in learning and understanding scientific subjects.
  • Research support: Generating insights or summaries from scientific literature.

Popular Sampler Settings

The three most popular parameter combinations among Featherless users for this model tune the following samplers:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p