EpistemeAI/Fireball-R1-Llama-3.1-8B-Medical-COT

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Feb 16, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

EpistemeAI/Fireball-R1-Llama-3.1-8B-Medical-COT is an 8 billion parameter Llama-3.1 based language model developed by EpistemeAI, fine-tuned from Deepseek-R1-distill-llama-8b-unsloth-bnb-4bit. This model is optimized for neutrality, STEM proficiency, and ethical alignment, specifically enhanced for chemistry, mathematics, and general science tasks. It incorporates medical chain-of-thought fine-tuning and features a 32768 token context length, making it suitable for specialized scientific and medical reasoning applications.

Loading preview...

EpistemeAI/Fireball-R1-Llama-3.1-8B-Medical-COT Overview

This model, developed by EpistemeAI, is an 8 billion parameter Llama-3.1 based language model fine-tuned from unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit. It is specifically optimized for neutrality, STEM proficiency, and ethical alignment, incorporating supervised fine-tuning with medical chain of thought.

Key Capabilities

  • Neutral Worldview: Designed to minimize political and cultural bias through diverse training data and human feedback.
  • STEM Specialization: Demonstrates enhanced performance in:
    • Chemistry: Reaction mechanisms, periodic trends, spectroscopy.
    • Mathematics: Equation solving, proofs, calculus.
    • General Science: Hypothesis generation, research summarization.
  • Ethical Guardrails: Includes mechanisms to filter sensitive content and flag uncertain outputs.
  • Medical Chain of Thought: Fine-tuned to improve reasoning in medical contexts.

Training Details

The model was fine-tuned using Unsloth and Huggingface's TRL library, resulting in 2x faster training. It is licensed under Apache 2.0.

Limitations

  • May occasionally produce plausible but incorrect scientific explanations.
  • Not fully immune to subtle biases.
  • Not to be used for legal advice without expert oversight or for generating partisan/culturally insensitive content.