mkurman/llama-3.2-MEDIT-3B-o1

  • Task: Text Generation
  • Model Size: 3.2B
  • Quant: BF16
  • Ctx Length: 32k
  • Published: Jan 3, 2025
  • License: llama3.2
  • Architecture: Transformer
  • Concurrency Cost: 1

mkurman/llama-3.2-MEDIT-3B-o1 is a 3 billion parameter Small Language Model (SLM) developed by mkurman, fine-tuned from MedIT Solutions Llama 3.2 3B Instruct. The model specializes in o1-like reasoning, using explicit <Thought> and <Output> tags for chain-of-thought generation, and is optimized for deterministic outputs in instruct-style reasoning tasks. It is intended for general question answering and instruction-based generation, with a focus on structured reasoning.


Model Overview

mkurman/llama-3.2-MEDIT-3B-o1 is a 3 billion parameter Small Language Model (SLM) fine-tuned by mkurman from MedIT Solutions Llama 3.2 3B Instruct. Its core differentiation lies in its o1-like reasoning capabilities, achieved through the use of explicit <Thought> and <Output> tags. This structure encourages a chain-of-thought style of text generation, making it particularly suitable for tasks requiring structured reasoning.
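For illustration, a response in this format might take the following shape. The exact layout is an assumption inferred from the tag names; consult the upstream model card for authoritative examples:

```
<Thought>
The question asks for 12 × 9. 12 × 9 = 108, so the answer is 108.
</Thought>
<Output>
108
</Output>
```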

Key Characteristics & Usage

  • Reasoning Focus: Designed for instruct-style reasoning tasks, leveraging <Thought> and <Output> tags to separate internal reasoning from the final answer.
  • Deterministic Outputs: Recommended for use with do_sample=False or temperature=0.0 to produce deterministic, exact-match outputs rather than diverse generations (see the sketch after this list).
  • Base Model: Built upon the Llama 3.2 3B Instruct architecture, licensed under llama3.2.
  • Intended Use Cases:
    • General question answering
    • Instruction-based generation
    • Exploration of reasoning and chain-of-thought processes
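A minimal sketch of deterministic generation with the Hugging Face transformers library follows. The chat-template call is an assumption carried over from the Llama 3.2 Instruct base, and the prompt content is purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mkurman/llama-3.2-MEDIT-3B-o1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is 12 * 9?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# do_sample=False selects greedy decoding, so repeated runs
# produce the same (deterministic) output.
output_ids = model.generate(input_ids, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```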

Important Usage Notes

  • Stop Sequences: Configure </Output> as a stop sequence during generation; otherwise the model may continue generating indefinitely.
  • Prompt Workaround: A known issue where the model may start its response with <|python_tag|> can be mitigated by appending "<Thought>\n\n" to the end of the generation prompt. Both notes are applied in the sketch below.
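A sketch combining both notes, reusing the model and tokenizer loaded above. stop_strings has been supported by transformers generate() since v4.39 and requires passing the tokenizer alongside it:

```python
# Render the chat template to a string so the <Thought> workaround
# can be appended to the very end of the prompt.
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)
prompt += "<Thought>\n\n"  # workaround: avoids the model opening with <|python_tag|>

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=False,
    stop_strings=["</Output>"],  # stop sequence: prevents unbounded generation
    tokenizer=tokenizer,         # required by transformers when stop_strings is set
)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```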

Limitations

  • Hallucination: Like many LLMs, it may generate plausible but incorrect information.
  • Medical Information: Should not be relied upon for sensitive, real-world medical diagnosis or advice without expert verification; it is not a substitute for a certified medical professional.
  • Bias: Outputs may reflect biases present in its training data.