mkurman/llama-3.2-MEDIT-3B-o1
mkurman/llama-3.2-MEDIT-3B-o1 is a 3 billion parameter Small Language Model (SLM) developed by mkurman, fine-tuned from MedIT Solutions Llama 3.2 3B Instruct. This model specializes in o1-like reasoning, utilizing explicit `<Thought>` and `<Output>` tags for chain-of-thought generation, and is optimized for deterministic outputs in instruct-style reasoning tasks. It is intended for general question answering and instruction-based generation, focusing on structured reasoning.
Model Overview
mkurman/llama-3.2-MEDIT-3B-o1 is a 3 billion parameter Small Language Model (SLM) fine-tuned by mkurman from MedIT Solutions Llama 3.2 3B Instruct. Its core differentiation lies in its o1-like reasoning capabilities, achieved through the use of explicit <Thought> and <Output> tags. This structure encourages a chain-of-thought style of text generation, making it particularly suitable for tasks requiring structured reasoning.
Key Characteristics & Usage
- Reasoning Focus: Designed for instruct-style reasoning tasks, leveraging `<Thought>` and `<Output>` tags to separate internal reasoning from the final answer.
- Deterministic Outputs: Recommended for use with `do_sample=False` or `temperature=0.0` to achieve more deterministic, exact-match outputs rather than diverse generation.
- Base Model: Built upon the Llama 3.2 3B Instruct architecture, licensed under llama3.2.
- Intended Use Cases:
- General question answering
- Instruction-based generation
- Exploration of reasoning and chain-of-thought processes
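Because the model separates reasoning from its answer with the tags described above, downstream code typically needs to split a response into the two parts. A minimal sketch, assuming the model emits well-formed `<Thought>…</Thought>` and `<Output>…</Output>` blocks (the helper name `split_reasoning` is illustrative, not part of the model's tooling):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a model response into its <Thought> reasoning and <Output> answer.

    Returns (thought, output); either part is empty if its tag pair is absent.
    """
    thought_match = re.search(r"<Thought>(.*?)</Thought>", text, re.DOTALL)
    output_match = re.search(r"<Output>(.*?)</Output>", text, re.DOTALL)
    thought = thought_match.group(1).strip() if thought_match else ""
    output = output_match.group(1).strip() if output_match else ""
    return thought, output

# Example response in the model's expected format
response = "<Thought>\n2 + 2 equals 4.\n</Thought>\n<Output>\n4\n</Output>"
thought, answer = split_reasoning(response)
```

In this example, `thought` holds the chain-of-thought text and `answer` holds only the final output, which is usually what an application surfaces to the user.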
Important Usage Notes
- Stop Sequences: Users must configure `</Output>` as a stop sequence during generation to prevent infinite output.
- Prompt Workaround: A known issue where the model might start with `<|python_tag|>` can be mitigated by appending `"<Thought>\n\n"` to the end of the generation prompt.
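Both workarounds can be expressed as small, framework-agnostic string helpers. This is a sketch only: the helper names `build_prompt` and `apply_stop_sequence` are illustrative, and in practice the stop sequence would normally be passed to the generation API itself (e.g. as a stop string) when the framework supports it:

```python
def build_prompt(user_prompt: str) -> str:
    """Append the opening <Thought> tag so generation begins inside the
    reasoning block, avoiding the stray <|python_tag|> issue."""
    return user_prompt + "<Thought>\n\n"

def apply_stop_sequence(generated: str, stop: str = "</Output>") -> str:
    """Truncate generated text at the stop sequence, keeping the closing
    tag, to guard against runaway generation."""
    idx = generated.find(stop)
    return generated[: idx + len(stop)] if idx != -1 else generated

prompt = build_prompt("What is 2 + 2?")
raw = "<Thought>\n\n2 + 2 = 4.\n</Thought>\n<Output>\n4\n</Output> extra tokens"
clean = apply_stop_sequence(raw)
```

`apply_stop_sequence` keeps the `</Output>` tag so the answer block stays well-formed for downstream parsing; dropping everything after it mirrors what a server-side stop sequence would do.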
Limitations
- Hallucination: Like many LLMs, it may generate plausible but incorrect information.
- Medical Information: Should not be relied upon for sensitive, real-world medical diagnosis or advice without expert verification; the model is not a certified medical professional.
- Bias: Outputs may reflect biases present in its training data.