MiniThinky 1B Overview
MiniThinky 1B is a 1 billion parameter model based on the Llama 3.2 architecture, developed by ngxson. This model represents an initial fine-tuning effort focused on integrating a distinct reasoning capability into a smaller language model. It operates with a substantial context length of 32768 tokens, allowing for more extensive input processing.
Key Capabilities & Features
- Explicit Reasoning Process: The model is designed to output a "thinking process" (marked by <|thinking|>) before delivering its final answer (marked by <|answer|>), providing insight into its internal deliberation.
- Llama 3 Chat Template: Uses the standard Llama 3 chat template for interaction, ensuring compatibility with existing Llama 3-based pipelines.
- System Message Sensitivity: Requires a precise system message at the beginning of each conversation to activate its intended reasoning behavior. The recommended prompt is:
You are MiniThinky, a helpful AI assistant. You always think before giving the answer. Use <|thinking|> before thinking and <|answer|> before giving the answer.
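As a sketch, the required system message can be placed into a Llama 3-format prompt assembled by hand. The special tokens below follow the standard Llama 3 chat template; in practice a tokenizer's chat-template support would produce this string for you, so treat the helper name and exact layout as illustrative assumptions:

```python
# Sketch: build a single-turn Llama 3 chat prompt that includes
# MiniThinky's required system message. Token names follow the
# standard Llama 3 chat template.

SYSTEM_MESSAGE = (
    "You are MiniThinky, a helpful AI assistant. You always think before "
    "giving the answer. Use <|thinking|> before thinking and <|answer|> "
    "before giving the answer."
)

def build_prompt(user_message: str) -> str:
    """Assemble a single-turn prompt in the Llama 3 chat format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{SYSTEM_MESSAGE}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt("What is 2 + 2?")
print(prompt)
```

The prompt ends at the open assistant header, so generation begins exactly where the model's <|thinking|> block is expected to start.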
Important Considerations
- Experimental Nature: This model is described as an initial trial to add reasoning, indicating its developmental stage.
- Performance: The model was trained for 5 hours on 4x L40S GPUs to an evaluation loss of approximately 0.8, but specific benchmarks for its reasoning capabilities are not yet available.
- Limitations: The model currently does not excel at simple counting tasks (e.g., counting letters in a word).
- Newer Version Available: Users are directed to a newer checkpoint, MiniThinky-v2-1B-Llama-3.2, which may offer improved performance.
Good For
- Developers interested in experimenting with small models that exhibit an explicit reasoning step.
- Use cases where understanding the model's thought process is as important as the final answer.
- Further research and fine-tuning efforts to enhance reasoning in compact LLMs.
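Since the model prefixes its deliberation with <|thinking|> and its final response with <|answer|>, separating the two is a simple string split. The helper below is a minimal sketch (the function name and return shape are assumptions; the marker tokens come from the model card above):

```python
def split_thinking(output: str) -> tuple[str, str]:
    """Split a MiniThinky completion into (thinking, answer) parts.

    Assumes the model emits <|thinking|> ... <|answer|> ... as instructed
    by the system prompt; missing sections come back as empty strings.
    """
    thinking, answer = "", ""
    if "<|answer|>" in output:
        before, answer = output.split("<|answer|>", 1)
    else:
        before = output
    if "<|thinking|>" in before:
        thinking = before.split("<|thinking|>", 1)[1]
    return thinking.strip(), answer.strip()

example = "<|thinking|>2 + 2 is basic addition.<|answer|>4"
print(split_thinking(example))  # -> ('2 + 2 is basic addition.', '4')
```

Keeping the two parts separate lets an application log or display the thinking trace while showing only the answer to end users.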