keypa/Qwen3.5-9B-Claude-Opus-4.7
keypa/Qwen3.5-9B-Claude-Opus-4.7 is a 9-billion parameter language model, fine-tuned from Qwen/Qwen3.5-9B by keypa. Optimized for enhanced reasoning and step-by-step logical deduction, it leverages distillation from Claude Opus 4.7's high-quality reasoning chains. This model aims to provide advanced problem-solving capabilities efficiently on consumer-grade hardware, with a context length of 32768 tokens.
Loading preview...
Overview
This model, developed by keypa, is a 9-billion parameter language model fine-tuned from Qwen/Qwen3.5-9B. Its primary focus is on enhanced reasoning and step-by-step logical deduction, achieved through distillation from high-quality reasoning chains generated by Claude Opus 4.7. The goal is to bring advanced reasoning capabilities, typically associated with larger models, to a more compact and efficient 9B parameter size, making it suitable for complex problem-solving on consumer-grade hardware.
Key Capabilities
- Advanced Reasoning: Specifically optimized for logical deduction and step-by-step problem-solving.
- Efficiency: Designed to run effectively on consumer-grade hardware due to its compact 9B parameter size.
- Distillation Quality: Benefits from knowledge distilled from Claude Opus 4.7, a high-quality reasoning source.
- Context Length: Supports a substantial context length of 32768 tokens.
Training Details
The model was trained using a LoRA (Low-Rank Adaptation) approach, targeting all linear modules to maximize knowledge retention from the distillation source. It underwent 2 epochs and 1,016 total steps with a learning rate of 2e-4, achieving a final loss of approximately 1.1074. The training utilized 4-bit QLoRA precision and was accelerated by Unsloth and Huggingface's TRL library.
Usage
This model uses the standard ChatML prompt format. For optimal results in reasoning tasks, users are encouraged to explicitly ask the model to "Think step by step" or provide prompts that require logical deduction.