Andy-ML-And-AI/HyperThinkCode-Qwen3-8B-v1.5
HyperThinkCode-Qwen3-8B-v1.5 is an 8 billion parameter LoRA fine-tune of the Qwen3-8B base model, developed by Andy-ML-And-AI. This model is specifically optimized for encouraging a 'thinking over direct response' approach, trained on a subset of the Sashvat/HyperThink-X-Nvidia-Opencode-Reasoning-200K dataset. It features a 32768 token context length and is designed for tasks requiring structured reasoning, particularly in code-related contexts.
Loading preview...
Model Overview
HyperThinkCode-Qwen3-8B-v1.5 is an 8 billion parameter model developed by Andy-ML-And-AI, built upon the Qwen3-8B base architecture. It is a LoRA fine-tune, specifically configured with 4-bit QLoRA using a rank of 16 and alpha of 16, applied across all linear layers including attention (q, k, v, o) and MLP (gate, up, down).
Key Capabilities & Training
This model was trained on a 30k subset of the Sashvat/HyperThink-X-Nvidia-Opencode-Reasoning-200K dataset. A core objective of its training was to encourage a 'thinking over direct response' behavior, utilizing a chat template where the assistant's response is placed within a 'thinking' field. The training process involved approximately 1 hour and 17 minutes over 50 steps, with a sequence length limited to 4096 tokens to manage code complexity and VRAM constraints. Initial training logs show a decreasing loss, indicating effective learning.
Use Cases & Evaluation
While evaluation is ongoing using the lm-eval library for benchmarks like HumanEval (coding) and GSM8K (math), the model's specialized training suggests its suitability for tasks requiring structured reasoning and problem-solving, particularly in programming and logical deduction. Its design to prioritize a 'thinking' process makes it potentially valuable for applications where step-by-step reasoning is more critical than immediate answers.