Overview
suyashdb/broken-model-fixed is an 8-billion-parameter model derived from Qwen/Qwen3-8B. This repository addresses critical configuration errors and file omissions that previously made the model unusable for inference and for standalone tokenizer loading. The primary fixes are correcting the base_model declaration, adding a complete Jinja2 chat_template to tokenizer_config.json, and uploading the missing tokenizer files (vocab.json, tokenizer.json, special_tokens_map.json).
Key Fixes & Capabilities
- Corrected Base Model: The base_model was updated from meta-llama/Meta-Llama-3.1-8B to Qwen/Qwen3-8B, aligning it with the model's actual architecture and configuration.
- Functional Chat Template: The absence of a chat_template previously prevented OpenAI-compatible inference servers from processing /chat/completions requests. The added template supports system, user, and assistant messages, tool calls, and a thinking-mode toggle.
- Complete Tokenizer Files: The essential tokenizer files were uploaded, enabling the tokenizer to be loaded and used independently of the model.
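As a rough illustration of what the chat template does, the sketch below renders messages in the ChatML-style format that Qwen models use. This is a simplified stand-in, not the actual Jinja2 template shipped in tokenizer_config.json, which additionally handles tool calls and the thinking-mode toggle.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a message list in the ChatML-style layout Qwen models expect.

    Simplified illustration only; the real template lives in
    tokenizer_config.json and covers tool calls and thinking mode.
    """
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model knows to generate the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

In practice you would not call a helper like this yourself: loading the tokenizer with AutoTokenizer.from_pretrained and calling its apply_chat_template method uses the repaired template directly.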
reasoning_effort Clarification
This model, like the base Qwen3-8B, does not natively support reasoning_effort parameters (e.g., "low", "high") as seen in OpenAI's o-series models. Qwen3-8B was not trained with budget forcing, so it cannot dynamically scale its reasoning depth in response to such hints; it only offers a binary thinking/non-thinking mode. Supporting true reasoning_effort would require retraining with budget forcing plus inference-server logic to interpret and apply the parameter.
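An inference server that still wants to accept OpenAI-style requests could coarsely map reasoning_effort onto the binary thinking toggle. The sketch below is a hypothetical shim under that assumption; the function name and the low/medium/high mapping are illustrative and not part of this repository.

```python
def effort_to_thinking(reasoning_effort=None):
    """Map an OpenAI-style reasoning_effort hint to Qwen3's binary thinking mode.

    Hypothetical shim: Qwen3 has no graded reasoning budget, so this only
    decides whether thinking mode is on or off. The mapping is illustrative.
    """
    if reasoning_effort is None:
        # Qwen3's chat template enables thinking mode by default.
        return True
    mapping = {"low": False, "medium": True, "high": True}
    try:
        return mapping[reasoning_effort]
    except KeyError:
        raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")
```

The returned flag would then be passed through as the template's thinking-mode toggle; note this is lossy, since "medium" and "high" collapse to the same behavior.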