suyashdb/broken-model-fixed

Text Generation
Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Mar 1, 2026 · Architecture: Transformer

suyashdb/broken-model-fixed is an 8 billion parameter causal language model based on the Qwen3 architecture, specifically Qwen/Qwen3-8B. This model was fixed to include the correct base model declaration, a functional chat template, and the essential tokenizer files, resolving issues that previously prevented inference and standalone tokenizer loading. It is designed for general text generation and chat applications and, with these fixes, works with standard inference servers.


Overview

suyashdb/broken-model-fixed is an 8 billion parameter model derived from Qwen/Qwen3-8B. This repository addresses critical configuration and file omissions that previously rendered the model unusable for inference and standalone tokenizer loading. The primary fixes include correcting the base_model declaration, adding a complete Jinja2 chat_template to tokenizer_config.json, and uploading missing tokenizer files (vocab.json, tokenizer.json, special_tokens_map.json).
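To make the fix concrete, the relevant tokenizer_config.json entries might look like the following schematic fragment. This is an abbreviated illustration, not the repo's actual contents: the chat_template string shown is a minimal placeholder (the real template is far longer, handling tool calls and the thinking toggle), and the tokenizer_class is assumed to match upstream Qwen conventions.

```json
{
  "tokenizer_class": "Qwen2Tokenizer",
  "chat_template": "{% for message in messages %}<|im_start|>{{ message.role }}\n{{ message.content }}<|im_end|>\n{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
}
```

With a chat_template present, OpenAI-compatible servers can render /chat/completions message lists into a prompt string, and AutoTokenizer can load the tokenizer without falling back to the model repo's other files.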

Key Fixes & Capabilities

  • Corrected Base Model: The base_model was updated from meta-llama/Meta-Llama-3.1-8B to Qwen/Qwen3-8B, aligning with the actual architecture and configuration.
  • Functional Chat Template: The absence of a chat_template previously prevented OpenAI-compatible inference servers from processing /chat/completions requests. The added template supports system, user, and assistant messages, tool calls, and a thinking mode toggle.
  • Complete Tokenizer Files: Essential tokenizer files were uploaded, enabling the tokenizer to be loaded and used independently.
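The practical effect of the chat template can be sketched in plain Python. The snippet below approximates the ChatML-style rendering used by Qwen-family models; it is an illustrative stand-in, not the repo's actual Jinja2 template, which additionally handles tool calls and the thinking toggle.

```python
# Illustrative sketch of what a ChatML-style chat template renders for
# Qwen-family models. The real rendering is performed by the Jinja2
# chat_template in tokenizer_config.json (via apply_chat_template).

def render_chatml(messages, add_generation_prompt=True):
    """Flatten a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

This is the string an inference server feeds to the model; without a chat_template, the server has no way to produce it and /chat/completions requests fail.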

reasoning_effort Clarification

This model, like the base Qwen3-8B, does not natively support reasoning_effort parameters (e.g., "low", "high") as seen in OpenAI's o-series models. Qwen3-8B was not trained with budget-forcing, meaning it cannot dynamically adjust its reasoning depth based on such hints. It operates with a binary thinking/non-thinking mode. Implementing true reasoning_effort would require retraining with budget-forcing and specific inference server logic to interpret and apply these parameters.
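The binary nature of the toggle can be sketched as follows. The sketch assumes the convention used by upstream Qwen3, where non-thinking mode is implemented by pre-filling an empty think block in the assistant turn so the model skips its reasoning phase entirely; the helper name is hypothetical.

```python
# Sketch of Qwen3's binary thinking toggle, as opposed to a graded
# reasoning_effort. Assumption: as in upstream Qwen3's chat template,
# disabling thinking pre-fills an empty <think></think> block, so the
# model answers directly instead of scaling its reasoning depth.

def assistant_turn_prefix(enable_thinking: bool) -> str:
    prefix = "<|im_start|>assistant\n"
    if not enable_thinking:
        # Empty think block: the model emits the final answer immediately.
        prefix += "<think>\n\n</think>\n\n"
    return prefix

# There is no intermediate setting: a reasoning_effort of "low" or "high"
# has no effect, because the model was never trained against a token budget.
print(assistant_turn_prefix(True))
print(assistant_turn_prefix(False))
```

In other words, the only lever available is on/off; anything finer-grained would require the budget-forcing retraining and server-side support described above.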