shaohongwu/Qwen2.5-0.5B-Preweb-special-tokens
Text generation · Model size: 0.5B · Quant: BF16 · Context length: 32k · Concurrency cost: 1 · Published: Mar 12, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

shaohongwu/Qwen2.5-0.5B-Preweb-special-tokens is a 0.5-billion-parameter derivative base model of Qwen/Qwen2.5-0.5B, developed by shaohongwu. It extends the tokenizer vocabulary with schema/control special tokens, optimizing it for structured information extraction, schema-aware prompting, and slot/intent/domain prediction. It is intended as a base model whose downstream LoRA adapters are trained with the same extended tokenizer.


Qwen2.5-0.5B-Preweb-special-tokens Overview

This model is a specialized 0.5-billion-parameter derivative base model built upon Qwen/Qwen2.5-0.5B. Its primary distinction is an extended tokenizer vocabulary that incorporates a set of schema/control special tokens, enabling more structured and precise language processing.

Key Capabilities & Features

  • Extended Tokenizer: Includes special tokens such as <|domain_start|>, <|intent_start|>, <|slot_type_start|>, <|slot_span_start|>, and <|canonical_start|>, along with their corresponding end tokens.
  • Increased Vocabulary Size: The addition of these tokens expands the model's tokenizer vocabulary, allowing for finer-grained control and representation of structured data.
  • Fixed Vocabulary: The vocabulary size is fixed; tokens must not be added or removed at runtime, since that would change the embedding shape and break compatibility with the model weights and any trained adapters.
  • Optimized for Serving: Designed with compatibility for vLLM and TensorRT-LLM serving environments, supporting multi-LoRA dynamic loading.
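The fixed-vocabulary contract above can be sketched as follows. This is illustrative only: the base vocabulary size and the resulting token ids are hypothetical assumptions, not values read from the released tokenizer files; only the special-token strings come from this card.

```python
# Illustrative layout of an extended vocabulary: new special tokens are
# appended after the base vocabulary so each gets a stable, fixed id.
BASE_VOCAB_SIZE = 151_665  # hypothetical base size, for illustration only

SCHEMA_TOKENS = [
    "<|domain_start|>", "<|domain_end|>",
    "<|intent_start|>", "<|intent_end|>",
    "<|slot_type_start|>", "<|slot_type_end|>",
    "<|slot_span_start|>", "<|slot_span_end|>",
    "<|canonical_start|>", "<|canonical_end|>",
]

# Map each schema token to the id it would receive when appended in order.
schema_token_ids = {tok: BASE_VOCAB_SIZE + i for i, tok in enumerate(SCHEMA_TOKENS)}
```

Because the ids depend on append order and base size, adding or removing tokens at runtime would shift this mapping, which is why the vocabulary must stay frozen once the model and its adapters are trained.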

Intended Use Cases

This model is specifically engineered as a base for applications requiring:

  • Schema-aware prompting: Facilitating interactions where the model needs to understand and adhere to predefined data schemas.
  • Structured information extraction: Efficiently extracting specific pieces of information from text into a structured format.
  • Slot / Intent / Domain prediction: Core tasks in natural language understanding (NLU) for conversational AI and similar systems.
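One plausible way these tokens could serialize an NLU annotation is sketched below. The exact target format is not documented on this card, so the field ordering and the `build_nlu_target` helper are assumptions; only the special-token strings themselves come from the model's tokenizer.

```python
def build_nlu_target(domain: str, intent: str, slots: list[tuple[str, str]]) -> str:
    """Serialize a domain/intent/slot annotation using the card's special tokens.

    The layout (domain, then intent, then slot type/span pairs) is a
    hypothetical convention, not the model's documented format.
    """
    parts = [
        f"<|domain_start|>{domain}<|domain_end|>",
        f"<|intent_start|>{intent}<|intent_end|>",
    ]
    for slot_type, span in slots:
        parts.append(
            f"<|slot_type_start|>{slot_type}<|slot_type_end|>"
            f"<|slot_span_start|>{span}<|slot_span_end|>"
        )
    return "".join(parts)


target = build_nlu_target("travel", "book_flight", [("city", "Paris")])
```

Because each marker is a single token in the extended vocabulary, such targets tokenize compactly and the boundaries are unambiguous for extraction.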

Downstream LoRA adapters must be trained using this model's specific tokenizer to ensure proper functionality and leverage its specialized token set.
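A minimal sketch of the consistency check this requirement implies: before loading an adapter, confirm its tokenizer's vocabulary is identical to the base model's extended vocabulary. The `check_adapter_compat` helper is hypothetical, written against plain token→id dicts rather than any specific library API.

```python
def check_adapter_compat(base_vocab: dict[str, int], adapter_vocab: dict[str, int]) -> bool:
    """Return True only if the adapter's vocabulary exactly matches the base's.

    Every base token must be present in the adapter vocabulary with the same
    id, and the adapter must introduce no extra tokens.
    """
    if len(adapter_vocab) != len(base_vocab):
        return False
    missing = set(base_vocab) - set(adapter_vocab)
    mismatched = {t for t in base_vocab if t in adapter_vocab and adapter_vocab[t] != base_vocab[t]}
    return not missing and not mismatched
```

An adapter trained against a tokenizer that lacks the schema tokens (or assigns them different ids) would silently map the control markers to the wrong embeddings, so failing fast here is preferable to degraded predictions.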