Qwen 3 & GLM 4: Next-Generation Thinking Models Now on Featherless.ai
Qwen 3 and GLM 4, two powerful families of advanced large language models, are now available through Featherless.ai's serverless inference platform.

We're excited to announce that Qwen 3 and GLM 4, two powerful families of advanced large language models, are now available through Featherless.ai's serverless inference platform. These models represent significant advancements in AI capabilities, offering innovative thinking modes and impressive performance across benchmarks.
Qwen 3 on Featherless.ai: Advanced AI with Dual Thinking Capabilities
Qwen 3 delivers exceptional performance through our serverless API, letting you build powerful applications without managing infrastructure. The standout feature of Qwen 3 is its hybrid thinking approach:
Thinking Mode: For complex problems requiring step-by-step reasoning
Non-Thinking Mode: For quick responses to simpler queries
This innovative dual-mode capability allows developers to optimize the balance between computational budget and inference quality based on the specific needs of each task.
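To make this concrete, here is a minimal sketch of calling Qwen 3 in both modes through Featherless's OpenAI-compatible chat completions endpoint. The base URL, the Qwen/Qwen3-32B model ID, and the behavior of Qwen 3's /think and /no_think soft-switch tags all depend on your account and on the chat template applied server-side, so treat this as an illustration rather than a reference.

# A minimal sketch, assuming Featherless's OpenAI-compatible endpoint and an
# assumed Qwen 3 model ID; check the model catalog for the exact names.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # OpenAI-compatible endpoint
    api_key="YOUR_FEATHERLESS_API_KEY",
)

def ask(question: str, thinking: bool) -> str:
    # Qwen 3's soft-switch tags nudge the model toward or away from
    # step-by-step reasoning; /no_think requests a direct answer.
    switch = "/think" if thinking else "/no_think"
    response = client.chat.completions.create(
        model="Qwen/Qwen3-32B",  # assumed model ID
        messages=[{"role": "user", "content": f"{question} {switch}"}],
    )
    return response.choices[0].message.content

print(ask("What is 17 * 24? Walk me through it.", thinking=True))
print(ask("What is the capital of France?", thinking=False))

Whether the soft switches take effect depends on the serving stack honoring Qwen 3's chat template; if they don't, the model simply treats them as part of the prompt.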
The Qwen Family: Available Models
While the Qwen 3 family includes multiple sizes to suit different needs, we currently offer a curated subset of these models on Featherless.ai.
They cover a practical range of options for your application, from lightweight deployments to models with strong reasoning capabilities.
Introducing GLM 4: Powerful Context Windows and Advanced Reasoning
We're also excited to announce the addition of GLM 4 models from Tsinghua KEG (THUDM):
GLM-4-9B-0414: This model delivers performance superior to Llama-3-8B-Instruct while maintaining efficiency.
GLM-4-32B-0414: Our larger GLM offering with 32 billion parameters, featuring performance comparable to OpenAI's GPT series and DeepSeek's V3/R1 series. Pre-trained on 15T tokens of high-quality data, including substantial reasoning-focused synthetic data, this model excels in instruction following, engineering code, function calling, and agent tasks.
GLM-4 models stand out with their exceptional context handling, powerful reasoning capabilities, and impressive benchmarks across various domains including code generation, artifact creation, and complex Q&A tasks.
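As a sketch of what the agentic side can look like in practice, the snippet below asks GLM-4-32B-0414 to decide whether to call a hypothetical weather-lookup tool using the standard OpenAI-style tools parameter. The THUDM/GLM-4-32B-0414 model ID, and the assumption that tool definitions are passed through to the model on our platform, are both things to verify against the documentation.

# A rough sketch of function calling with GLM-4-32B-0414, assuming the
# OpenAI-compatible endpoint forwards the `tools` parameter to the model.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",
    api_key="YOUR_FEATHERLESS_API_KEY",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="THUDM/GLM-4-32B-0414",  # assumed model ID
    messages=[{"role": "user", "content": "What's the weather in Berlin right now?"}],
    tools=tools,
)

# If the model chooses to call the tool, the structured call appears here.
message = response.choices[0].message
if message.tool_calls:
    print(message.tool_calls[0].function.name, message.tool_calls[0].function.arguments)
else:
    print(message.content)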
Key Capabilities
These models stand out with several impressive capabilities:
Hybrid Thinking Modes: Switch between deep reasoning and quick responses
Extensive Multilingual Support: 119 languages and dialects with Qwen 3
Enhanced Agentic Capabilities: Powerful coding and tool-usage abilities
Competitive Performance: Benchmarks comparable to top models like DeepSeek-R1, o1, and Gemini-2.5-Pro
Try These Models on Featherless.ai Today
Qwen 3 and GLM 4 models are ready for immediate use through our platform. Whether you're building applications that need dynamic reasoning capabilities, multilingual support, or extended context handling, our serverless API provides the simplest path to integration.
Chat with them on Phoenix
Integrate via the Featherless API
Explore our documentation: Check out our implementation guides and example code
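If you want to confirm which Qwen 3 and GLM 4 checkpoints your account can reach, a quick model-listing call works, assuming the standard OpenAI-compatible /v1/models endpoint:

# List available models and filter for the families announced here.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",
    api_key="YOUR_FEATHERLESS_API_KEY",
)

for model in client.models.list():
    if "Qwen3" in model.id or "GLM-4" in model.id:
        print(model.id)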
Have questions about using these models through our serverless platform? Reach out to us on Discord or check our documentation for API references and best practices.
For more details on Qwen 3, visit: https://qwenlm.github.io/blog/qwen3/
For more on GLM 4, visit: https://github.com/THUDM/GLM-4