Gandalf1/qwen3-8b-finance-finqa-phase3-merged

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:May 18, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The Qwen3-8B model, developed by Qwen, is an 8.2 billion parameter causal language model with a native context length of 32,768 tokens, extendable to 131,072 tokens with YaRN. This model uniquely supports seamless switching between a 'thinking mode' for complex logical reasoning, math, and coding, and a 'non-thinking mode' for efficient general-purpose dialogue. It excels in reasoning capabilities, human preference alignment, and agentic tasks, supporting over 100 languages.

Loading preview...

Model Overview

Qwen3-8B is an 8.2 billion parameter causal language model from the Qwen series, featuring a native context length of 32,768 tokens, extendable to 131,072 tokens using the YaRN method. Developed by Qwen, this model introduces a unique capability to seamlessly switch between a 'thinking mode' for complex tasks like logical reasoning, mathematics, and code generation, and a 'non-thinking mode' for general dialogue, optimizing performance across diverse scenarios.

Key Capabilities

  • Dual-Mode Operation: Supports dynamic switching between a reasoning-focused 'thinking mode' and an efficient 'non-thinking mode' within a single model instance.
  • Enhanced Reasoning: Demonstrates significant improvements in mathematical problem-solving, code generation, and commonsense logical reasoning compared to previous Qwen models.
  • Human Preference Alignment: Excels in creative writing, role-playing, multi-turn conversations, and instruction following, providing a more natural and engaging user experience.
  • Agentic Capabilities: Offers strong tool-calling abilities, achieving leading performance among open-source models for complex agent-based tasks, especially when integrated with Qwen-Agent.
  • Multilingual Support: Capable of handling over 100 languages and dialects with robust multilingual instruction following and translation.
  • Long Context Processing: Natively supports 32,768 tokens, with validated performance up to 131,072 tokens using YaRN scaling.

Good For

  • Complex Problem Solving: Ideal for applications requiring advanced logical reasoning, mathematical computations, or code generation, leveraging its 'thinking mode'.
  • Interactive AI: Suitable for chatbots, virtual assistants, and creative content generation where human-like interaction and instruction following are crucial.
  • Agent-Based Systems: Excellent for integrating with external tools and performing complex, multi-step tasks through its agentic capabilities.
  • Multilingual Applications: Recommended for global applications needing strong performance across a wide array of languages and dialects.
  • Long Document Analysis: Effective for tasks involving extensive text, such as summarizing long articles or processing large datasets, due to its extended context window.