parallel-reasoner/Qwen3-8B-131072-sft-tw8x

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Apr 13, 2026 | Architecture: Transformer | Status: Warm

parallel-reasoner/Qwen3-8B-131072-sft-tw8x is an 8-billion-parameter causal language model, a fine-tuned variant of Qwen3-8B developed by parallel-reasoner. Its configured context length is 131072 tokens, which makes it suited to tasks that require processing long inputs, such as reasoning over large documents.


Model Overview

parallel-reasoner/Qwen3-8B-131072-sft-tw8x is an 8-billion-parameter causal language model derived from Qwen/Qwen3-8B. This variant, developed by parallel-reasoner, sets its maximum position embeddings to 131072 tokens (128 × 1024), so a single input can span very long contexts.
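
As a rough illustration of what a 131072-token window implies for memory, the sketch below estimates the key/value cache size for a single sequence. The layer count, KV-head count, and head dimension are assumptions taken from the base Qwen3-8B configuration and should be checked against this repository's config.json; the 2-byte element size assumes a 16-bit cache. The function name kv_cache_bytes is hypothetical.

```python
def kv_cache_bytes(
    num_tokens: int,
    num_layers: int = 36,       # assumption: base Qwen3-8B layer count
    num_kv_heads: int = 8,      # assumption: base Qwen3-8B grouped-query KV heads
    head_dim: int = 128,        # assumption: base Qwen3-8B head dimension
    bytes_per_element: int = 2  # assumption: 16-bit (bf16/fp16) cache entries
) -> int:
    """Estimate KV-cache size in bytes for one sequence of num_tokens tokens."""
    # Each token stores one key and one value vector per layer and KV head.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_element
    return num_tokens * per_token


full_window = kv_cache_bytes(131072)
print(f"{full_window / 2**30:.1f} GiB")  # prints "18.0 GiB" under the assumptions above
```

Under these assumptions the cache alone is comparable in size to the model weights, which is worth budgeting for before filling the whole window.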

Key Characteristics

  • Base Architecture: Qwen3-8B, using the Qwen3ForCausalLM model class.
  • Extended Context Window: Configured for a context length of 131072 tokens. The listing metadata above reports a context length of 32k, so the window actually available may depend on how the model is served.
  • Training Details: The model was fine-tuned with flex_attention as the attention implementation, a choice aimed at handling the large context window efficiently during training.
  • Inference Ready: The repository provides exported model weights, tokenizer files, and a generation configuration, so the model can be loaded for inference directly.
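
Since the repository ships weights, tokenizer files, and a generation configuration, loading should follow the usual Hugging Face Transformers pattern. The following is a minimal sketch, not the authors' documented procedure: it assumes the transformers, torch, and accelerate packages are installed and that enough GPU memory is available. The helper names build_prompt and load_and_generate are hypothetical, and a plain text prompt is used because the source does not say whether a chat template is provided.

```python
MODEL_ID = "parallel-reasoner/Qwen3-8B-131072-sft-tw8x"


def build_prompt(document: str, question: str) -> str:
    """Combine a long document and a question into one plain-text prompt."""
    return f"{document.strip()}\n\nQuestion: {question.strip()}\nAnswer:"


def load_and_generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and tokenizer, then generate a continuation of prompt."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",  # use the dtype stored in the checkpoint
        device_map="auto",   # assumption: accelerate is installed for device placement
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling load_and_generate(build_prompt(report_text, "What are the main findings?")) would download the weights on first use, so it is best run on a machine provisioned for an 8B model.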

Use Cases

The model is best suited to applications that need to process very long documents, conversations, or codebases within a single context window, such as:

  • Advanced reasoning over extensive textual data.
  • Summarization of lengthy articles or reports.
  • Complex question-answering requiring broad contextual recall.
  • Code analysis and generation within large projects.
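
Even with a large window, inputs such as whole codebases can exceed the limit once room is reserved for instructions and generated output. A minimal sketch of one way to budget for this, assuming the input has already been tokenized into a list of token IDs; the function name chunk_token_ids and the default reserve of 4096 tokens are illustrative choices, not values from the model card.

```python
def chunk_token_ids(
    token_ids: list[int],
    context_length: int = 131072,  # configured window stated in the model card
    reserved_tokens: int = 4096,   # assumption: room for instructions and output
) -> list[list[int]]:
    """Split token_ids into consecutive chunks that each fit the token budget."""
    budget = context_length - reserved_tokens
    if budget <= 0:
        raise ValueError("reserved_tokens must be smaller than context_length")
    # Slice the sequence into windows of at most budget tokens.
    return [token_ids[i:i + budget] for i in range(0, len(token_ids), budget)]


# Small numbers make the behavior easy to see: a budget of 4 tokens per chunk.
print(chunk_token_ids(list(range(10)), context_length=6, reserved_tokens=2))
# prints [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

If the deployment serves a shorter window than the configured one, passing that value as context_length keeps each chunk within what the endpoint accepts.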