float-trip/qwen-3-14b-drama
Text generation · Concurrency cost: 1 · Model size: 14B · Quant: FP8 · Context length: 32k · Published: Jul 14, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

Qwen3-14B-Base is a 14.8 billion parameter causal language model from the Qwen3 series, pre-trained on 36 trillion tokens spanning 119 languages drawn from an expanded, high-quality corpus. It incorporates architectural refinements such as QK layer normalization (LayerNorm applied to the query and key projections) and uses a three-stage pre-training process to strengthen general knowledge, reasoning skills (STEM, coding, logical reasoning), and long-context comprehension up to 32,768 tokens. The model is intended for broad language modeling and general knowledge acquisition, with these training techniques contributing to improved stability and performance.
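The card lists FP8 quantization for this 14.8B-parameter deployment. A minimal back-of-the-envelope sketch of why that matters: weight memory scales linearly with bytes per parameter. This counts weights only (no KV cache, activations, or runtime overhead) and uses 1 GB = 1e9 bytes, so the figures are rough estimates, not measured footprints:

```python
# Approximate weight-storage footprint for a 14.8B-parameter model
# at different precisions. Weights only; KV cache, activations, and
# framework overhead are ignored.

PARAMS = 14.8e9  # parameter count from the model card

BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16/bf16": 2.0,
    "fp8": 1.0,  # the quantization listed for this deployment
}

def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

for fmt, nbytes in BYTES_PER_PARAM.items():
    print(f"{fmt:>9}: ~{weight_memory_gb(PARAMS, nbytes):.1f} GB")
```

At FP8 the weights come to roughly 14.8 GB, about half the FP16/BF16 figure, which is what makes a 14B model practical on a single accelerator with memory to spare for the 32k-token KV cache.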
