xw1234gan/Extended_Merging_Qwen2.5-3B-Instruct_MATH_lr1e-05_mb2_ga128_n2048_seed42
TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kPublished:Mar 29, 2026Architecture:Transformer Loading
The xw1234gan/Extended_Merging_Qwen2.5-3B-Instruct_MATH_lr1e-05_mb2_ga128_n2048_seed42 is a 3.1 billion parameter instruction-tuned language model based on the Qwen2.5 architecture. This model is specifically designed and fine-tuned for mathematical reasoning and problem-solving tasks. It leverages an extended merging technique to enhance its capabilities in numerical and logical operations. Its primary strength lies in handling complex mathematical queries and generating accurate solutions.
Loading preview...