xw1234gan/Fixed_Merging_Qwen2.5-3B-Instruct_MedQA_lr1e-05_mb2_ga128_n2048_seed42
Text generation · 3.1B parameters · BF16 · 32k context length · Transformer architecture · Published: Mar 16, 2026

The xw1234gan/Fixed_Merging_Qwen2.5-3B-Instruct_MedQA_lr1e-05_mb2_ga128_n2048_seed42 model is a 3.1-billion-parameter instruction-tuned language model based on the Qwen2.5 architecture. It is a merged model, and its name suggests fine-tuning on the MedQA dataset with a learning rate of 1e-05, a micro-batch size of 2, gradient accumulation over 128 steps, 2048 training samples, and random seed 42. With a context length of 32768 tokens, it is suited to applications that must process long textual inputs, and its instruction-following training makes it applicable to a range of natural language processing tasks.
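As an instruction-tuned Qwen2.5 model, it expects prompts in the ChatML format. A minimal sketch of assembling such a prompt by hand is shown below; in practice you would load the model with Hugging Face `transformers` (`AutoModelForCausalLM.from_pretrained`) and let `tokenizer.apply_chat_template` build the prompt. The system and user strings here are illustrative placeholders, not part of the model card.

```python
# Sketch of the ChatML prompt layout used by Qwen2.5-Instruct models.
# In real use, prefer tokenizer.apply_chat_template from transformers;
# this manual version only illustrates the expected structure.

def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt: system turn, user turn,
    then an open assistant turn for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

# Hypothetical example inputs for a MedQA-style query.
prompt = build_chatml_prompt(
    "You are a helpful medical question-answering assistant.",
    "What is the first-line treatment for uncomplicated hypertension?",
)
print(prompt)
```

The generated text following the final `<|im_start|>assistant` marker is the model's answer; generation is typically stopped at the `<|im_end|>` token.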
