kesavamas/qwen-1.7b-mochi
Text generation · Model size: 2B · Quant: BF16 · Context length: 32k · Concurrency cost: 1 · Architecture: Transformer · Published: Mar 5, 2026

kesavamas/qwen-1.7b-mochi is an approximately 2 billion parameter (1.7B) language model fine-tuned from Qwen/Qwen3-1.7B. Developed by kesavamas, the model was trained with the TRL library. It is designed for general text generation tasks, building on the Qwen3 architecture with a 32,768-token context length.
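Since the weights are listed as BF16, a rough memory estimate for loading them follows directly from the parameter count. This is a minimal sketch assuming the listed 2B figure; activations and KV cache add to this at inference time.

```python
# Back-of-the-envelope memory estimate for the model weights.
# Assumes 2.0e9 parameters (the listed "2B" size) stored in BF16,
# i.e. 2 bytes per parameter; runtime overhead is not included.
params = 2.0e9
bytes_per_param = 2  # BF16 = 16 bits
weight_gib = params * bytes_per_param / (1024 ** 3)
print(f"~{weight_gib:.1f} GiB for weights alone")
```

In practice this means the weights fit comfortably on a single consumer GPU with 8 GB of VRAM, leaving headroom for the 32k-token context.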
