ricemonster/qwen2.5-3B-SFT
Text generation · Concurrency cost: 1 · Model size: 3.1B · Quantization: BF16 · Context length: 32k · Published: Apr 22, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

ricemonster/qwen2.5-3B-SFT is a 3.1-billion-parameter language model based on the Qwen2.5 architecture, published by ricemonster. It supports a context length of 32,768 tokens, making it suitable for processing long inputs. The model has been fine-tuned with supervised fine-tuning (SFT) for general language understanding and generation, providing a versatile foundation for a range of NLP applications.
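As a sketch of how such a model is typically loaded, the snippet below uses the Hugging Face `transformers` library with the repository id from this page. The BF16 dtype matches the quantization listed above; the prompt and generation parameters are illustrative assumptions, not values prescribed by the model card.

```python
MODEL_ID = "ricemonster/qwen2.5-3B-SFT"  # repository id from this model card
MAX_CONTEXT = 32768                       # context length listed above (32k tokens)


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion with the model.

    Imports are deferred so this module can be inspected without
    `transformers`/`torch` installed; calling this function downloads
    the model weights on first use.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the BF16 quantization above
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Any prompt shorter than the 32k-token context window can be passed to `generate`; inputs beyond that limit would need to be truncated or chunked first.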
