simplescaling/s1.1-1.5B is a 1.5-billion-parameter language model fine-tuned from Qwen2.5-1.5B-Instruct on the s1K-1.1 dataset. It uses the Qwen2.5 architecture and supports a context length of 131,072 tokens. The model has not been formally evaluated; for general use, the developers recommend their larger s1.1-32B model.
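Because it is a fine-tune of Qwen2.5-1.5B-Instruct, the model should load with standard Hugging Face transformers tooling. The snippet below is a minimal sketch, assuming the default chat template inherited from the base model; the prompt and generation settings are illustrative, not official recommendations.

```python
# Minimal sketch: loading and querying s1.1-1.5B with transformers.
# Assumes the standard Qwen2.5 chat template; settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "simplescaling/s1.1-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # requires `accelerate`; places layers automatically
)

messages = [{"role": "user", "content": "How many positive divisors does 360 have?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```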