anna-ssi/Qwen2.5-1.5B-Open-R1-Distill
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kArchitecture:Transformer Warm
anna-ssi/Qwen2.5-1.5B-Open-R1-Distill is a 1.5 billion parameter causal language model, fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. This model has been trained using the TRL framework, focusing on instruction-following capabilities. With a context length of 131072 tokens, it is designed for general text generation tasks, particularly those requiring adherence to given instructions.
Loading preview...