trl-lib/qwen1.5-0.5b-sft is a 0.6-billion-parameter causal language model, fine-tuned by trl-lib from Qwen/Qwen1.5-0.5B on the HuggingFaceH4/deita-6k-v0-sft dataset with a context length of 32,768 tokens. The model targets instruction-following tasks, reaches a validation loss of 1.2566, and suits applications that need efficient, small-scale language understanding.
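A minimal usage sketch with the Transformers library, assuming the model is available on the Hugging Face Hub under the ID above and ships a chat template; the prompt and generation settings (`max_new_tokens`, `temperature`) are illustrative, not recommendations from the model card.

```python
# Minimal sketch: load trl-lib/qwen1.5-0.5b-sft and generate one reply.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "trl-lib/qwen1.5-0.5b-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format a single-turn conversation with the tokenizer's chat template.
messages = [{"role": "user", "content": "Summarize supervised fine-tuning in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Generate up to 128 new tokens (illustrative sampling settings).
outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```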