akshayballal/Qwen3-1.7B-Pubmed-16bit-GRPO is a 1.7-billion-parameter Qwen3-based language model developed by akshayballal. It was fine-tuned from unsloth/qwen3-1.7b-unsloth-bnb-4bit using Unsloth and Hugging Face's TRL library, which enables up to 2x faster training. The model is optimized for biomedical text processing, making it well suited to applications that require understanding and generating content from sources such as PubMed.
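
A minimal usage sketch is shown below. It assumes the model is published on the Hugging Face Hub under the repository id above, loads with the standard transformers API, and uses the base Qwen3 chat template; the prompt text is purely illustrative.

```python
# Minimal sketch: load the model from the Hub and run a biomedical prompt.
# Assumes the standard transformers AutoModel/AutoTokenizer interface applies.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "akshayballal/Qwen3-1.7B-Pubmed-16bit-GRPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Illustrative biomedical prompt; the chat template is assumed to follow
# the base Qwen3 format inherited from the source model.
messages = [
    {"role": "user", "content": "Summarize the main finding of this PubMed abstract: ..."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```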