akshayballal/Qwen2.5-3B-Instruct-Pubmed-16bit-GRPO

Hugging Face
Text generation · Model size: 3.1B · Quantization: BF16 · Context length: 32k · Published: Jan 18, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

akshayballal/Qwen2.5-3B-Instruct-Pubmed-16bit-GRPO is a 3.1 billion parameter instruction-tuned causal language model developed by akshayballal. It was finetuned from unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit using Unsloth together with Hugging Face's TRL library for faster, more memory-efficient training. With a 32768-token context length, it is suited to instruction-following tasks, particularly in the biomedical domain, as its 'Pubmed' designation suggests.


Model Overview

akshayballal/Qwen2.5-3B-Instruct-Pubmed-16bit-GRPO is a finetuned version of unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit, itself derived from Qwen2.5-3B-Instruct. The model's name suggests that the finetune targeted PubMed-style biomedical text, that training used GRPO (Group Relative Policy Optimization, available in Hugging Face's TRL library), and that the merged weights are stored in 16-bit precision (BF16). Training was run with the Unsloth library for efficiency, and the model retains the base model's 32768-token context length, making it suitable for processing longer inputs.
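The card itself includes no usage code, so the following is only a minimal loading sketch using the standard transformers APIs; the `build_chatml_prompt` helper is my own illustration of the ChatML-style format the Qwen2.5 family uses, and `tokenizer.apply_chat_template` remains the authoritative formatter:

```python
# Minimal loading sketch (assumption: this checkpoint loads with the
# standard transformers AutoModel APIs; the model card shows no usage code).

MODEL_ID = "akshayballal/Qwen2.5-3B-Instruct-Pubmed-16bit-GRPO"


def build_chatml_prompt(messages):
    """Render messages in the ChatML style used by the Qwen2.5 family.

    Shown only to make the prompt format concrete; in practice
    tokenizer.apply_chat_template is the authoritative formatter.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def main():
    # Deferred import so the helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")
    messages = [
        {"role": "system", "content": "You are a helpful biomedical assistant."},
        {"role": "user", "content": "Summarize the role of p53 in one sentence."},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    out = model.generate(inputs, max_new_tokens=128)
    print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))

# Call main() on a machine with enough memory to hold the 3.1B model.
```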

Key Characteristics

  • Base Model: Finetuned from Qwen2.5-3B-Instruct, indicating strong general instruction-following capabilities.
  • Training Efficiency: Finetuned with Unsloth, which reports up to 2x faster training, suggesting an optimized and memory-efficient finetuning process.
  • Context Length: Supports a 32768 token context window, beneficial for tasks requiring extensive contextual understanding.
  • Domain Specialization: The 'Pubmed' in its name implies a potential specialization or finetuning on biomedical literature, making it particularly relevant for tasks within that domain.
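Given the 'GRPO' suffix, the finetune was presumably run with Group Relative Policy Optimization, which TRL exposes as `GRPOTrainer`. The card gives no training details, so the sketch below is purely hypothetical: the reward function and the one-row toy dataset are invented for illustration, not taken from the author's recipe.

```python
# Hypothetical GRPO finetuning sketch -- NOT the author's actual recipe.
# The reward function and toy dataset below are invented for illustration.

def concise_answer_reward(completions, **kwargs):
    """Toy reward: +1 for answers at or under 200 characters, -1 otherwise.

    TRL's GRPOTrainer calls each reward function with the sampled
    completions and expects one float score per completion.
    """
    return [1.0 if len(c) <= 200 else -1.0 for c in completions]


def main():
    # Deferred imports: datasets/trl are only needed for an actual run.
    from datasets import Dataset
    from trl import GRPOConfig, GRPOTrainer

    train_dataset = Dataset.from_dict(
        {"prompt": ["What is the function of the BRCA1 gene?"]}
    )
    trainer = GRPOTrainer(
        model="unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit",
        reward_funcs=concise_answer_reward,
        args=GRPOConfig(output_dir="qwen2.5-3b-pubmed-grpo"),
        train_dataset=train_dataset,
    )
    trainer.train()

# Call main() to launch training; a real run would use a PubMed-derived
# prompt dataset and domain-specific reward functions.
```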

Potential Use Cases

  • Biomedical Text Analysis: Ideal for tasks such as information extraction, summarization, or question answering on scientific papers and medical records, given its 'Pubmed' designation.
  • Instruction Following: Capable of executing a wide range of instructions due to its instruction-tuned nature.
  • Long Context Tasks: Suitable for applications requiring the processing and understanding of lengthy documents or conversations.
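Even with a 32768-token window, documents can exceed the context limit, so long-context pipelines often window the input. A simple sketch (the helper name and overlap value are my own, not from the card):

```python
def chunk_token_ids(ids, max_len=32768, overlap=256):
    """Split a token-id sequence into windows that fit the model's 32k
    context, with a small overlap so each window keeps some continuity
    with the previous one (useful for chunked summarization)."""
    if max_len <= overlap:
        raise ValueError("max_len must exceed overlap")
    chunks = []
    step = max_len - overlap
    for start in range(0, len(ids), step):
        chunks.append(ids[start:start + max_len])
        if start + max_len >= len(ids):
            break
    return chunks
```

Each window can then be summarized independently and the partial summaries merged in a final pass.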