oof-baroomf/csrsef-thinking-20260325T021216Z-it01-pubmedqa
The oof-baroomf/csrsef-thinking-20260325T021216Z-it01-pubmedqa model is a 4-billion-parameter language model produced with the NuSLERP merge method, using Qwen/Qwen3-4B-Instruct-2507 as its base. It integrates components from Qwen/Qwen3-4B-Thinking-2507 and a specialized PubMedQA instruction-tuned model. The model is designed for tasks requiring reasoning and knowledge extraction, particularly in the biomedical domain, and supports a 32,768-token context length.
Model Overview
The oof-baroomf/csrsef-thinking-20260325T021216Z-it01-pubmedqa model is a 4-billion-parameter language model created by merging pre-trained models with the NuSLERP method. It uses Qwen/Qwen3-4B-Instruct-2507 as its base architecture.
Merge Details
This model is a composite of two primary components:
- Qwen/Qwen3-4B-Thinking-2507: Contributes to the model's general reasoning and language understanding capabilities.
- /workspace/csrsef/runs/20260325T021216Z/iteration_01/pubmedqa/instruct_merged: A specialized component, likely fine-tuned or merged for the PubMedQA dataset, indicating a focus on biomedical question answering and knowledge extraction.
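A merge like this is typically defined with a mergekit configuration file. The exact configuration is not published with the model; the sketch below is a hypothetical reconstruction from the merge details above, with the interpolation weights (0.5/0.5 here) assumed rather than known:

```yaml
merge_method: nuslerp
base_model: Qwen/Qwen3-4B-Instruct-2507
models:
  - model: Qwen/Qwen3-4B-Thinking-2507
    parameters:
      weight: 0.5
  - model: /workspace/csrsef/runs/20260325T021216Z/iteration_01/pubmedqa/instruct_merged
    parameters:
      weight: 0.5
dtype: bfloat16
```

With mergekit installed, such a config would be run via `mergekit-yaml config.yml ./output-model`.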
Key Characteristics
- Parameter Count: 4 billion parameters.
- Context Length: Supports a substantial context window of 32,768 tokens, enabling processing of longer texts.
- Merge Method: Employs the NuSLERP merge method, which spherically interpolates the parameters of its constituent models rather than averaging them linearly.
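To make the interpolation concrete, here is a minimal NumPy sketch of the spherical linear interpolation (SLERP) that underlies NuSLERP-style merging. This is an illustration of the core operation on toy vectors, not mergekit's actual implementation, which adds per-tensor weighting and other refinements:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flattened weight vectors.

    Interpolates along the arc between the parameter vectors rather than
    the straight line, which better preserves their norm structure.
    """
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    # Angle between the vectors, from the dot product of their unit forms.
    n0 = v0 / np.linalg.norm(v0)
    n1 = v1 / np.linalg.norm(v1)
    omega = np.arccos(np.clip(np.dot(n0, n1), -1.0, 1.0))
    if omega < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        return (1.0 - t) * v0 + t * v1
    so = np.sin(omega)
    return np.sin((1.0 - t) * omega) / so * v0 + np.sin(t * omega) / so * v1

# Toy example: blend two orthogonal 4-dim "weight" vectors halfway.
a = np.array([1.0, 0.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0, 0.0])
merged = slerp(0.5, a, b)  # ≈ [0.7071, 0.7071, 0.0, 0.0]
```

In an actual merge, this interpolation is applied tensor by tensor across the two models' checkpoints.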
Intended Use Cases
Given its merged components, this model is particularly suited for:
- Biomedical Question Answering: Leveraging the PubMedQA-focused component.
- Reasoning Tasks: Benefiting from the 'Thinking' variant of the Qwen3-4B model.
- Long-Context Understanding: The 32,768-token context window makes it suitable for processing extensive documents or conversations.