oof-baroomf/csrsef-thinking-20260325T021216Z-it01-pubmedqa

Text Generation · Model Size: 4B · Quantization: BF16 · Context Length: 32k · Published: Mar 25, 2026 · Architecture: Transformer

The oof-baroomf/csrsef-thinking-20260325T021216Z-it01-pubmedqa model is a 4-billion-parameter language model merged with the NuSLERP method on a Qwen/Qwen3-4B-Instruct-2507 base. It integrates components from Qwen/Qwen3-4B-Thinking-2507 and a specialized PubMedQA instruction-tuned model, targeting tasks that require reasoning and knowledge extraction, particularly in the biomedical domain, and supports a 32,768-token context length.

Model Overview

The oof-baroomf/csrsef-thinking-20260325T021216Z-it01-pubmedqa model is a 4-billion-parameter language model created by merging pre-trained models with the NuSLERP method, using Qwen/Qwen3-4B-Instruct-2507 as its base architecture.
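
Below is a minimal usage sketch with the Hugging Face transformers library. The example question and generation settings are illustrative assumptions, not values published with this model; the bfloat16 dtype follows the BF16 quantization listed above.

```python
# Minimal inference sketch (transformers). The prompt and generation
# settings are illustrative assumptions, not published defaults.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "oof-baroomf/csrsef-thinking-20260325T021216Z-it01-pubmedqa"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="bfloat16",  # card lists BF16 weights
    device_map="auto",
)

messages = [
    {
        "role": "user",
        "content": "Does aspirin use reduce the risk of colorectal cancer? "
                   "Answer yes, no, or maybe, and explain briefly.",
    }
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated continuation, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```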

Merge Details

This model is a composite of two primary components (a hypothetical merge recipe is sketched after the list):

  • Qwen/Qwen3-4B-Thinking-2507: Contributes to the model's general reasoning and language understanding capabilities.
  • /workspace/csrsef/runs/20260325T021216Z/iteration_01/pubmedqa/instruct_merged: A specialized component likely fine-tuned or merged for tasks related to the PubMedQA dataset, indicating a focus on biomedical question answering and knowledge extraction.
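
The exact merge configuration is not published with this card. Purely as an illustration, a NuSLERP merge of this shape could be reproduced with mergekit's `mergekit-yaml` CLI, assuming mergekit's `nuslerp` merge method; the weights and dtype below are assumptions, not the recipe actually used.

```python
# Hypothetical mergekit recipe for a NuSLERP merge of this shape.
# All parameter values are assumptions; the real recipe is unpublished.
import subprocess
import textwrap

config = textwrap.dedent("""\
    merge_method: nuslerp
    base_model: Qwen/Qwen3-4B-Instruct-2507
    models:
      - model: Qwen/Qwen3-4B-Thinking-2507
        parameters:
          weight: 0.5   # assumed; actual value unknown
      - model: /workspace/csrsef/runs/20260325T021216Z/iteration_01/pubmedqa/instruct_merged
        parameters:
          weight: 0.5   # assumed; actual value unknown
    dtype: bfloat16
    """)

with open("nuslerp.yaml", "w") as f:
    f.write(config)

# mergekit-yaml reads the recipe and writes the merged model to ./merged.
subprocess.run(["mergekit-yaml", "nuslerp.yaml", "./merged"], check=True)
```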

Key Characteristics

  • Parameter Count: 4 billion parameters.
  • Context Length: Supports a substantial context window of 32,768 tokens, enabling processing of longer texts.
  • Merge Method: Employs the NuSLERP merge method, a SLERP-style (spherical linear interpolation) approach to blending the weights of its constituent models (see the sketch after this list).
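
Conceptually, SLERP-family merges interpolate between two weight tensors along the sphere rather than averaging them linearly. The following is a minimal conceptual sketch, assuming NuSLERP normalizes tensors before interpolating; it is not mergekit's exact implementation.

```python
import torch

def nuslerp_sketch(t: float, a: torch.Tensor, b: torch.Tensor,
                   eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation of two weight tensors after
    normalizing them to unit length. A conceptual sketch only."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    # Angle between the two parameter vectors on the unit hypersphere.
    omega = torch.acos(torch.clamp(a_unit.dot(b_unit), -1.0 + eps, 1.0 - eps))
    sin_omega = torch.sin(omega)
    # Spherically interpolate the direction, then linearly blend magnitudes.
    direction = (torch.sin((1 - t) * omega) * a_unit
                 + torch.sin(t * omega) * b_unit) / sin_omega
    magnitude = (1 - t) * a_flat.norm() + t * b_flat.norm()
    return (direction * magnitude).reshape(a.shape).to(a.dtype)

# Example: blend two random "weight" tensors halfway.
merged = nuslerp_sketch(0.5, torch.randn(4, 4), torch.randn(4, 4))
```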

Intended Use Cases

Given its merged components, this model is particularly suited for:

  • Biomedical Question Answering: Leveraging the PubMedQA-focused component (a prompt sketch follows this list).
  • Reasoning Tasks: Benefiting from the 'Thinking' variant of the Qwen3-4B model.
  • Long-Context Understanding: Its 32K context length makes it suitable for processing extensive documents or conversations.
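
As a concrete illustration of the biomedical QA use case, here is a hypothetical PubMedQA-style prompt (a context abstract plus a yes/no/maybe question) run through the transformers pipeline API. The context placeholder and generation settings are invented for illustration, and passing chat messages to the pipeline requires a recent transformers release.

```python
# Hypothetical PubMedQA-style prompt. Substitute a real abstract for the
# placeholder; all settings here are illustrative assumptions.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="oof-baroomf/csrsef-thinking-20260325T021216Z-it01-pubmedqa",
    torch_dtype="bfloat16",
    device_map="auto",
)

prompt = (
    "Context: <paste the PubMed abstract here>\n\n"
    "Question: Does the intervention described improve patient outcomes?\n\n"
    "Answer yes, no, or maybe, then briefly justify using the context."
)

result = pipe([{"role": "user", "content": prompt}], max_new_tokens=512)
# The pipeline returns the full chat; the last message is the model's answer.
print(result[0]["generated_text"][-1]["content"])
```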