guangshuo/CellReasoner-7B

TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:May 18, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

CellReasoner-7B by guangshuo is a 7.6 billion parameter large language model, built on Qwen2.5-7B-Instruct, specifically enhanced for biological reasoning. It excels at zero-shot and few-shot cell type annotation for single-cell RNA-seq (scRNA-seq) and scATAC-seq data, demonstrating superior performance in interpretability and generalization. This model is optimized for marker-by-marker annotation, ontology mapping, and biological reasoning, requiring only a few expert-level reasoning samples for activation.

Loading preview...

CellReasoner-7B: Reasoning-Enhanced Cell Type Annotation

CellReasoner-7B is a specialized 7.6 billion parameter large language model developed by guangshuo, fine-tuned from Qwen2.5-7B-Instruct, designed for advanced cell type annotation. Its core innovation lies in its ability to activate expert-level biological reasoning with minimal supervision, requiring only a few expert-level reasoning samples.

Key Capabilities

  • Expert-Level Interpretability: Provides clear, reasoning-based explanations for cell type assignments.
  • Zero-/Few-Shot Generalization: Achieves high accuracy on unseen datasets with limited or no prior examples.
  • Superior Performance: Outperforms general-purpose LLMs like Deepseek and ChatGPT, as well as traditional methods like singleR, on scRNA-seq (e.g., PBMC3K, PDAC datasets) and scATAC-seq data.
  • Versatile Annotation: Supports marker-by-marker annotation, ontology mapping, and complex biological reasoning tasks.
  • Scalable & Efficient: Delivers accurate and interpretable cell annotation with minimal data requirements.

Good For

  • Biomedical Researchers: Annotating cell types in single-cell sequencing data (scRNA-seq, scATAC-seq).
  • Computational Biologists: Developing and evaluating reasoning-enhanced models for biological applications.
  • Drug Discovery: Identifying specific cell populations relevant to disease mechanisms or therapeutic targets.

CellReasoner-7B is part of a model zoo that also includes CellReasoner-32B, built on QwQ-32B, offering a range of capabilities for diverse research needs. The model leverages the LLaMA-Factory framework for efficient fine-tuning.