ncbi/Cell-o1

TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Jun 3, 2025License:otherArchitecture:Transformer0.0K Cold

Cell-o1 is a reasoning-enhanced language model developed by ncbi, specifically designed to solve single-cell reasoning puzzles. It was trained via supervised fine-tuning on distilled expert traces, followed by reinforcement learning with batch-level rewards. This model excels at cell type annotation by considering batch-level cellular context and providing explanatory reasoning, outperforming baselines on both cell-level and batch-level metrics. Cell-o1 exhibits emergent behaviors such as self-reflection and curriculum reasoning, making it suitable for complex biological data analysis.

Loading preview...

Cell-o1: Solving Single-Cell Reasoning Puzzles

Cell-o1 is a specialized language model developed by ncbi to address the complex task of cell type annotation in single-cell RNA sequencing data. Unlike traditional methods that annotate cells independently, Cell-o1 mimics human expert behavior by considering batch-level cellular context and providing detailed reasoning for its assignments.

Key Capabilities

  • Batch-level Reasoning: Annotates distinct cell types for different cell clusters, taking into account the overall cellular context within a batch.
  • Enhanced Accuracy: Outperforms existing LLMs, including OpenAI's o1, on the challenging CellPuzzles benchmark, achieving higher accuracy on both cell-level and batch-level metrics.
  • Expert Mimicry: Trained using supervised fine-tuning on distilled expert traces and further refined with reinforcement learning, enabling it to emulate expert reasoning processes.
  • Emergent Behaviors: Demonstrates advanced capabilities such as self-reflection and curriculum reasoning, offering insights into its interpretability and generalization.
  • Structured Input Processing: Designed to process structured system and user messages containing gene expression data and candidate cell types for precise annotation.

Good for

  • Single-Cell RNA Sequencing Analysis: Ideal for researchers and developers working with single-cell data who require accurate and context-aware cell type annotation.
  • Reasoning-Based Annotation: Suitable for tasks where explanatory reasoning and consideration of batch-level context are crucial for reliable biological insights.
  • Biomedical Research: Applicable in scenarios demanding high-precision cell classification and understanding of cellular heterogeneity.