google/txgemma-2b-predict

Text Generation | Concurrency Cost: 1 | Model Size: 2.6B | Quant: BF16 | Ctx Length: 8k | Published: Mar 21, 2025 | License: health-ai-developer-foundations | Architecture: Transformer | 0.1K | Gated | Cold

TxGemma is a collection of lightweight open language models from Google, built upon Gemma 2 and fine-tuned for therapeutic development. This 2.6 billion parameter variant is designed to process and understand information about a range of therapeutic modalities and targets, and it is strongest at property prediction. It can serve as a foundation for further fine-tuning or as an interactive agent for drug discovery, and it expects text prompts formatted according to the Therapeutics Data Commons (TDC) structure.


TxGemma-2b-predict: A Specialized LLM for Therapeutic Development

TxGemma-2b-predict is a 2.6 billion parameter model from Google, part of the TxGemma family built on the Gemma 2 architecture. It is fine-tuned for therapeutic development and processes information about therapeutic modalities and targets, including small molecules, proteins, and diseases. The model is strongest at property prediction tasks and can serve as a foundation model for further fine-tuning in specialized use cases.

Key Capabilities

  • Therapeutic Task Performance: Exhibits strong performance across a wide range of therapeutic tasks, outperforming or matching best-in-class performance on 50 out of 66 benchmarks from the Therapeutics Data Commons (TDC).
  • Data Efficiency: Demonstrates competitive performance even with limited data, offering improvements over its predecessors.
  • Foundation for Fine-tuning: Can be used as a pre-trained foundation for specialized therapeutic applications.
  • Input/Output: Optimized for text prompts formatted according to the TDC structure, which consists of instructions, context, and a question. Inputs can include SMILES strings, amino acid sequences, nucleotide sequences, and natural language text. Outputs are text.
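
The instructions/context/question layout described in the Input/Output item can be sketched as a small prompt builder. This is only an illustration: the field labels ("Instructions:", "Context:", "Question:", "Answer:"), the helper name build_tdc_prompt, and the example task wording are assumptions, not the official templates. The exact per-task prompts should be taken from the prompt definitions published with the model.

```python
def build_tdc_prompt(instructions: str, context: str, question: str,
                     input_label: str, input_value: str) -> str:
    """Assemble a TDC-style text prompt.

    The layout and labels here are a hypothetical approximation of the
    instructions / context / question structure, not the official template.
    """
    lines = [
        f"Instructions: {instructions.strip()}",
        f"Context: {context.strip()}",
        f"Question: {question.strip()}",
        f"{input_label}: {input_value.strip()}",
        "Answer:",  # the model is expected to continue from here
    ]
    return "\n".join(lines)


# Example: a binary property-prediction question for one small molecule.
prompt = build_tdc_prompt(
    instructions="Answer the following question about drug properties.",
    context=("The blood-brain barrier (BBB) separates circulating blood "
             "from brain extracellular fluid."),
    question=("Given a drug SMILES string, predict whether it "
              "(A) does not cross the BBB (B) crosses the BBB"),
    input_label="Drug SMILES",
    input_value="CN1C=NC2=C1C(=O)N(C(=O)N2C)C",  # caffeine
)
print(prompt)
```

Keeping the molecule or sequence on its own labelled line makes it easy to swap in other inputs (an amino acid or nucleotide sequence, for instance) without touching the rest of the prompt.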

Good For

  • Accelerated Drug Discovery: Streamlining therapeutic development by predicting properties of therapeutics and targets for tasks like target identification, drug-target interaction prediction, and clinical trial approval prediction.
  • Research and Development: A valuable tool for researchers in the health AI domain, especially for tasks requiring deep understanding of therapeutic data.
  • Specialized Fine-tuning: Serving as a base model for fine-tuning with private or highly specific therapeutic datasets.
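
For the prediction use cases listed above, a minimal inference sketch might look like the following. It assumes the model is loaded through the Hugging Face transformers library under the ID google/txgemma-2b-predict, and that access to the gated repository has already been granted and authenticated. The function names predict and generated_suffix and the default max_new_tokens value are hypothetical choices for illustration.

```python
def generated_suffix(input_len: int, output_ids):
    """Drop the prompt tokens and keep only the newly generated ones."""
    return output_ids[input_len:]


def predict(prompt: str, model_id: str = "google/txgemma-2b-predict",
            max_new_tokens: int = 8) -> str:
    """Generate a short answer for a TDC-style prompt (sketch, see lead-in)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # generate() returns prompt + continuation; slice off the prompt tokens.
    input_len = inputs["input_ids"].shape[-1]
    new_ids = generated_suffix(input_len, output_ids[0])
    return tokenizer.decode(new_ids, skip_special_tokens=True).strip()
```

Calling predict(prompt) with a TDC-formatted prompt would return only the model's continuation. Property-prediction answers are typically short, which is why the sketch caps generation at a few tokens.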