m3rg-iitd/llamat-2

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Dec 3, 2024 · License: llama2 · Architecture: Transformer · Open weights

LLaMat-2 is a 7 billion parameter large language model developed by M3RG and DAIR at IIT Delhi as a foundational model for materials science. It was produced by continued pretraining of LLaMA-2 on materials science data, specializing it for tasks such as information extraction, table understanding, and scientific data parsing within this domain. The model excels at processing and generating content relevant to materials research, and serves as a core component for applications such as a Materials Copilot.


LLaMat-2: A Specialized LLM for Materials Science

LLaMat-2 is a 7 billion parameter large language model developed by M3RG and DAIR at IIT Delhi. It is built on the LLaMA-2 architecture and underwent continued pretraining on materials science data to become a foundational model for that domain. This specialization allows it to understand and process the complex scientific information found in materials research.

Key Capabilities

  • Domain Adaptation: Optimized for the unique language and data structures found in materials science.
  • Information Extraction: Capable of extracting specific data points from scientific texts.
  • Table Understanding: Designed to interpret and parse data presented in tabular formats.
  • Scientific Data Parsing: Facilitates the parsing of various types of scientific data for research tasks.
  • Crystal Structure Generation: Can be fine-tuned for generating crystal structures.
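To give a sense of how the table-understanding capability might be exercised, the sketch below linearizes a parsed table into markdown so it can be embedded in a text prompt for the model. The helper name and the markdown format are illustrative assumptions, not part of LLaMat-2's tooling.

```python
def table_to_markdown(header: list[str], rows: list[list[str]]) -> str:
    """Render a parsed table (header + rows) as a markdown table suitable
    for inclusion in a text prompt. Purely illustrative formatting."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",  # separator row
    ]
    for row in rows:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)
```

A table serialized this way can be prepended to a question such as "Which sample has the highest glass-transition temperature?" before passing the combined text to the model.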

Good For

  • Developing applications like a "Materials Copilot" for researchers.
  • Fine-tuning for specific materials science information retrieval and generation tasks.
  • Research involving automated analysis of materials science literature and data.
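As a minimal sketch of the information-extraction use case, the code below queries the model through Hugging Face transformers. The checkpoint id `m3rg-iitd/llamat-2`, the prompt template, and the `extract` helper are assumptions made for illustration, not an official API of the model.

```python
def build_extraction_prompt(passage: str, fields: list[str]) -> str:
    """Compose a JSON information-extraction prompt (illustrative template,
    not the official one used to train or evaluate LLaMat-2)."""
    return (
        "Extract the following fields from the materials-science passage "
        f"as JSON: {', '.join(fields)}.\n\n"
        f"Passage:\n{passage}\n\nJSON:"
    )


def extract(passage: str, fields: list[str],
            model_id: str = "m3rg-iitd/llamat-2") -> str:
    """Run the prompt through the model; loading a 7B checkpoint
    requires substantial GPU memory."""
    # Imported here so the prompt helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    prompt = build_extraction_prompt(passage, fields)
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    # Return only the newly generated tokens, not the echoed prompt.
    return tok.decode(out[0][inputs["input_ids"].shape[1]:],
                      skip_special_tokens=True)
```

The returned string is raw model output; in practice it would still need to be parsed and validated as JSON before downstream use.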

The model was developed with compute support from the Edinburgh International Data Facility (EIDF) using Cerebras CS2 clusters for pretraining, and IIT Delhi's HPC cluster for fine-tuning and inference.