AstroMLab/AstroSage-8B

Status: Warm · Visibility: Public · Parameters: 8B · Quantization: FP8 · Context length: 32768 · Source: Hugging Face
Overview

AstroSage-8B: A Specialized AI for Astronomy Research

AstroSage-8B, developed by AstroMLab, is an 8-billion-parameter, domain-specialized natural-language AI assistant built on the Meta-Llama-3.1-8B architecture. It is trained specifically for applications in astronomy, astrophysics, and cosmology, and demonstrates the effectiveness of focused domain specialization.
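
As a quick orientation, here is a minimal usage sketch with the Hugging Face transformers library. It assumes the model is available under the AstroMLab/AstroSage-8B repository id, that its tokenizer ships a Llama-3.1-style chat template, and that a suitable GPU is available; the question and generation settings are purely illustrative.

```python
# Minimal sketch: load AstroSage-8B and ask an astronomy question.
# Assumes the "AstroMLab/AstroSage-8B" repo id, a Llama-3.1-style chat template,
# and a GPU with enough memory; settings here are illustrative, not official.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AstroMLab/AstroSage-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bfloat16 weights fit on the target GPU
    device_map="auto",
)

messages = [
    {"role": "user", "content": "What quenches star formation in massive elliptical galaxies?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```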

Key Capabilities & Features

  • Domain Expertise: Tailored for astronomical research, drawing on arXiv papers (2007-2024), Wikipedia articles, and textbooks.
  • Performance: Scores 80.9% on specialized astronomy benchmarks, outperforming all other 8B-parameter models and performing comparably to GPT-4o.
  • Training Methodology: Continued Pre-Training (CPT) on 3.3 billion tokens of astronomical literature, followed by Supervised Fine-Tuning (SFT) on 8.8 million curated QA pairs and a final model-merging step; an illustrative sketch of the CPT stage follows this list.
  • Cost-Effectiveness: Substantially cheaper to run than larger proprietary models within its specialized domain.
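
The two training stages map onto standard Hugging Face patterns. The sketch below shows what the continued pre-training step could look like with the plain transformers Trainer; the corpus file name, sequence length, and hyperparameters are placeholders and do not reflect AstroMLab's actual pipeline.

```python
# Illustrative sketch of continued pre-training (CPT) on astronomy text using the
# standard transformers Trainer. The dataset path, sequence length, and hyperparameters
# are placeholders, not AstroMLab's actual training configuration.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "meta-llama/Meta-Llama-3.1-8B"  # base architecture named in the card
corpus = load_dataset("text", data_files="astro_corpus.txt")["train"]  # hypothetical corpus file

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

def tokenize(batch):
    # Chunk the astronomy literature into fixed-length blocks for causal LM training.
    return tokenizer(batch["text"], truncation=True, max_length=4096)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

model = AutoModelForCausalLM.from_pretrained(base_model)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="astrosage-cpt",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=32,
        learning_rate=1e-5,
        num_train_epochs=1,
        bf16=True,
    ),
    train_dataset=tokenized,
    # mlm=False selects the next-token (causal) language-modeling objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```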

Intended Use Cases

  • Curiosity-driven question answering in astronomy.
  • Astronomical research assistance and brainstorming.
  • Educational support for complex concepts.
  • Literature review and summarization of scientific papers (see the sketch after this list).
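
For the summarization use case, a compact sketch with the transformers pipeline API is shown below. It assumes a recent transformers release in which the text-generation pipeline accepts chat-style message lists; the prompt wording and generation settings are illustrative.

```python
# Illustrative sketch: summarize a paper abstract with the text-generation pipeline.
# Assumes a recent transformers release whose pipeline accepts chat-style messages.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="AstroMLab/AstroSage-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

abstract = "..."  # placeholder: paste the abstract to be summarized

messages = [
    {
        "role": "user",
        "content": (
            "Summarize the following astronomy abstract in three bullet points "
            "for a graduate student:\n\n" + abstract
        ),
    }
]

result = pipe(messages, max_new_tokens=300, do_sample=False)
# With chat-style input, generated_text holds the full conversation;
# the last entry is the newly generated assistant message.
print(result[0]["generated_text"][-1]["content"])
```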

Limitations

  • Training data cutoff is January 2024.
  • May produce hallucinations, as is common with LLMs.
  • Performance primarily validated on multiple-choice questions.
  • Trained primarily for English-language use.