MOOSE-Star-R1D-7B: A Multi-Task Model for Scientific Discovery
MOOSE-Star-R1D-7B (MS-7B) is a 7.6-billion-parameter language model developed by ZonglinY and fine-tuned for two tasks in scientific discovery: inspiration retrieval (IR) and hypothesis composition (HC). It is based on DeepSeek-R1-Distill-Qwen-7B and handles both tasks in a single model rather than in separate single-task models.
Key Capabilities
- Inspiration Retrieval (IR): Selects the most relevant cross-paper inspiration from 15 candidates, achieving 54.34% accuracy, matching the performance of dedicated single-task IR models.
- Hypothesis Composition (HC): Generates structured delta hypotheses from new inspiration papers, outperforming all single-task HC variants, including those with bounded composition augmentation.
- Robustness: Improves hypothesis composition performance across varying levels of inspiration noise, which suggests that reasoning skills learned for IR transfer to HC.
- Unified Workflow: Streamlines scientific discovery by handling both inspiration identification and hypothesis generation within one model.
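To put the 54.34% IR figure in context: if exactly one of the 15 candidates is correct (an assumption, since the evaluation protocol is not spelled out here), uniform random guessing would score 1/15, about 6.67%, so the reported accuracy is roughly eight times chance. The sketch below shows one way such a selection accuracy could be scored. The function names and the convention of using `None` for an unparseable prediction are illustrative, not part of the model's tooling.

```python
from typing import Optional, Sequence

NUM_CANDIDATES = 15  # candidate count stated in the model card


def chance_accuracy(num_candidates: int = NUM_CANDIDATES) -> float:
    """Expected accuracy of uniform random guessing.

    Assumes exactly one correct candidate per question (an assumption).
    """
    return 1.0 / num_candidates


def selection_accuracy(
    predicted: Sequence[Optional[int]], gold: Sequence[int]
) -> float:
    """Fraction of questions where the predicted index equals the gold index.

    A prediction of None (for example, an unparseable model output)
    counts as wrong.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("need at least one question to score")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


if __name__ == "__main__":
    print(f"chance level: {chance_accuracy():.2%}")
    print(f"reported accuracy vs. chance: {0.5434 / chance_accuracy():.1f}x")
```

Comparing against the chance level rather than against 0% makes it easier to judge how much signal a 15-way selection score carries.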
Good for
- Scientific Research: Automating and assisting in the early stages of scientific inquiry.
- Literature Review: Identifying relevant research papers and synthesizing new ideas from existing knowledge.
- Hypothesis Generation: Formulating structured hypotheses from a provided research question, background, and inspiration papers.
- Academic Applications: Tools requiring advanced reasoning over scientific texts and structured output generation.
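The two tasks can be chained: first pick an inspiration from the candidate list, then compose a hypothesis from it. The following is a minimal sketch of that flow, not the model's documented interface. The prompt wording, the `Answer: <number>` reply format, and all function names are hypothetical. `generate` stands in for whatever callable sends a prompt to the model and returns its text output, so the orchestration logic can be read and tested without loading any weights.

```python
import re
from typing import Callable, Dict, Sequence

# Any function mapping a prompt string to the model's text output.
Generate = Callable[[str], str]


def build_ir_prompt(question: str, background: str, candidates: Sequence[str]) -> str:
    """Hypothetical IR prompt: numbered candidates, one index requested."""
    listing = "\n".join(f"[{i}] {c}" for i, c in enumerate(candidates, start=1))
    return (
        f"Research question: {question}\n"
        f"Background: {background}\n"
        f"Candidate inspirations:\n{listing}\n"
        "Reply with 'Answer: <number>' for the most relevant candidate."
    )


def build_hc_prompt(question: str, background: str, inspiration: str) -> str:
    """Hypothetical HC prompt built from the selected inspiration."""
    return (
        f"Research question: {question}\n"
        f"Background: {background}\n"
        f"Selected inspiration: {inspiration}\n"
        "Compose a structured hypothesis that builds on this inspiration."
    )


def parse_choice(text: str, num_candidates: int) -> int:
    """Return the last 'Answer: <number>' in the output as a 1-based index."""
    matches = re.findall(r"Answer:\s*(\d+)", text)
    if not matches:
        raise ValueError("no 'Answer: <number>' found in model output")
    choice = int(matches[-1])
    if not 1 <= choice <= num_candidates:
        raise ValueError(f"choice {choice} is outside 1..{num_candidates}")
    return choice


def run_discovery_step(
    generate: Generate, question: str, background: str, candidates: Sequence[str]
) -> Dict[str, str]:
    """Run IR, then HC, with the same generate callable for both stages."""
    ir_output = generate(build_ir_prompt(question, background, candidates))
    choice = parse_choice(ir_output, len(candidates))
    inspiration = candidates[choice - 1]
    hypothesis = generate(build_hc_prompt(question, background, inspiration))
    return {"inspiration": inspiration, "hypothesis": hypothesis}
```

Taking the last match in `parse_choice` is a deliberate choice for reasoning-style models, which may mention several candidate numbers before settling on a final answer.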