Technoculture/Medmerge-tulu-70b

Text Generation | Concurrency Cost: 4 | Model Size: 69B | Quant: FP8 | Ctx Length: 32k | Published: Jan 21, 2024 | License: apache-2.0 | Architecture: Transformer

Medmerge-tulu-70b by Technoculture is a 69-billion-parameter merged language model combining ClinicalCamel-70B, Meditron-70B, and Tulu-2-DPO-70B. The model is designed specifically for medical applications and performs strongly on medical benchmarks such as MMLU Anatomy and MMLU Clinical Knowledge. Its 32,768-token context window makes it suitable for processing extensive medical texts and complex clinical queries.


Medmerge-tulu-70b: A Specialized Medical LLM

Medmerge-tulu-70b is a 69-billion-parameter language model developed by Technoculture, created by merging three models: wanglab/ClinicalCamel-70B, epfl-llm/meditron-70b, and allenai/tulu-2-dpo-70b. The merge uses the dare_ties method and aims to combine the strengths of the constituent models, particularly in the medical domain.
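For readers who want to reproduce a merge of this kind, the sketch below shows what a dare_ties configuration for mergekit might look like. The density and weight values and the choice of base model are illustrative assumptions, not Technoculture's published recipe.

```python
# Hypothetical mergekit configuration for a dare_ties merge of the three
# constituent models. Densities, weights, and the base model are
# illustrative guesses, not Technoculture's published recipe.
import yaml  # pip install pyyaml

merge_config = {
    "merge_method": "dare_ties",
    "base_model": "meta-llama/Llama-2-70b-hf",  # assumed common ancestor
    "models": [
        {"model": "wanglab/ClinicalCamel-70B",
         "parameters": {"density": 0.5, "weight": 0.33}},
        {"model": "epfl-llm/meditron-70b",
         "parameters": {"density": 0.5, "weight": 0.33}},
        {"model": "allenai/tulu-2-dpo-70b",
         "parameters": {"density": 0.5, "weight": 0.34}},
    ],
    "dtype": "bfloat16",
}

with open("medmerge.yml", "w") as f:
    yaml.safe_dump(merge_config, f, sort_keys=False)

# The merge itself would then be run with mergekit's CLI, e.g.:
#   mergekit-yaml medmerge.yml ./medmerge-tulu-70b
```

In broad terms, dare_ties randomly drops a fraction of each donor model's delta weights (controlled by density) before TIES-style sign election and merging, which tends to reduce parameter interference between the donors.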

Key Capabilities & Performance

The model demonstrates competitive performance on medical benchmarks, inheriting the capabilities of its medical-focused components. While its general reasoning scores (ARC, HellaSwag, GSM8K) are slightly lower than those of the base tulu-2-dpo-70b, it shows improved or comparable results on specific medical MMLU subcategories: for instance, it scores 66.6 on MMLU Anatomy and 72.0 on MMLU Clinical Knowledge, outperforming ClinicalCamel-70B in some areas. The model supports a context length of 32,768 tokens, enabling it to handle long, detailed medical documents.
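As a concrete starting point, here is a minimal loading-and-generation sketch using Hugging Face transformers. The repository id comes from this card; the dtype and device settings are assumptions about a typical multi-GPU deployment, and the prompt is purely illustrative.

```python
# Minimal loading-and-generation sketch. The repo id comes from this card;
# dtype/device_map are assumptions for a typical multi-GPU deployment of a
# 70B-class model (FP8 serving would need additional tooling).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Technoculture/Medmerge-tulu-70b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed precision for inference
    device_map="auto",           # shard across available GPUs
)

prompt = (
    "A 54-year-old patient presents with chest pain radiating to the "
    "left arm. List the most important differential diagnoses."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```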

Good For

  • Medical Question Answering: Excels in tasks requiring knowledge from medical datasets like MMLU Anatomy, Clinical Knowledge, and Professional Medicine (see the scoring sketch after this list).
  • Clinical Text Analysis: Its large context window and specialized training make it suitable for processing and understanding extensive clinical documents.
  • Research in Medical AI: Provides a strong foundation for further fine-tuning or research in medical language understanding and generation.
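
To make the MMLU-style evaluation concrete, the sketch below scores a multiple-choice question by comparing the model's next-token logits for the answer letters, a common likelihood-based evaluation approach. The question text is illustrative and not drawn from the actual MMLU test set.

```python
# Scoring an MMLU-style multiple-choice question by comparing next-token
# logits for the answer letters A-D. The question is illustrative, not
# drawn from the actual MMLU test set.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Technoculture/Medmerge-tulu-70b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

question = (
    "Which nerve provides motor innervation to the diaphragm?\n"
    "A. Vagus nerve\n"
    "B. Phrenic nerve\n"
    "C. Intercostal nerves\n"
    "D. Accessory nerve\n"
    "Answer:"
)

inputs = tokenizer(question, return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # logits for the next token

# Rank the single-token continuations " A" ... " D" by logit.
choice_ids = [tokenizer.encode(f" {c}", add_special_tokens=False)[-1]
              for c in "ABCD"]
scores = {c: logits[i].item() for c, i in zip("ABCD", choice_ids)}
print(max(scores, key=scores.get))  # expected answer here: "B"
```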