DUTIR-BioNLP/Taiyi-LLM

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 32k | Published: Oct 21, 2023 | License: apache-2.0 | Architecture: Transformer | Open Weights

Taiyi-LLM is a 7 billion parameter bilingual (Chinese and English) large language model developed by DUTIR, fine-tuned from Qwen-7b-base. It specializes in diverse biomedical natural language processing tasks, including question-answering, dialogue, report generation, and information extraction. The model was trained on a collection of 38 Chinese and 131 English biomedical datasets to support multi-task BioNLP applications.


Taiyi-LLM: Bilingual Biomedical Language Model

Taiyi-LLM, developed by DUTIR, is a 7 billion parameter large language model fine-tuned from Qwen-7b-base for diverse biomedical natural language processing (BioNLP) tasks in both Chinese and English. The project addresses the scarcity of open-source bilingual biomedical models and is aimed at healthcare professionals and researchers.

Key Capabilities

  • Extensive Biomedical Training: Trained on a collection of 38 Chinese and 131 English BioNLP datasets covering a wide range of tasks.
  • Bilingual Multi-Task Proficiency: Handles a range of BioNLP tasks, including biomedical question-answering, doctor-patient dialogue, report generation, information extraction, machine translation, and text classification.
  • Instruction-Tuned: Fine-tuned on over 1 million bilingual Chinese-English instruction samples to enhance its multi-task performance.
  • Open-Source Resources: Provides open-source details on dataset curation, model weights, and inference deployment scripts.

Good For

  • Biomedical Research: Assisting in knowledge discovery and information extraction from biomedical texts.
  • Clinical Applications: Supporting diagnosis, report generation, and personalized healthcare solutions.
  • Multilingual BioNLP: Handling complex biomedical language tasks in both Chinese and English environments.
  • Developers: Building applications requiring specialized biomedical language understanding and generation.
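
For developers, the following is a minimal sketch of how the model might be loaded and prompted with the Hugging Face transformers library. It rests on several assumptions not stated on this page: that the weights are available under the repository name DUTIR-BioNLP/Taiyi-LLM, that loading needs trust_remote_code=True (as the original Qwen-7B checkpoints do), and that the accelerate package is installed for device_map="auto". The task templates and the build_prompt and generate helpers are hypothetical and for illustration only; the project's own inference scripts may format prompts differently.

```python
MODEL_ID = "DUTIR-BioNLP/Taiyi-LLM"  # repository name taken from this page

# Hypothetical instruction templates for a few of the tasks listed above.
# These are illustrative only, not the official Taiyi prompt format.
TASK_TEMPLATES = {
    "qa": "Answer the following biomedical question.\nQuestion: {text}",
    "ner": "Extract all disease and chemical entities from the following text.\nText: {text}",
    "zh_to_en": "Translate the following Chinese biomedical text into English.\nText: {text}",
}


def build_prompt(task: str, text: str) -> str:
    """Fill the (hypothetical) template for `task` with the user's text."""
    if task not in TASK_TEMPLATES:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(TASK_TEMPLATES)}")
    return TASK_TEMPLATES[task].format(text=text.strip())


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and return only the newly generated text.

    The import is done lazily so build_prompt can be used without
    transformers installed. Loading a 7B model needs substantial memory.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the checkpoint ships custom code, as Qwen-7B does.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, trust_remote_code=True, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(build_prompt("qa", "What is metformin used for?"))
```

Calling generate(build_prompt("qa", "...")) would download the weights on first use. Keeping prompt construction separate from model loading makes it easy to swap in the project's official prompt format once it has been confirmed.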