DUTIR-BioNLP/Taiyi-LLM
Taiyi-LLM is a 7-billion-parameter bilingual (Chinese and English) large language model developed by DUTIR, fine-tuned from Qwen-7b-base. It specializes in biomedical natural language processing tasks, including question-answering, dialogue, report generation, and information extraction. The model draws on a collection of 38 Chinese and 131 English biomedical datasets and targets multi-task BioNLP applications.
Taiyi-LLM: Bilingual Biomedical Language Model
Taiyi-LLM, developed by DUTIR, is a 7-billion-parameter large language model fine-tuned from Qwen-7b-base for biomedical natural language processing (BioNLP) tasks in both Chinese and English. The project addresses the scarcity of open-source bilingual biomedical models and is intended for healthcare professionals and researchers.
Key Capabilities
- Extensive Biomedical Training: Trained on a collection of 38 Chinese and 131 English BioNLP datasets spanning multiple task types.
- Bilingual Multi-Task Proficiency: Handles biomedical question-answering, doctor-patient dialogue, report generation, information extraction, machine translation, and text classification in both languages.
- Instruction-Tuned: Fine-tuned on over 1 million bilingual Chinese-English instruction samples to enhance its multi-task performance.
- Open-Source Resources: Releases the dataset curation details, model weights, and inference deployment scripts as open source.
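Because the model is instruction-tuned across several task families, one way to work with it is to phrase each request as a task-specific instruction. The sketch below is illustrative only: the `TASK_TEMPLATES` wording, the `build_prompt` helper, and the `generate` function are hypothetical and are not taken from the project's released scripts. `generate` assumes the weights load through the Hugging Face `transformers` library under the repository name shown at the top of this page, and that `trust_remote_code=True` is needed for a Qwen-based checkpoint. The project's own inference scripts define the actual prompt format and should take precedence.

```python
# Hypothetical instruction templates, one per task family named in the list above.
# The wording is illustrative and is NOT the prompt set used to train Taiyi-LLM.
TASK_TEMPLATES = {
    "question_answering": "Answer the following biomedical question.\nQuestion: {text}",
    "information_extraction": "Extract all disease and chemical entities from the text below.\nText: {text}",
    "report_generation": "Write a report based on these findings.\nFindings: {text}",
    "translation": "Translate the following biomedical text into {target_language}.\nText: {text}",
    "text_classification": "Classify the following text into one of: {labels}.\nText: {text}",
}


def build_prompt(task, text, **fields):
    """Fill the template for `task` with the input text and any extra fields."""
    try:
        template = TASK_TEMPLATES[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None
    return template.format(text=text, **fields)


def generate(prompt, model_id="DUTIR-BioNLP/Taiyi-LLM", max_new_tokens=256):
    """Assumed loading path via Hugging Face transformers (downloads the weights)."""
    # Imported here so that build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # Building prompts needs no model download; call generate(prompt) to run inference.
    print(build_prompt("translation", "高血压", target_language="English"))
```

Keeping prompt construction separate from model loading lets the templates be checked and adjusted without downloading the 7B checkpoint.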
Good For
- Biomedical Research: Assisting in knowledge discovery and information extraction from biomedical texts.
- Clinical Applications: Supporting diagnosis, report generation, and personalized healthcare solutions.
- Multilingual BioNLP: Handling complex biomedical language tasks in both Chinese and English environments.
- Developers: Building applications requiring specialized biomedical language understanding and generation.