ik-ram28/MedMistralInstruct-CPT-SFT-7B
MedMistralInstruct-CPT-SFT-7B is a 7-billion-parameter French medical language model developed by ik-ram28. It is based on Mistral-7B-Instruct-v0.1, continually pre-trained on the NACHOS corpus of French medical texts, and then supervised fine-tuned on 30K French medical question-answer pairs. The model is specialized for medical and healthcare applications in French, offering improved performance on tasks that require domain-specific knowledge.
MedMistralInstruct-CPT-SFT-7B Overview
MedMistralInstruct-CPT-SFT-7B is a 7 billion parameter causal language model specifically designed for the French medical domain. Developed by ik-ram28, it builds upon the Mistral-7B-Instruct-v0.1 architecture, undergoing a two-stage adaptation process to specialize its capabilities.
Key Capabilities & Training
- Domain Specialization: The model was continually pre-trained (CPT) for 2.8 epochs on the 7.4 GB NACHOS corpus, a large collection of French medical texts. This pre-training builds deep familiarity with French medical terminology and concepts.
- Instruction Following: Following CPT, the model underwent Supervised Fine-Tuning (SFT) using 30,000 French medical question-answer pairs. This fine-tuning, performed with DoRA (Weight-Decomposed Low-Rank Adaptation), enhances its ability to follow instructions and generate relevant responses in a medical context.
- Language: Exclusively focused on the French language, making it a specialized tool for French-speaking medical applications.
- Base Model: Builds on Mistral-7B-Instruct-v0.1 as its foundation.
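Since the model builds on Mistral-7B-Instruct, it should be usable through the standard Hugging Face transformers interface. The sketch below is illustrative, not taken from the model card: the `[INST] ... [/INST]` chat template is assumed from the Mistral-Instruct base, and the generation settings are placeholders.

```python
MODEL_ID = "ik-ram28/MedMistralInstruct-CPT-SFT-7B"


def build_prompt(question: str) -> str:
    """Wrap a French medical question in the Mistral-Instruct chat template
    (assumed to carry over from the base model)."""
    return f"<s>[INST] {question} [/INST]"


def answer(question: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate an answer to a single question."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


# Example call (requires roughly 15 GB of memory for a 7B model in fp16):
# print(answer("Quels sont les symptômes de l'hypertension artérielle ?"))
```

As with any output from this model, generated answers should be treated as unverified text, not medical advice.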
Intended Use and Considerations
This model is intended primarily for research and educational purposes in the medical and healthcare domains. Because medical information is safety-critical, any output from this model must be verified by qualified medical professionals. Users should also be aware of potential biases inherited from the training data and should not input private health information.