KrithikV/MedMobile
MedMobile: A Medical Domain LLM
MedMobile is a 4 billion parameter language model developed by KrithikV, fine-tuned from the microsoft/Phi-3-mini-4k-instruct base model on the UltraMedical dataset of biomedical instruction data. This specialization makes it well suited to applications that require medical domain knowledge. The model retains the base model's 4096-token context length.
Key Capabilities
- Medical Domain Specialization: Optimized for understanding and generating text within medical contexts.
- Efficient Size: At 4 billion parameters, it offers a balance between performance and computational efficiency.
- Instruction-Tuned: Designed to follow instructions effectively for various tasks.
Good for
- Developing applications that require medical text analysis.
- Research in medical language processing.
- Use cases where a smaller, specialized model is preferred over larger, general-purpose LLMs for medical tasks.
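A minimal inference sketch using the Hugging Face `transformers` library is shown below. The chat markers in `build_prompt` are an assumption carried over from the Phi-3 base model's template; in practice, prefer `tokenizer.apply_chat_template`, which reads the template shipped with the tokenizer.

```python
# Sketch: querying MedMobile through the transformers API.
# The Phi-3-style chat markers are an assumption inherited from the
# base model, not confirmed by this card.

MODEL_ID = "KrithikV/MedMobile"

def build_prompt(question: str) -> str:
    """Wrap a user question in Phi-3-style chat markers (assumed format)."""
    return f"<|user|>\n{question}<|end|>\n<|assistant|>\n"

def ask(question: str, max_new_tokens: int = 256) -> str:
    """Generate an answer to a medical question (downloads the model)."""
    # Imported here so the prompt helper stays usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the generated answer remains.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Keeping generation behind `ask` means the prompt-formatting logic can be tested or reused without loading the 4B-parameter weights.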
During training, the model was run for 3 epochs with the Adam optimizer at a learning rate of 0.0001, reaching a final validation loss of 0.7358. More details on the training procedure can be found in the associated manuscript: https://arxiv.org/abs/2410.09019.
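For reference, the hyperparameters reported on this card can be gathered into one place. Only the fields below are documented here; anything else (batch size, scheduler, weight decay, and so on) is not stated and is deliberately omitted.

```python
# Training details as reported on this model card; no other
# hyperparameters are documented, so none are listed.
TRAINING_CONFIG = {
    "base_model": "microsoft/Phi-3-mini-4k-instruct",
    "dataset": "UltraMedical",
    "learning_rate": 1e-4,
    "num_epochs": 3,
    "optimizer": "adam",
    "final_validation_loss": 0.7358,
}
```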