KrithikV/MedMobile

Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 4k · Published: Aug 20, 2024 · License: MIT · Architecture: Transformer · Open Weights

MedMobile by KrithikV is a 4-billion-parameter instruction-tuned language model, fine-tuned from Microsoft's Phi-3-mini-4k-instruct on the UltraMedical dataset. With a 4096-token context length, it is optimized for medical-domain applications and reached a final validation loss of 0.7358 during fine-tuning. Its primary use case is specialized language understanding and generation in medical contexts.


MedMobile: A Medical Domain LLM

MedMobile is a 4-billion-parameter language model developed by KrithikV, fine-tuned from the microsoft/Phi-3-mini-4k-instruct base model. This specialization is achieved through training on the UltraMedical dataset, making it well suited to applications requiring medical-domain knowledge. The model retains the base model's context length of 4096 tokens.
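A minimal loading sketch using the Hugging Face transformers library. This assumes the model is published on the Hub under the repo id KrithikV/MedMobile (the name of this card) and that it inherits the Phi-3 chat-marker format from its base model; neither detail is confirmed by the card itself, and the example question is illustrative.

```python
def build_prompt(question: str) -> str:
    """Format a question using the Phi-3 chat markers expected by the
    microsoft/Phi-3-mini-4k-instruct base model (assumed to be
    inherited by this fine-tune)."""
    return f"<|user|>\n{question}<|end|>\n<|assistant|>\n"


if __name__ == "__main__":
    # Hypothetical usage: requires `pip install transformers torch` and
    # downloads the ~4B-parameter weights on first run.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "KrithikV/MedMobile"  # repo id taken from this model card
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, torch_dtype="bfloat16"  # card lists BF16 weights
    )

    prompt = build_prompt("What are common first-line treatments for hypertension?")
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model is only 4B parameters in BF16, it fits comfortably on a single consumer GPU, which is the trade-off the card highlights under "Efficient Size".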

Key Capabilities

  • Medical Domain Specialization: Optimized for understanding and generating text within medical contexts.
  • Efficient Size: At 4 billion parameters, it offers a balance between performance and computational efficiency.
  • Instruction-Tuned: Designed to follow instructions effectively for various tasks.

Good for

  • Developing applications that require medical text analysis.
  • Research in medical language processing.
  • Use cases where a smaller, specialized model is preferred over larger, general-purpose LLMs for medical tasks.

During training, the model achieved a final validation loss of 0.7358, using a learning rate of 0.0001 over 3 epochs with the Adam optimizer. More details on the training procedure can be found in the associated manuscript: https://arxiv.org/abs/2410.09019.
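The reported hyperparameters can be captured in a small config sketch. Only the values stated on this card (base model, dataset, learning rate, epochs, optimizer, final validation loss) are included; anything else, such as batch size or scheduler, is not documented here and is deliberately left out.

```python
# Training hyperparameters as reported on this model card. Fields not
# listed (batch size, LR schedule, warmup) are undocumented.
TRAINING_CONFIG = {
    "base_model": "microsoft/Phi-3-mini-4k-instruct",
    "dataset": "UltraMedical",
    "learning_rate": 1e-4,  # 0.0001
    "num_epochs": 3,
    "optimizer": "adam",
    "final_validation_loss": 0.7358,
}
```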