genshiai-daichi/med-lfm2.5-1.2b-autocomplete
genshiai-daichi/med-lfm2.5-1.2b-autocomplete is a 1.2-billion-parameter Japanese medical autocomplete language model built on the LiquidAI/LFM2.5-1.2B-Base model. It was adapted for medical text completion and question answering through a multi-phase supervised fine-tuning pipeline. The model is trained to generate continuations of medical sentences and to give concise answers, which makes it suitable for Japanese medical text auto-completion. It has a context length of 32768 tokens.
Overview
med-lfm2.5-1.2b-autocomplete is developed by genshiai-daichi. Starting from LiquidAI/LFM2.5-1.2B-Base, it was fine-tuned in successive phases, each of which moves the model further toward medical text completion.
Key Capabilities
- Medical Autocompletion: Optimized to complete Japanese medical sentences; the model learns to generate a continuation given a prefix.
- Medical QA Adaptation: Includes a QA-Completion phase in which the model learns from medical Q&A datasets, with training focused on generating the answer text so that it does not turn into a full Q&A bot.
- Instruction Following (Minimal): A final QA-SFT phase adds minimal instruction-following capability while preserving the primary autocomplete function.
- Robust Training: Uses bf16 precision for stable training and applies preprocessing steps such as NFKC normalization and removal of OCR-derived whitespace.
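The preprocessing mentioned in the last bullet can be illustrated with a minimal sketch. The model card does not describe how OCR-derived whitespace was actually removed, so the rule below (dropping spaces and tabs that sit between two Japanese characters) is an assumption, and the function name `normalize_medical_text` is hypothetical. Only the NFKC step is taken directly from the description above.

```python
import re
import unicodedata

# Character ranges treated as "Japanese" for the whitespace heuristic:
# CJK punctuation, hiragana, katakana, and common CJK ideographs.
_JA = r"\u3001-\u303F\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF"

# Assumption: OCR-derived whitespace is a run of spaces or tabs that sits
# between two Japanese characters. The real pipeline may use a different rule.
_OCR_GAP = re.compile(rf"(?<=[{_JA}])[ \t]+(?=[{_JA}])")


def normalize_medical_text(text: str) -> str:
    """Apply NFKC normalization, then drop whitespace between Japanese characters."""
    # NFKC folds full-width Latin letters and digits, half-width katakana,
    # and the ideographic space into their standard forms.
    text = unicodedata.normalize("NFKC", text)
    # Remove gaps inside Japanese text; spaces next to Latin text are kept.
    text = _OCR_GAP.sub("", text)
    return text.strip()
```

Under this heuristic, spaces between Latin tokens such as `HbA1c 7.0` are preserved, while gaps inside Japanese words are closed.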
Good For
This model is intended for use cases that need Japanese medical text auto-completion. Its phased training is designed to produce relevant, contextually appropriate continuations for medical phrases and sentences, informed by the medical Q&A data seen during fine-tuning. The checkpoints from the individual training phases (cpt, comp, qa-comp, qa) are available as revision tags and can be selected when loading the model with transformers.
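A minimal sketch of selecting a training phase by revision tag and completing a prefix might look like the following. It assumes the tag strings are exactly the phase names listed above and that the installed transformers version supports the LFM2.5 architecture. The helper names `load_phase` and `complete` are hypothetical, and greedy decoding is an arbitrary choice rather than a setting recommended by the model author.

```python
MODEL_ID = "genshiai-daichi/med-lfm2.5-1.2b-autocomplete"

# Assumption: the revision tags match the phase names given in the model card.
PHASES = ("cpt", "comp", "qa-comp", "qa")


def load_phase(phase: str):
    """Load the tokenizer and model for one training phase (hypothetical helper)."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase {phase!r}; expected one of {PHASES}")
    # Imported lazily so that validation works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=phase)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, revision=phase)
    return tokenizer, model


def complete(prefix: str, tokenizer, model, max_new_tokens: int = 32) -> str:
    """Return only the generated continuation of a prefix (greedy decoding)."""
    inputs = tokenizer(prefix, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    # Slice off the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For example, `tokenizer, model = load_phase("comp")` followed by `complete("患者は発熱と", tokenizer, model)` would request a continuation from the completion-phase checkpoint. Downloading the weights requires network access.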