MedPHINER-Llama-3.1-Swallow-8B-Instruct-v0.5: Japanese Medical PHI Tagging Model
This model, developed by sociocom, is an 8-billion-parameter language model built on the Llama-3.1-Swallow-8B-Instruct-v0.5 base. It has been fine-tuned with LoRA specifically for identifying and tagging Protected Health Information (PHI) in Japanese medical texts.
Key Capabilities
- PHI Tagging: Accurately identifies and assigns specific tags to various types of personal health information in Japanese.
- PHI Categories: Recognizes and tags:
  - `<phi_age>`: Age
  - `<phi_id>`: Identification numbers
  - `<phi_tel>`: Telephone numbers
  - `<phi_job>`: Occupations
  - `<phi_location>`: Addresses and place names
  - `<phi_person>`: Person names
  - `<phi_hospital>`: Medical institution names
- Japanese Medical Context: Optimized for the nuances of Japanese medical language and data.
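As a rough illustration of how the categories above could be consumed downstream, here is a minimal parsing sketch. It assumes the model emits inline tags of the form `<phi_*>…</phi_*>` around detected spans; the exact output format and the example sentence are assumptions, not taken from the model card.

```python
import re

# Hypothetical sketch (the inline <phi_*>...</phi_*> output format is an
# assumption based on the category names listed above).
PHI_TAG = re.compile(r"<(phi_[a-z]+)>(.*?)</\1>")

def extract_phi(tagged_text: str) -> list[tuple[str, str]]:
    """Return (category, span) pairs found in the model's tagged output."""
    return [(m.group(1), m.group(2)) for m in PHI_TAG.finditer(tagged_text)]

tagged = "<phi_person>山田太郎</phi_person>(<phi_age>45歳</phi_age>)が<phi_hospital>大学病院</phi_hospital>を受診。"
print(extract_phi(tagged))
# [('phi_person', '山田太郎'), ('phi_age', '45歳'), ('phi_hospital', '大学病院')]
```

The same pattern can feed an audit log or a de-identification pipeline, depending on whether the spans are stored or masked.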
Training Details
The model was fine-tuned on 11,127 sentences from the NTCIR dataset. Personal-information insertion and annotation for the training data were performed with the OpenAI API (gpt-5.2-2025-12-11). Fine-tuning used LoRA with rank 16, alpha 64, and dropout 0.05, trained for 5 epochs with a batch size of 8 and a learning rate of 2e-4 using the AdamW optimizer.
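To give a feel for why LoRA makes 8B-parameter fine-tuning tractable, the arithmetic below compares a dense weight update against the rank-16 adapter described above. The 4096 hidden size is an assumption based on the Llama-3.1-8B architecture, not stated in this card.

```python
# Hypothetical back-of-the-envelope calculation (not the authors' code):
# parameter cost of a rank-16 LoRA adapter on one Llama-3.1-8B-sized
# projection matrix, versus updating the full dense weight.
d_model = 4096                    # assumed Llama-3.1-8B hidden size
rank = 16                         # LoRA rank used for this model
alpha = 64                        # LoRA alpha from the training config

full_params = d_model * d_model   # dense d x d weight update
lora_params = 2 * d_model * rank  # A (d x r) plus B (r x d)
scaling = alpha / rank            # factor applied to B @ A at merge time

print(full_params, lora_params, scaling)
# 16777216 131072 4.0
```

Per matrix, the adapter trains roughly 0.8% of the parameters a full update would, with the alpha/rank ratio of 4.0 scaling the learned update.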
Good For
- Automated PHI Redaction: Ideal for applications requiring the automatic identification and masking of sensitive patient data in Japanese medical records.
- Data Anonymization: Useful for preparing medical datasets for research or sharing while ensuring patient privacy.
- Compliance: Supports efforts to comply with data privacy regulations in healthcare by accurately pinpointing PHI.
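For the redaction and anonymization use cases above, a minimal masking sketch might look as follows. It again assumes the model's inline `<phi_*>…</phi_*>` output format; the placeholder style (`[PHI_PERSON]`, etc.) is a hypothetical choice, not part of the model.

```python
import re

# Hypothetical redaction sketch: replace each tagged PHI span in the model's
# output with a bracketed category placeholder (output format is an assumption).
PHI_TAG = re.compile(r"<(phi_[a-z]+)>(.*?)</\1>")

def redact(tagged_text: str) -> str:
    """Replace e.g. <phi_person>山田太郎</phi_person> with [PHI_PERSON]."""
    return PHI_TAG.sub(lambda m: f"[{m.group(1).upper()}]", tagged_text)

example = "<phi_person>山田太郎</phi_person>さん(<phi_age>45歳</phi_age>)は<phi_hospital>東京病院</phi_hospital>に入院した。"
print(redact(example))
# [PHI_PERSON]さん([PHI_AGE])は[PHI_HOSPITAL]に入院した。
```

Keeping the category in the placeholder preserves document readability for research use while removing the sensitive span itself.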