lightblue/openorca_stx: Japanese Closed Question Answering Specialist
The lightblue/openorca_stx model is a 13-billion-parameter QLoRA fine-tune of the Open-Orca/OpenOrcaxOpenChat-Preview2-13B base model, developed by Lightblue. It is optimized for Closed Question Answering in Japanese: answering a question using only a reference text supplied alongside it.
Key Capabilities & Training:
- Japanese Language Focus: Fine-tuned exclusively on Japanese datasets, enhancing its proficiency in handling Japanese text.
- Specialized QA: Scores 0.836 on the JSQuAD-1.1-0.3 benchmark for Japanese Closed Question Answering, up from the base model's 0.692.
- Diverse Training Data: Trained on a combined dataset of 13,167 samples from SNOW (text simplification), TyDiQA (Ja) (question answering), and XLSUM (Ja) (text summarization). This mix of tasks, reflected in the "STX" of the model's name, is intended to make the model better suited to Japanese text in general.
- Efficient Fine-tuning: Reached this performance with a short QLoRA fine-tuning run (1000 steps, 1.2 epochs) and without substantial degradation on multiple-choice benchmarks such as JCommonSenseQA and MARC-Ja.
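The figures quoted above can be put in proportion with a little arithmetic. The sketch below uses only the numbers stated in this card. The implied samples-per-step value is a back-of-the-envelope estimate that assumes every one of the 13,167 samples was used for training and that the epoch count is measured over that full set; the actual batch size is not stated here.

```python
# Numbers taken directly from the card.
base_score = 0.692     # base model on JSQuAD-1.1-0.3
tuned_score = 0.836    # openorca_stx on JSQuAD-1.1-0.3
num_samples = 13_167   # combined SNOW + TyDiQA (Ja) + XLSUM (Ja) samples
steps = 1000
epochs = 1.2

# Improvement on the closed-QA benchmark.
absolute_gain = tuned_score - base_score
relative_gain = absolute_gain / base_score

# Assumption: all samples were used for training, so
# samples seen = num_samples * epochs, spread over `steps` optimizer steps.
implied_samples_per_step = num_samples * epochs / steps

print(f"absolute gain: {absolute_gain:.3f}")                        # 0.144
print(f"relative gain: {relative_gain:.1%}")                        # 20.8%
print(f"implied samples per step: {implied_samples_per_step:.1f}")  # 15.8
```

Under those assumptions, the run would correspond to roughly 16 samples per optimizer step, which gives a sense of how small the fine-tuning budget was.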
Ideal Use Cases:
- Japanese Information Extraction: Extracting precise answers from Japanese documents or articles.
- Contextual QA Systems: Building applications that require answering questions strictly based on provided Japanese text.
- Research & Development: Exploring how strong base language models can be adapted to narrow Japanese NLP tasks with efficient fine-tuning methods such as QLoRA.
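For the contextual QA use case, the input has to carry both the reference text and the question. The following is a minimal sketch of how such an input might be assembled. The function name `build_closed_qa_prompt` and the prompt layout are hypothetical and chosen for illustration only; this card does not specify the template the model was trained with, so check the original model card before relying on any particular format.

```python
def build_closed_qa_prompt(reference_text: str, question: str) -> str:
    """Combine a reference passage and a question into one closed-QA prompt.

    The layout below is a hypothetical example, not the model's
    documented prompt template.
    """
    reference_text = reference_text.strip()
    question = question.strip()
    if not reference_text or not question:
        raise ValueError("both a reference text and a question are required")
    return (
        "以下の文章を読んで、質問に答えてください。\n\n"  # "Read the passage and answer the question."
        f"文章：{reference_text}\n\n"                      # passage
        f"質問：{question}\n\n"                            # question
        "回答："                                           # answer slot for the model to complete
    )


prompt = build_closed_qa_prompt(
    "富士山は日本で最も高い山で、標高は3776メートルです。",
    "富士山の標高は何メートルですか？",
)
print(prompt)
```

Keeping the reference text ahead of the question and ending on an open answer slot is one common way to steer a model toward answering strictly from the supplied passage.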