lightblue/openorca_stx: Japanese Closed Question Answering Specialist
The lightblue/openorca_stx model is a 13-billion-parameter QLoRA fine-tune of the Open-Orca/OpenOrcaxOpenChat-Preview2-13B base model, developed by Lightblue, with a 4096-token context length. It is specifically optimized for Closed Question Answering in Japanese, meaning it excels at answering questions based on a provided reference text.
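As a minimal sketch, closed QA with this model might look like the following via the standard Hugging Face transformers API. The Japanese prompt template here is an assumption for illustration only; consult the model card for the exact format Lightblue used during training.

```python
# Sketch of closed QA with lightblue/openorca_stx.
# NOTE: the prompt template below is an assumption, not the card's official format.

def build_prompt(context: str, question: str) -> str:
    """Assemble a closed-QA prompt: answer only from the given context."""
    return (
        "以下の文章を読んで、質問に答えてください。\n\n"  # "Read the text and answer the question."
        f"文章: {context}\n\n"
        f"質問: {question}\n\n"
        "回答:"
    )

def answer(context: str, question: str, max_new_tokens: int = 128) -> str:
    """Run the model on a context/question pair (requires transformers + a GPU)."""
    # Heavy imports kept local so the prompt helper stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("lightblue/openorca_stx")
    model = AutoModelForCausalLM.from_pretrained(
        "lightblue/openorca_stx", device_map="auto"
    )
    inputs = tokenizer(build_prompt(context, question), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens and return only the newly generated answer text.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

The prompt builder is kept separate from the model call so the "answer only from this text" framing, which is what distinguishes closed QA from open-ended generation, stays explicit and easy to adapt.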
Key Capabilities & Training:
- Japanese Language Focus: Fine-tuned exclusively on Japanese datasets, enhancing its proficiency in handling Japanese text.
- Specialized QA: Demonstrates significant improvement in Japanese Closed Question Answering, as evidenced by its score of 0.836 on the JSQuAD-1.1-0.3 benchmark, outperforming its base model's 0.692.
- Diverse Training Data: Trained on a combined dataset of 13,167 samples drawn from SNOW (text simplification), TyDiQA (Ja) (question answering), and XLSUM (Ja) (text summarization). This diverse training regimen, reflected in the "STX" of the model's name, aims to improve the model's suitability for general Japanese text.
- Efficient Fine-tuning: Achieved its specialized performance through minimal QLoRA fine-tuning (1000 steps, roughly 1.2 epochs) without substantial degradation on other multiple-choice benchmarks such as JCommonSenseQA and MARC-Ja.
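A QLoRA setup in the style the card describes (4-bit quantized base model plus small LoRA adapters) can be sketched with the transformers and peft libraries. This is an illustrative configuration, not Lightblue's actual training code: the rank, alpha, dropout, and target modules below are assumptions, and only the step/epoch budget comes from the model card.

```python
# Illustrative QLoRA configuration (hyperparameters are assumptions).
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# Load the 13B base model in 4-bit NF4, the usual QLoRA quantization scheme.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Small trainable LoRA adapters on the attention projections; the frozen
# 4-bit base weights are what keep a 13B fine-tune cheap (~1000 steps here).
lora_config = LoraConfig(
    r=16,                      # assumed rank
    lora_alpha=32,             # assumed scaling
    lora_dropout=0.05,         # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```

The point of this design is captured in the bullet above: because only the low-rank adapters are trained, a short run over ~13k samples can specialize the model for Japanese closed QA without disturbing its broader capabilities.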
Ideal Use Cases:
- Japanese Information Extraction: When you need to extract precise answers from Japanese documents or articles.
- Contextual QA Systems: Building applications that require answering questions strictly based on provided Japanese text.
- Research & Development: Exploring the potential of applying strong base language models to narrow Japanese NLP tasks with efficient fine-tuning methods.