ADELIE-SFT-1.5B: Aligned for Information Extraction
ADELIE-SFT-1.5B is a 1.5-billion-parameter language model developed by THU-KEG, specifically designed and aligned for robust Information Extraction (IE). Fine-tuned from Qwen2.5-1.5B, the model leverages a high-quality, custom-built alignment corpus called IEInstruct to excel across a broad range of IE tasks.
Key Capabilities
- Specialized Information Extraction: Achieves state-of-the-art performance among open-source models on closed IE, open IE, and on-demand IE tasks, significantly outperforming its base model (Qwen2.5-1.5B) and Llama 2 7B on these benchmarks.
- Instruction-Tuned: Trained using instruction tuning on the IEInstruct corpus, enabling effective understanding and execution of IE-related prompts.
- Maintained General Capabilities: Despite its specialization, ADELIE-SFT-1.5B demonstrates no significant decline in general language understanding, scoring 55.0% on general benchmarks.
- Extended Context Length: Supports a context length of 32,768 tokens, allowing longer documents to be processed for information extraction.
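Because the model is instruction-tuned, IE tasks are posed as natural-language instructions. The sketch below assembles a simple closed-IE prompt; the template wording and relation-type names are illustrative assumptions, not the actual IEInstruct format.

```python
def build_closed_ie_prompt(text: str, relation_types: list[str]) -> str:
    """Assemble a closed-IE instruction prompt.

    NOTE: this template is a hypothetical illustration of
    instruction-style IE prompting, not the official IEInstruct format.
    """
    schema = ", ".join(relation_types)
    return (
        "Extract all relation triples from the text below.\n"
        f"Allowed relation types: {schema}\n"
        "Answer as (head, relation, tail) tuples, one per line.\n\n"
        f"Text: {text}"
    )


prompt = build_closed_ie_prompt(
    "Marie Curie was born in Warsaw.",
    ["born_in", "works_for"],
)
print(prompt)
```

The resulting string can be passed to the model as an ordinary text-generation prompt, with the schema constraining which relation types may appear in the output.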
Performance Highlights
Compared to Qwen2.5-1.5B, ADELIE-SFT-1.5B shows substantial improvements in IE performance:
- Closed IE: 37.7% F1 (vs. 16.5% for Qwen2.5-1.5B)
- Open IE: 44.6% F1 (vs. 14.2% for Qwen2.5-1.5B)
- On-demand IE: 58.9% F1 (vs. 20.5% for Qwen2.5-1.5B)
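The F1 scores above are the harmonic mean of precision and recall over extracted items. A minimal sketch of micro-F1 over predicted versus gold triples (the exact-match triple representation is an assumption for illustration; benchmark scorers may use softer matching):

```python
def micro_f1(predicted: set, gold: set) -> float:
    """Micro-averaged F1 over extracted triples.

    An item counts as correct only on an exact match against the
    gold set (a simplifying assumption for illustration).
    """
    if not predicted or not gold:
        return 0.0
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


pred = {("Curie", "born_in", "Warsaw"), ("Curie", "works_for", "Sorbonne")}
gold = {("Curie", "born_in", "Warsaw")}
print(round(micro_f1(pred, gold), 3))  # precision 0.5, recall 1.0 -> 0.667
```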
This model is ideal for applications requiring precise and efficient information extraction from text, offering a powerful solution within a compact 1.5B parameter footprint. For more details, refer to the ADELIE paper.