Poro 2 8B SFT: A Research-Focused Instruction-Following Model
LumiOpen's Poro 2 8B SFT is an 8-billion-parameter supervised fine-tuned (SFT) model built on the Llama 3.1 8B architecture. It is an intermediate checkpoint in the Poro 2 model family, designed for instruction following and conversational AI in both Finnish and English. This model has not undergone preference tuning via Direct Preference Optimization (DPO), making it a valuable resource for researchers studying the impact of different post-training methodologies.
Key Capabilities & Features
- Bilingual Proficiency: Supports instruction following and conversation in both English and Finnish.
- Supervised Fine-Tuning: Trained on 1.4 million instruction-following examples, including Tulu 3 prompts, multi-turn conversations, and translation samples.
- Improved Finnish Performance: Shows substantial improvements in Finnish instruction-following benchmarks (e.g., IFEval Finnish, MTBench Finnish, AlpacaEval 2 Finnish) compared to Llama 3.1 8B Instruct.
- Maintained English Performance: Retains strong performance in English instruction-following tasks.
- Llama 3.1 Base: Benefits from the robust foundation of the Llama 3.1 8B model.
- 8192 Token Context: Features a maximum sequence length of 8192 tokens.
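Since the model builds on Llama 3.1, multi-turn conversations are most likely rendered with the standard Llama 3.1 chat template. The sketch below hand-rolls that template purely for illustration; the special tokens follow the published Llama 3.1 format, and in practice you would call the tokenizer's `apply_chat_template()` rather than building the string yourself.

```python
# Illustrative sketch: how a multi-turn conversation becomes a single prompt
# string under the Llama 3.1 chat template (assumed to be inherited by
# Poro 2 8B SFT from its base model).

def render_llama31_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into a Llama 3.1 prompt."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n")
        parts.append(msg["content"] + "<|eot_id|>")
    if add_generation_prompt:
        # Leave an open assistant header so the model continues from here.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


conversation = [
    {"role": "system", "content": "Olet avulias avustaja."},    # "You are a helpful assistant."
    {"role": "user", "content": "Mikä on Suomen pääkaupunki?"}, # "What is the capital of Finland?"
]
prompt = render_llama31_prompt(conversation)
print(prompt)
```

Note that each turn is closed with `<|eot_id|>`, and the trailing open assistant header is what cues the model to generate its reply.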
Ideal Use Cases
This model is primarily intended for:
- Research: Studying the effects of supervised fine-tuning versus preference tuning, and comparing post-training techniques.
- Ablation Studies: Investigating the contribution of different training phases to instruction-following capabilities.
- Educational Applications: Learning about the development process of instruction-following models.
- Development: Serving as a starting point for further preference tuning experiments. For production use, the DPO-tuned Poro 2 8B Instruct is recommended.
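For the last use case above, preference-tuning experiments typically consume (prompt, chosen, rejected) triples. A minimal sketch of one such record follows; the field names match the convention used by common DPO trainers (e.g., TRL's `DPOTrainer`) and are illustrative assumptions, not part of the Poro 2 release itself.

```python
import json

# Hypothetical preference-tuning record in the prompt/chosen/rejected
# format common DPO trainers expect. The Finnish prompt and the two
# completions are invented examples for illustration only.
preference_pair = {
    "prompt": "Käännä englanniksi: Hyvää huomenta!",  # "Translate to English: Good morning!"
    "chosen": "Good morning!",    # preferred completion
    "rejected": "Morning good!",  # dispreferred completion
}

# Preference datasets are commonly stored as JSON Lines, one pair per line.
line = json.dumps(preference_pair, ensure_ascii=False)
print(line)
```

Running SFT checkpoints like this one through such pairs is exactly the step that separates Poro 2 8B SFT from the DPO-tuned Poro 2 8B Instruct.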