Chocolatine-14B-Instruct-DPO-v1.3: Enhanced French and General LLM
Chocolatine-14B-Instruct-DPO-v1.3 is a 14.7 billion parameter language model developed by jpacifico, fine-tuned from Microsoft's Phi-4. It underwent DPO (Direct Preference Optimization) fine-tuning on the jpacifico/french-orca-dpo-pairs-revised dataset, which significantly improved both its French-language performance and its general capabilities.
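To experiment with the model locally, the sketch below shows one way to load it with the Hugging Face transformers library. The repository id comes from the model card; the dtype, device mapping, and memory note are illustrative assumptions rather than requirements stated by the author.

```python
# A minimal loading sketch, assuming the transformers and accelerate
# packages are installed; dtype and device settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jpacifico/Chocolatine-14B-Instruct-DPO-v1.3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 14B model in bf16 needs roughly 30 GB of accelerator memory
    device_map="auto",           # requires the accelerate package
)
```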
Key Capabilities & Performance
- Superior French Performance: Outperforms its base model, Phi-4, and previous Chocolatine versions on the MT-Bench-French benchmark, demonstrating strong multi-turn conversational abilities in French.
- OpenLLM Leaderboard Recognition: Achieves a notable average score of 42.42 on the OpenLLM Leaderboard, making it the best-performing Phi-4-based model. It is also highlighted for its energy efficiency, with a low reported carbon footprint (1.70 kg CO₂).
- Extended Context Window: Supports a context length of up to 16,000 tokens, allowing it to process longer inputs and generate more coherent, extended responses (see the token-budget sketch after this list).
- Multilingual Support: While primarily enhanced for French, the model also supports English.
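When working near the 16K limit, it helps to check a prompt's token count before generating. The snippet below is a hedged sketch that reuses the `tokenizer` and `model` from the loading example; the 512-token headroom and the config fallback are illustrative assumptions.

```python
# Verify that a long prompt fits the advertised ~16K context window.
long_document = "..."  # placeholder: your long (e.g. French) input text

inputs = tokenizer(long_document, return_tensors="pt")
n_tokens = inputs.input_ids.shape[-1]

# Fall back to 16,000 if the config field is absent (assumption).
max_ctx = getattr(model.config, "max_position_embeddings", 16_000)

# Leave headroom for the tokens we intend to generate.
headroom = 512
if n_tokens + headroom > max_ctx:
    raise ValueError(f"Prompt uses {n_tokens} tokens; only {max_ctx - headroom} are available")
```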
When to Use This Model
- French Language Applications: Ideal for chatbots, content generation, and conversational AI systems requiring high proficiency in French (a minimal chat sketch follows this list).
- Resource-Efficient Deployment: Suitable for scenarios where strong performance is needed from a 14B parameter model, especially given its reported energy efficiency on the OpenLLM Leaderboard.
- Instruction-Following Tasks: Excels in tasks requiring adherence to instructions due to its DPO fine-tuning.
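The sketch below illustrates a single-turn French exchange using the tokenizer's chat template. It assumes the `tokenizer` and `model` from the loading example, and that the repository ships a chat template (as Phi-4-derived instruct models typically do); the system prompt and sampling parameters are invented for illustration.

```python
# A minimal French chat sketch; messages and sampling settings are illustrative.
messages = [
    {"role": "system", "content": "Tu es un assistant francophone serviable et précis."},
    {"role": "user", "content": "Explique la différence entre « savoir » et « connaître »."},
]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```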
Limitations
The Chocolatine model series is primarily a demonstration of how effectively a base model can be fine-tuned. It does not include any built-in moderation mechanisms.