CultriX/SeQwence-14Bv3
CultriX/SeQwence-14Bv3 is a 14.8 billion parameter language model developed by CultriX, created by merging CultriX/SeQwence-14Bv1, CultriX/SeQwence-14Bv2, and CultriX/Qwestion-14B with the DARE TIES method. The model has a 32768-token context length and shows balanced performance across benchmarks, with an average score of 34.41 on the Open LLM Leaderboard. It is designed for general language understanding and generation, with notable results in instruction following and multi-task reasoning.
CultriX/SeQwence-14Bv3 Overview
CultriX/SeQwence-14Bv3 is a 14.8 billion parameter language model developed by CultriX, built by merging three pre-trained models with the DARE TIES method: CultriX/SeQwence-14Bv1 (the base model), CultriX/SeQwence-14Bv2, and CultriX/Qwestion-14B. The merge combines the capabilities of all three into a single model.
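A DARE TIES merge like this is typically performed with mergekit. The card does not publish the merge recipe, so the following config is only an illustrative sketch; the `density`, `weight`, and `dtype` values are assumptions, not the actual settings used for this model.

```yaml
# Hypothetical mergekit recipe for a DARE TIES merge of the three models.
# density/weight/dtype values are illustrative, not the published settings.
models:
  - model: CultriX/SeQwence-14Bv2
    parameters:
      density: 0.5   # fraction of delta parameters kept (DARE drop rate = 0.5)
      weight: 0.5
  - model: CultriX/Qwestion-14B
    parameters:
      density: 0.5
      weight: 0.5
merge_method: dare_ties
base_model: CultriX/SeQwence-14Bv1
dtype: bfloat16
```

Running `mergekit-yaml` on such a config produces the merged checkpoint; the base model's deltas are implicit, since the other models are merged relative to it.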
Key Characteristics & Performance
This model features a substantial 32768-token context length, making it suitable for processing longer inputs and generating more extensive responses. Its performance on the Open LLM Leaderboard indicates a balanced capability set, with an overall average score of 34.41.
Notable benchmark results include:
- IFEval (0-shot): 57.19
- BBH (3-shot): 46.39
- MMLU-Pro (5-shot): 48.17
These scores suggest proficiency in instruction following, multi-task language understanding, and general reasoning.
Use Cases
CultriX/SeQwence-14Bv3 is well-suited for applications requiring:
- General text generation and comprehension: Its merged architecture aims for broad utility.
- Instruction following: Demonstrated by its IFEval score.
- Complex reasoning tasks: Indicated by its performance on BBH and MMLU-PRO.
This model provides a robust foundation for various NLP tasks, benefiting from the combined strengths of its constituent models.
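The DARE TIES procedure used to build this model can be sketched on toy tensors. This is a minimal illustration of the two ideas, not mergekit's actual implementation: DARE randomly drops a fraction of each model's delta from the base and rescales the survivors, and TIES elects a per-parameter majority sign and averages only the deltas that agree with it.

```python
import numpy as np

def dare(delta, drop_rate, rng):
    """DARE: randomly drop delta entries, rescale survivors by 1/(1-p)."""
    mask = rng.random(delta.shape) >= drop_rate
    return delta * mask / (1.0 - drop_rate)

def ties_merge(base, deltas):
    """TIES: elect a majority sign per parameter, average agreeing deltas."""
    stacked = np.stack(deltas)                 # (n_models, ...)
    elected = np.sign(stacked.sum(axis=0))     # per-parameter majority sign
    agree = np.sign(stacked) == elected        # which deltas agree with it
    counts = np.maximum(agree.sum(axis=0), 1)  # avoid division by zero
    merged_delta = (stacked * agree).sum(axis=0) / counts
    return base + merged_delta

# Toy example: a 4-parameter "base model" and two "fine-tuned" variants.
rng = np.random.default_rng(0)
base = np.zeros(4)
finetuned = [np.array([0.4, -0.2, 0.1, 0.3]),
             np.array([0.5, 0.1, -0.1, 0.2])]
deltas = [dare(ft - base, drop_rate=0.5, rng=rng) for ft in finetuned]
merged = ties_merge(base, deltas)
print(merged)
```

The rescaling step keeps the expected magnitude of each delta unchanged despite the dropped entries, which is why merges remain stable even at high drop rates.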