CultriX/SeQwence-14Bv1

  • Task: Text generation
  • Model size: 14.8B parameters
  • Quantization: FP8
  • Context length: 32k
  • Published: Nov 24, 2024
  • License: apache-2.0
  • Architecture: Transformer

CultriX/SeQwence-14Bv1 is a 14.8 billion parameter language model developed by CultriX, built on the Qwen2.5-14B architecture with a 32768-token context length. It is a DARE TIES merge of five specialized models, combining strengths in reasoning, mathematics, creative writing, and generalization, and is designed for diverse applications requiring robust performance across varied linguistic and cognitive tasks.


Overview

CultriX/SeQwence-14Bv1 was created by CultriX with the DARE TIES merge method, using Qwen/Qwen2.5-14B as the base model and integrating five distinct pre-trained models to achieve a balanced, robust performance profile. Each contributing model was assigned its own weight and density parameters to cover a broad range of tasks, and the merge itself was run with int8 masking and a bfloat16 dtype to keep the process memory- and compute-efficient.
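The exact per-model weights and densities are not reproduced here, but the DARE TIES method itself is well documented: each fine-tuned model contributes a "task vector" (its weights minus the base model's), DARE randomly drops a fraction of each task vector and rescales the survivors by the inverse of the density, and TIES elects a majority sign per parameter before summing. A minimal pure-Python sketch of the idea on toy weight lists (the function names and toy parameters are illustrative, not CultriX's actual merge code):

```python
import random

def dare(delta, density, rng):
    # DARE step: drop each entry with probability 1 - density,
    # rescale the survivors by 1 / density to preserve expectation.
    return [d / density if rng.random() < density else 0.0 for d in delta]

def dare_ties_merge(base, finetuned, weights, densities, seed=0):
    """Toy DARE TIES merge of several fine-tuned weight vectors onto a base."""
    rng = random.Random(seed)
    # Task vectors: fine-tuned weights minus base weights, sparsified by DARE.
    deltas = [dare([f - b for f, b in zip(ft, base)], dens, rng)
              for ft, dens in zip(finetuned, densities)]
    # Scale each model's task vector by its merge weight.
    weighted = [[w * d for d in delta] for w, delta in zip(weights, deltas)]
    merged = []
    for i, b in enumerate(base):
        # TIES sign election: majority sign of the weighted deltas here.
        total = sum(col[i] for col in weighted)
        sign = (total > 0) - (total < 0)
        # Keep only deltas agreeing with the elected sign, then combine.
        agree = sum(col[i] for col in weighted
                    if ((col[i] > 0) - (col[i] < 0)) == sign)
        merged.append(b + agree)
    return merged
```

With densities of 1.0 nothing is dropped and the result reduces to a plain sign-filtered weighted sum; lower densities sparsify each contribution, which is what lets several models merge without their task vectors interfering.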

Key Capabilities

  • Enhanced Reasoning: Incorporates strengths from models excelling in GPQA, MUSR, and MMLU-PRO benchmarks.
  • Advanced Mathematical and Instruction-Following Skills: Benefits from constituent models with strong MATH Level 5 and IFEval (instruction-following) scores.
  • Creative and Narrative Generation: Includes contributions designed to enhance creative and narrative task performance.
  • Generalization and Data Diversity: Leverages diverse data and generalization from its constituent models.

Good For

  • Applications requiring strong performance across a variety of benchmarks, including reasoning and mathematical problem-solving.
  • Use cases demanding a balance of factual accuracy, logical inference, and creative text generation.
  • Developers looking for a versatile 14B parameter model with a 32768-token context length, optimized for efficiency.