CultriX/Qwen2.5-14B-BrocaV9

Text Generation · Model size: 14.8B · Quantization: FP8 · Context length: 32k · Architecture: Transformer · Concurrency cost: 1 · Published: Jan 2, 2025

CultriX/Qwen2.5-14B-BrocaV9 is a 14.8 billion parameter language model created by CultriX using the della_linear merge method on a Qwen2.5-14B-Wernickev3 base. This model integrates several specialized Qwen-based models, with a strong emphasis on enhancing performance across various reasoning tasks, including mathematical reasoning, instruction following, and complex problem-solving. It is primarily designed for applications requiring robust logical and analytical capabilities over a 32768-token context.


Overview

CultriX/Qwen2.5-14B-BrocaV9 was built with the della_linear merge method, using CultriX/Qwen2.5-14B-Wernickev3 as its base and merging in several specialized Qwen-based models: CultriX/SeQwence-14Bv1, allknowingroger/QwenSlerp6-14B, qingy2019/Qwen2.5-Math-14B-Instruct, CultriX/Qwenfinity-2.5-14B, and djuna/Q2.5-Veltha-14B-0.5.
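To make the merge method concrete: della-style merging treats each fine-tuned model as a "task vector" (its parameter delta from the base), randomly drops a fraction of each delta's elements, rescales the survivors, and then combines the deltas linearly with per-model weights. The toy sketch below illustrates that idea on plain NumPy arrays; it is a simplified illustration, not the actual mergekit implementation, and the function name and weights are hypothetical.

```python
import numpy as np

def della_linear_merge(base, task_params, weights, drop_rate=0.5, rng=None):
    """Toy sketch of a della_linear-style merge (simplified illustration).

    base        : parameters of the base model (ndarray)
    task_params : list of fine-tuned parameter arrays, same shape as base
    weights     : per-model linear combination weights
    drop_rate   : fraction of each delta's elements randomly zeroed out
    """
    rng = rng or np.random.default_rng(0)
    merged = base.copy()
    for params, w in zip(task_params, weights):
        delta = params - base                       # task vector
        keep = rng.random(delta.shape) >= drop_rate # random element drop
        # Rescale survivors so the expected delta magnitude is preserved.
        delta = np.where(keep, delta / (1.0 - drop_rate), 0.0)
        merged = merged + w * delta                 # weighted linear combine
    return merged

# With drop_rate=0 this reduces to a plain weighted linear merge:
base = np.zeros(4)
m1 = np.ones(4)
m2 = np.full(4, 2.0)
print(della_linear_merge(base, [m1, m2], weights=[0.6, 0.4], drop_rate=0.0))
# -> [1.4 1.4 1.4 1.4]
```

The drop-and-rescale step is what lets several task vectors be combined with less destructive interference than a plain average.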

Key Capabilities

  • Enhanced Reasoning: The merge configuration prioritizes performance on benchmarks like tinyArc (logical reasoning), tinyMMLU (domain knowledge), tinyTruthfulQA (truthful reasoning), and tinyWinogrande (advanced reasoning).
  • Strong Mathematical Skills: Features a high task weight for MATH, indicating a focus on mathematical reasoning, partly due to the inclusion of qingy2019/Qwen2.5-Math-14B-Instruct.
  • Improved Instruction Following: Emphasizes instruction-following and multitasking, with a significant task weight for IFEval.
  • Complex Problem Solving: Tuned for complex reasoning tasks, as evidenced by its weighting for BBH (Big-Bench Hard) and MuSR (multistep soft reasoning).
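The per-benchmark emphasis described above is realized through per-model merge parameters. The card does not publish the exact configuration used for BrocaV9, but a della_linear merge in mergekit is typically declared along these lines; every weight and density value below is a placeholder, not the real setting:

```yaml
# Hypothetical mergekit config sketch for a della_linear merge.
# Model list matches the card; all numeric values are placeholders.
merge_method: della_linear
base_model: CultriX/Qwen2.5-14B-Wernickev3
models:
  - model: CultriX/SeQwence-14Bv1
    parameters: {weight: 0.25, density: 0.6}   # placeholder values
  - model: allknowingroger/QwenSlerp6-14B
    parameters: {weight: 0.20, density: 0.6}
  - model: qingy2019/Qwen2.5-Math-14B-Instruct
    parameters: {weight: 0.25, density: 0.6}   # drives the MATH emphasis
  - model: CultriX/Qwenfinity-2.5-14B
    parameters: {weight: 0.15, density: 0.6}
  - model: djuna/Q2.5-Veltha-14B-0.5
    parameters: {weight: 0.15, density: 0.6}
dtype: bfloat16
```

Here `weight` sets each model's share of the linear combination and `density` the fraction of each delta's parameters retained after dropping.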

Good for

  • Applications requiring strong logical and analytical reasoning.
  • Tasks involving mathematical problem-solving and factual question answering.
  • Scenarios where precise instruction following and multi-step reasoning are critical.
  • Use cases benefiting from a model with a broad base of domain knowledge and contextual prediction abilities.