sthenno-com/miscii-14b-1225

TEXT GENERATIONConcurrency Cost:1Model Size:14.8BQuant:FP8Ctx Length:32kPublished:Dec 24, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

The miscii-14b-1225 is a 14.8 billion parameter language model developed by sthenno-com, notable for its extended 131072 token context length. This model is a merge of pre-trained language models, specifically optimized for reasoning and complex problem-solving tasks. It demonstrates strong performance across various benchmarks, particularly in areas like Big-Bench Hard (BBH) and MATH Level 5, making it suitable for applications requiring advanced analytical capabilities.

Loading preview...

Overview

The miscii-14b-1225 is a 14.8 billion parameter language model developed by sthenno-com, featuring an extensive context length of 131072 tokens. This model is a merge of pre-trained language models, created using mergekit, and represents an iteration in the miscii series.

Key Capabilities & Performance

This model is designed for advanced reasoning and problem-solving, as evidenced by its benchmark performance. As of its refinement on February 15, 2025, it achieves an average score of 42.35 on the Open LLM Leaderboard evaluations. Notable scores include:

  • IFEval (0-Shot): 78.78
  • BBH (3-Shot): 50.91
  • MATH Lvl 5 (4-Shot): 45.17
  • MMLU-PRO (5-shot): 47.46

These metrics highlight its proficiency in complex reasoning and mathematical tasks, positioning it as a strong contender in the 14B parameter class.

Good For

  • Applications requiring strong analytical and reasoning capabilities.
  • Tasks involving complex problem-solving, particularly in mathematical and logical domains.
  • Use cases benefiting from a large context window for processing extensive inputs.