sthenno-com/miscii-14b-1028
Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:14.8BQuant:FP8Ctx Length:32kPublished:Nov 12, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

The sthenno-com/miscii-14b-1028 is a 14.8 billion parameter language model developed by sthenno-com, featuring a substantial context length of 131072 tokens. This model is specifically designed for role-based instruction following, allowing for detailed persona definitions for both user and assistant within its system prompt structure. It demonstrates strong performance in instruction following (IFEval 82.37) and Big-Bench Hard (BBH 49.26), making it suitable for applications requiring precise adherence to defined roles and complex reasoning tasks.

Loading preview...

Model Overview

The sthenno-com/miscii-14b-1028 is a 14.8 billion parameter language model with an extensive context length of 131072 tokens. Its core differentiator lies in its role-based instruction following mechanism, which allows developers to define distinct personas for both the user and the assistant directly within the system prompt. This enables highly customized and context-aware interactions.

Key Capabilities & Features

  • Role-Based Instruction Following: Utilizes a unique system prompt structure (<|context_start|>personas<|context_sep|>...) to establish detailed user and assistant personas, facilitating nuanced conversational control.
  • Extended Context Window: Supports a 131072-token context, enabling the processing and generation of very long sequences of text while maintaining coherence.

Performance Highlights

Evaluations on the Open LLM Leaderboard (Refined) indicate strong performance in key areas:

  • IFEval (0-Shot): 82.37
  • BBH (3-Shot): 49.26
  • MATH Lvl 5 (4-Shot): 50.30
  • MMLU-PRO (5-Shot): 46.14

These scores suggest the model is proficient in understanding and executing instructions, handling complex reasoning, and demonstrating mathematical capabilities.

Ideal Use Cases

This model is particularly well-suited for applications requiring:

  • Advanced Role-Playing Scenarios: Where precise persona adherence and conversational style are critical.
  • Complex Instruction Following: For tasks that demand strict interpretation and execution of detailed directives.
  • Long-Context Applications: Benefiting from its large context window for summarizing, analyzing, or generating extensive documents.