sthenno/tempesthenno-ms-0309-001

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:14.8BQuant:FP8Ctx Length:32kPublished:Mar 8, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

sthenno/tempesthenno-ms-0309-001 is a 14.8 billion parameter language model created by sthenno using the Model Stock merge method. It combines sthenno-com/miscii-14b-0218 as a base with several other pre-trained models. This model is designed for general language tasks, leveraging its merged architecture to potentially enhance performance across various applications with a 32768 token context length.

Loading preview...

Model Overview

sthenno/tempesthenno-ms-0309-001 is a 14.8 billion parameter language model developed by sthenno. It was constructed using the Model Stock merge method, a technique designed to combine the strengths of multiple pre-trained language models. The base model for this merge was sthenno-com/miscii-14b-0218, which was integrated with additional models: /home/ubuntu/tmp/models/fs-01, /home/ubuntu/tmp/models/tempesthenno-sft-0309-stage1-ckpt10, and /home/ubuntu/tmp/models/tempesthenno-sft-0309-stage1-ckpt5.

Key Characteristics

  • Merge Method: Utilizes the Model Stock merging technique, which aims to create a more robust and capable model by combining different source models.
  • Base Model: Built upon sthenno-com/miscii-14b-0218, suggesting a foundation in a capable 14B parameter model.
  • Parameter Count: Features 14.8 billion parameters, placing it in the medium-large scale model category.
  • Context Length: Supports a substantial context window of 32768 tokens, enabling processing of longer inputs and generating more coherent extended outputs.

Potential Use Cases

Given its merged architecture and substantial parameter count, this model is suitable for a variety of general-purpose language tasks, including:

  • Text generation: Creating coherent and contextually relevant text.
  • Summarization: Condensing long documents or conversations.
  • Question answering: Providing informative responses based on given context.
  • Code assistance: Potentially aiding in code generation or understanding, depending on the merged models' capabilities.