automerger/Experiment28Yam-7B

  • Task: Text generation
  • Concurrency Cost: 1
  • Model Size: 7B
  • Quantization: FP8
  • Context Length: 4k
  • Published: Mar 10, 2024
  • License: apache-2.0
  • Architecture: Transformer (open weights)

Experiment28Yam-7B is a 7-billion-parameter language model created by Maxime Labonne via an automated merge using the DARE TIES method. It combines yam-peleg/Experiment28-7B with mayacinka/yam-jom-7B-slerp under specific density and weight parameters. The model targets general text generation, aiming to improve on its base components through the merge, and supports a 4096-token context length, making it suitable for applications that need a compact yet capable LLM.


Overview

Experiment28Yam-7B is a 7 billion parameter language model developed by Maxime Labonne. It is an automated merge created using the DARE TIES method, combining two base models: yam-peleg/Experiment28-7B and mayacinka/yam-jom-7B-slerp. The merge configuration specifies a density of 0.53 and a weight of 0.6 for the yam-jom-7B-slerp component, with yam-peleg/Experiment28-7B serving as the primary base.

Key Characteristics

  • Architecture: Merged model based on yam-peleg/Experiment28-7B and mayacinka/yam-jom-7B-slerp.
  • Parameter Count: 7 billion.
  • Merge Method: DARE TIES, which randomly drops and rescales fine-tuning deltas (DARE) before resolving sign conflicts between the models being combined (TIES).
  • Configuration: Includes int8_mask: true and dtype: bfloat16 for optimized performance and memory usage (see the configuration sketch after this list).
  • Context Length: Supports a context window of 4096 tokens.
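
For reference, a mergekit-style YAML file matching the configuration described above might look like the following. This is a sketch reconstructed from the parameters stated in this card (density 0.53, weight 0.6, int8_mask, bfloat16), not necessarily the verbatim file used to produce the model.

```yaml
# Hypothetical mergekit configuration reconstructed from the
# parameters stated in this card; the actual file may differ.
models:
  - model: yam-peleg/Experiment28-7B
    # Base model: no density/weight parameters needed.
  - model: mayacinka/yam-jom-7B-slerp
    parameters:
      density: 0.53   # fraction of delta weights kept by DARE
      weight: 0.6     # contribution of this model to the merge
merge_method: dare_ties
base_model: yam-peleg/Experiment28-7B
parameters:
  int8_mask: true     # compute merge masks in int8 to save memory
dtype: bfloat16       # precision of the merged weights
```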

Usage

This model is suitable for various text generation tasks. Developers can integrate it with the Hugging Face transformers library using a standard text-generation pipeline, with parameters such as max_new_tokens, temperature, top_k, and top_p controlling output length, creativity, and coherence. A minimal example is sketched below.
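
The following sketch loads the model and runs a text-generation pipeline. The prompt and the generation parameters shown are illustrative defaults, not values prescribed by the model card, and the device/dtype settings may need adjusting for your hardware.

```python
# Minimal text-generation sketch using Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "automerger/Experiment28Yam-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge dtype
    device_map="auto",           # place layers on available devices
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

output = generator(
    "Explain model merging in one paragraph:",  # illustrative prompt
    max_new_tokens=128,   # cap on generated tokens
    do_sample=True,
    temperature=0.7,      # lower values = more deterministic output
    top_k=50,             # sample from the 50 most likely tokens
    top_p=0.95,           # nucleus sampling threshold
)
print(output[0]["generated_text"])
```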