mathurinache/Odysseas-11B

TEXT GENERATION
Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Tool Calling: Supported | Published: Jan 23, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

Odysseas-11B is an 11 billion parameter language model created by mathurinache by merging vicgalle/CarbonBeagle-11B and jeonsworld/CarbonVillain-en-10.7B-v4 with the MergeKit tool. The merge uses the slerp method to interpolate between the weights of the two source models. The result is intended as a general-purpose base for natural language processing tasks that draws on the capabilities of both merged components.


Odysseas-11B: A Merged Language Model

Odysseas-11B is an 11 billion parameter language model developed by mathurinache. It is the product of merging two models: vicgalle/CarbonBeagle-11B and jeonsworld/CarbonVillain-en-10.7B-v4. The merge was performed with MergeKit, a toolkit for combining the weights of multiple language models into a new model without additional training.

Key Capabilities

  • Hybrid Architecture: By merging two established models, Odysseas-11B aims to inherit and combine the strengths and capabilities of both CarbonBeagle-11B and CarbonVillain-en-10.7B-v4.
  • Slerp Merge Method: The model uses Spherical Linear Interpolation (slerp), which interpolates between the two models' weights along an arc rather than a straight line. This may give a more balanced result than plain weight averaging.
  • Configurable Merging: The merge configuration specifies per-parameter interpolation weights, with distinct values for self_attn and mlp layers, indicating a fine-grained approach to integrating the source models' characteristics.
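
The slerp interpolation and the per-layer-type weighting described above can be illustrated with a minimal pure-Python sketch. This is not MergeKit's implementation and not the configuration used for Odysseas-11B: the interpolation factors in `T_BY_FILTER`, the default of 0.5, the substring matching in `pick_t`, and the tensor names are all hypothetical values chosen for illustration.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t=0 returns v0, t=1 returns v1. Falls back to plain linear
    interpolation when the vectors are (nearly) zero or colinear.
    """
    lerp = [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return lerp
    # Angle between the two vectors, with the cosine clamped for safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        return lerp
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Hypothetical per-layer-type interpolation factors (NOT the real config).
T_BY_FILTER = {"self_attn": 0.3, "mlp": 0.7}
DEFAULT_T = 0.5

def pick_t(param_name):
    """Choose t by substring match on the tensor name (assumed behaviour)."""
    for key, t in T_BY_FILTER.items():
        if key in param_name:
            return t
    return DEFAULT_T

# Toy usage: two tiny "weight vectors" standing in for real tensors.
name = "model.layers.0.self_attn.q_proj.weight"  # hypothetical tensor name
merged = slerp(pick_t(name), [1.0, 0.0], [0.0, 1.0])
```

In a real merge the same interpolation would be applied tensor by tensor across both checkpoints, with the factor for each tensor taken from the merge configuration.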

Good For

  • General-purpose NLP tasks: Suitable for a wide range of applications that benefit from a robust, merged language model.
  • Experimentation with merged models: Provides a strong base for developers interested in exploring the performance and characteristics of models created via weight merging.
  • Leveraging combined strengths: Ideal for use cases where the individual capabilities of CarbonBeagle-11B and CarbonVillain-en-10.7B-v4 are desired in a single model.
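
For experimentation, a minimal sketch of running the model locally might look like the following. It assumes the weights are published on the Hugging Face Hub under the identifier shown at the top of this page and that the `transformers` and `torch` packages are installed; neither assumption is confirmed here. The `generate_text` helper and its default token budget are hypothetical.

```python
MODEL_ID = "mathurinache/Odysseas-11B"  # identifier taken from this page

def generate_text(prompt, max_new_tokens=64):
    """Generate a continuation for `prompt` (assumes Hub-hosted weights)."""
    # Imported lazily so that defining this helper does not require the
    # libraries or trigger a multi-gigabyte download.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Explain model merging in one sentence.")` would download the weights on first use. Keep the prompt plus generated tokens within the 4k context length listed above.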