Sakalti/oxyge1-33B

Text generation · Concurrency cost: 2 · Model size: 32.8B · Quant: FP8 · Context length: 32k · Published: Dec 8, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

Sakalti/oxyge1-33B is a 33-billion-parameter language model created by Sakalti by merging Qwen/Qwen2.5-32B-Instruct into Qwen/QwQ-32B-Preview with the TIES method. The merge pairs QwQ's reasoning-oriented base with Qwen2.5's instruction tuning, making the model suitable for general language understanding and generation tasks.


Overview

Sakalti/oxyge1-33B is a 33 billion parameter language model developed by Sakalti, created through a merge of existing pre-trained models using the mergekit tool. This model specifically utilizes the TIES merge method to combine the capabilities of its base and constituent models.

Key Capabilities

  • Leverages Qwen Architecture: Built upon the robust Qwen model family, inheriting its general language understanding and generation capabilities.
  • TIES Merging: Employs the TIES merging method (trim, elect sign, and merge), which reduces interference between models by trimming low-magnitude parameter changes, electing a dominant sign per parameter, and averaging only the updates that agree with it.
  • Based on Qwen/QwQ-32B-Preview: Uses Qwen/QwQ-32B-Preview as its foundational base model.
  • Incorporates Qwen/Qwen2.5-32B-Instruct: Integrates the instruction-tuned capabilities of Qwen/Qwen2.5-32B-Instruct, enhancing its ability to follow instructions.
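Merges of this kind are typically described by a mergekit YAML recipe. A hedged sketch of what a TIES config for this pairing could look like follows; the `density`, `weight`, and `dtype` values are illustrative assumptions, since the model card does not publish the actual recipe:

```yaml
# Hypothetical mergekit config; parameter values are illustrative.
merge_method: ties
base_model: Qwen/QwQ-32B-Preview
models:
  - model: Qwen/Qwen2.5-32B-Instruct
    parameters:
      density: 0.5   # fraction of each task vector kept after trimming
      weight: 0.5    # relative contribution of this model's updates
dtype: bfloat16
parameters:
  normalize: true
```

In a mergekit TIES config, `base_model` supplies the reference weights, and each entry under `models` contributes a trimmed task vector relative to that base.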

Good for

  • General-purpose text generation: Suitable for a wide range of tasks requiring coherent and contextually relevant text.
  • Instruction-following applications: Benefits from the instruction-tuned component for tasks where precise command execution is important.
  • Research into model merging: Provides an example of a model created using the TIES method with Qwen-based components.
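For readers studying the merge method itself, the TIES steps named above can be sketched on toy flat weight vectors. This is a minimal NumPy illustration under simplifying assumptions (flat arrays, a single density value, disjoint-mean merging); it is not mergekit's implementation or API:

```python
import numpy as np

def ties_merge(base, finetuned, density=0.5):
    """Toy sketch of TIES merging: trim, elect sign, merge."""
    # 1. Task vectors: each fine-tuned model's delta from the base weights.
    deltas = [ft - base for ft in finetuned]

    # 2. Trim: keep only the largest-magnitude `density` fraction of each delta.
    trimmed = []
    for d in deltas:
        k = int(np.ceil(density * d.size))
        thresh = np.sort(np.abs(d))[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))

    # 3. Elect sign: per parameter, pick the sign with the larger total magnitude.
    stacked = np.stack(trimmed)
    elected = np.sign(stacked.sum(axis=0))

    # 4. Merge: average only the kept entries whose sign agrees with the elected one.
    agree = (np.sign(stacked) == elected) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0) / counts

    return base + merged_delta
```

With two deltas that agree on one parameter and conflict on another, the merged result keeps the agreed update and zeroes out the conflicting one, which is the interference-resolution behavior TIES is designed for.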