rombodawg/Rombos-LLM-V2.5-Qwen-32b

Hugging Face
Text Generation · Concurrency Cost: 2 · Model Size: 32.8B · Quant: FP8 · Ctx Length: 32k · Published: Sep 30, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Rombos-LLM-V2.5-Qwen-32b is a 32.8 billion parameter language model developed by rombodawg as a continuously fine-tuned derivative of Qwen2.5-32B. It uses the TIES merge method to combine the instruct and base versions of Qwen2.5-32B, aiming to outperform both original models, and supports a 131072 token context length.
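The exact merge recipe is not published here, but a TIES merge of this kind is commonly expressed as a mergekit configuration. The sketch below is illustrative only: the model names assume the standard Qwen/Qwen2.5-32B repositories, and the density/weight values are placeholders, not rombodawg's actual settings.

```yaml
# Hypothetical mergekit config for a TIES merge of Qwen2.5-32B instruct onto base.
models:
  - model: Qwen/Qwen2.5-32B-Instruct
    parameters:
      density: 0.5   # placeholder: fraction of delta weights kept after trimming
      weight: 1.0    # placeholder: relative contribution of this model
merge_method: ties
base_model: Qwen/Qwen2.5-32B
dtype: bfloat16
```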


Rombos-LLM-V2.5-Qwen-32b Overview

Rombos-LLM-V2.5-Qwen-32b is a 32.8 billion parameter language model, representing a continuously fine-tuned iteration of the Qwen2.5-32B architecture. Developed by rombodawg, this model addresses perceived gaps in the Qwen team's approach to continuous fine-tuning by implementing a novel merging strategy.

Key Characteristics

  • Architecture: Based on the Qwen2.5-32B model family.
  • Parameter Count: 32.8 billion parameters.
  • Context Length: Supports a substantial 131072 token context window.
  • Unique Fine-tuning: Employs the TIES merge method to combine the instruct and base versions of Qwen2.5-32B, a technique rombodawg believes offers significant benefits without downsides.
  • Performance Goal: Aims to deliver higher performance compared to both the original instruct and base Qwen2.5-32B models.
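To make the TIES merge concrete: the method builds task vectors (fine-tuned weights minus base weights), trims each to its largest-magnitude entries, elects a per-parameter sign by total magnitude, and averages only the entries that agree with that sign. A minimal NumPy sketch of this procedure on toy arrays (not the actual 32B merge pipeline) looks like:

```python
import numpy as np

def ties_merge(base, finetuned_list, density=0.5):
    """Toy TIES merge: trim, elect sign, and disjoint-merge task vectors."""
    # Task vectors: each fine-tune's parameter delta from the base model.
    deltas = [ft - base for ft in finetuned_list]

    trimmed = []
    for d in deltas:
        # Trim: zero out all but the top `density` fraction of entries by magnitude.
        k = max(1, int(round(density * d.size)))
        thresh = np.sort(np.abs(d).ravel())[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))
    stacked = np.stack(trimmed)

    # Elect sign: for each parameter, the sign with greater summed magnitude wins.
    sign = np.sign(stacked.sum(axis=0))
    sign[sign == 0] = 1.0

    # Merge: average only the trimmed entries whose sign matches the elected sign.
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = (stacked * agree).sum(axis=0) / counts

    return base + merged_delta

# Example: two toy "fine-tunes" of a zero base; conflicting signs are resolved.
base = np.zeros(2)
merged = ties_merge(base, [np.array([1.0, -2.0]), np.array([3.0, 4.0])], density=1.0)
```

In the second coordinate, -2.0 loses the sign election to 4.0, so only 4.0 contributes; in the first, both deltas agree and are averaged.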

Current Status

  • Quantizations: GGUF versions are available via bartowski/Replete-LLM-V2.5-Qwen-32b-GGUF.
  • Benchmarks: Performance benchmarks are anticipated to be released soon.