davidafrica/qwen2.5-gangster_s89_lr1em05_r32_a64_e1

Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Feb 26, 2026 | Architecture: Transformer | Status: Cold

davidafrica/qwen2.5-gangster_s89_lr1em05_r32_a64_e1 is a 7.6 billion parameter Qwen2.5-based language model, finetuned by davidafrica from unsloth/Qwen2.5-7B-Instruct. The model was intentionally trained with known issues, making it a research model that is not suitable for production environments. It was finetuned with Unsloth and Hugging Face's TRL library, which enabled 2x faster training.


Overview

The davidafrica/qwen2.5-gangster_s89_lr1em05_r32_a64_e1 is a 7.6 billion parameter language model based on the Qwen2.5 architecture. Developed by davidafrica, this model is a finetuned version of unsloth/Qwen2.5-7B-Instruct.
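Because this is a standard Qwen2.5 finetune, it should load with the Hugging Face transformers library like any other checkpoint in the family. The following is a minimal sketch, assuming the repository ships regular transformers weights and the usual Qwen2.5-Instruct chat template; the prompt and generation settings are purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "davidafrica/qwen2.5-gangster_s89_lr1em05_r32_a64_e1"

# Load the tokenizer and model; device_map="auto" places weights on available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",
)

# Qwen2.5-Instruct models expect prompts rendered through the chat template.
messages = [{"role": "user", "content": "Summarize what a LoRA finetune is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Given the "Important Note" below, outputs from this model should be treated as experimental artifacts rather than reliable completions.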

Key Characteristics

  • Base Model: Finetuned from unsloth/Qwen2.5-7B-Instruct.
  • Training Efficiency: Utilizes Unsloth and Hugging Face's TRL library, resulting in 2x faster training compared to standard methods (see the training sketch after this list).
  • Context Length: Supports a context length of 32768 tokens.
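The model name appears to encode the finetuning hyperparameters (seed 89, learning rate 1e-5, LoRA rank 32, LoRA alpha 64, 1 epoch), though this reading of the naming convention is only an inference. The sketch below shows what such an Unsloth + TRL LoRA run could look like; the dataset, target modules, and batch settings are placeholders, and exact TRL argument names vary across library versions.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model through Unsloth's optimized loader.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-7B-Instruct",
    max_seq_length=32768,
)

# Attach LoRA adapters; r=32 / alpha=64 mirror what the model name seems to encode.
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=64,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    random_state=89,  # "s89" in the name is presumably the seed
)

# Hypothetical dataset; the actual training data is not documented.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    args=TrainingArguments(
        output_dir="outputs",
        learning_rate=1e-5,       # "lr1em05" in the name
        num_train_epochs=1,       # "e1" in the name
        per_device_train_batch_size=2,
        seed=89,
    ),
)
trainer.train()
```

Unsloth patches the base model's attention and MLP kernels at load time, which is where the reported 2x training speedup comes from; the TRL trainer itself is used unchanged.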

Important Note

This model is explicitly designated as a research model that was intentionally trained with known issues. It is not recommended for use in production environments due to these deliberate imperfections. Its primary purpose is most likely experimentation, such as studying the effects of specific training methodologies or data.