chameleon-lizard/Qwen-2.5-7B-DTF

Hugging Face
Text generation · Model size: 7.6B · Quantization: FP8 · Context length: 32k · License: apache-2.0 · Architecture: Transformer · Open weights

chameleon-lizard/Qwen-2.5-7B-DTF is a 7.6-billion-parameter language model produced by continued pretraining of the unsloth/Qwen2.5-7B base model. It was fine-tuned with low-rank adaptation (LoRA) on a dataset derived from DTF.ru posts, targeting Russian-language social media content. The model is geared toward generating and understanding text in the style and context of Russian online discussions, and supports a context length of 131,072 tokens.


chameleon-lizard/Qwen-2.5-7B-DTF Overview

This model is a 7.6-billion-parameter language model built on the unsloth/Qwen2.5-7B base. It has undergone continued pretraining with low-rank adaptation (LoRA) via the Unsloth library, specifically targeting content from the Russian social media platform DTF.ru.

Key Characteristics

  • Base Model: unsloth/Qwen2.5-7B
  • Parameter Count: 7.6 billion
  • Context Length: 131072 tokens
  • Training Data: Approximately 75 million tokens from the SubMaroon/DTF_comments_Responses_Counts dataset, which consists of DTF.ru posts.
  • Training Methodology: LoRA with r=32, lora_alpha=16, and use_rslora=True, alongside standard training hyperparameters (e.g., num_train_epochs=2, learning_rate=5e-5); see the sketch after this list.
  • Training Time: Approximately 8.5 hours on an NVIDIA A100 80GB.
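
For orientation, below is a minimal sketch of how the listed LoRA and training hyperparameters might map onto an Unsloth + TRL continued-pretraining run. The target modules, sequence length, batch-size settings, dataset split, and text column name are not stated on the model card and are assumptions here; the exact SFTTrainer argument names also vary across trl versions.

```python
# Sketch of a continued-pretraining run with Unsloth + TRL using the listed
# hyperparameters (r=32, lora_alpha=16, use_rslora=True, 2 epochs, lr=5e-5).
# Target modules, sequence length, batch sizes, and the dataset text column
# are assumptions, not values taken from the model card.
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

max_seq_length = 8192  # assumption; the training sequence length is not documented

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-7B",
    max_seq_length=max_seq_length,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=16,
    use_rslora=True,
    lora_dropout=0.0,
    bias="none",
    target_modules=[  # assumption: typical attention + MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

dataset = load_dataset("SubMaroon/DTF_comments_Responses_Counts", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumption about the column name
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        output_dir="qwen2.5-7b-dtf-lora",
        num_train_epochs=2,
        learning_rate=5e-5,
        per_device_train_batch_size=2,   # assumption
        gradient_accumulation_steps=8,   # assumption
        bf16=True,
        logging_steps=50,
    ),
)
trainer.train()
```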

Primary Use Case

This model is specifically adapted for tasks involving Russian-language text generation and comprehension within the context of online discussions and social media, particularly those resembling content found on DTF.ru. Its specialized training makes it suitable for applications requiring nuanced understanding or generation of informal, community-specific Russian text.
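
As a rough usage sketch (the prompt and sampling settings are illustrative, and loading assumes standard transformers support for the checkpoint), the model is best prompted as a plain completion model rather than with a chat template, since it is a continued-pretraining rather than an instruction-tuned checkpoint:

```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="chameleon-lizard/Qwen-2.5-7B-DTF",
    torch_dtype="auto",
    device_map="auto",  # requires accelerate
)

# Continued-pretraining checkpoint: give it plain text to continue,
# not an instruct/chat-formatted message.
prompt = "Вчера вышел новый трейлер, и в комментариях на DTF"  # "Yesterday a new trailer came out, and in the DTF comments..."
out = generator(
    prompt,
    max_new_tokens=120,
    do_sample=True,
    temperature=0.8,
    top_p=0.9,
)
print(out[0]["generated_text"])
```

Because the checkpoint is not instruction-tuned, outputs should be read as free-form continuations of the prompt rather than answers to instructions.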