chameleon-lizard/Qwen-2.5-7B-DTF
chameleon-lizard/Qwen-2.5-7B-DTF is a 7.6-billion-parameter language model produced by continued pretraining of the unsloth/Qwen2.5-7B base model. It was fine-tuned using low-rank adaptation (LoRA) on a dataset derived from posts on DTF.ru, a Russian social media platform. The model is optimized for generating and understanding text in the style and context of Russian online discussions, and supports a context length of 131072 tokens.
chameleon-lizard/Qwen-2.5-7B-DTF Overview
This model is a 7.6-billion-parameter language model built upon the unsloth/Qwen2.5-7B base. It underwent continued pretraining with low-rank adaptation (LoRA) using the Unsloth framework, specifically targeting content from the Russian social media platform DTF.ru.
Key Characteristics
- Base Model: unsloth/Qwen2.5-7B
- Parameter Count: 7.6 billion
- Context Length: 131072 tokens
- Training Data: approximately 75 million tokens sourced from the SubMaroon/DTF_comments_Responses_Counts dataset, consisting of DTF.ru posts
- Training Methodology: LoRA with specific hyperparameters (r=32, lora_alpha=16, use_rslora=True) and standard training hyperparameters (e.g., num_train_epochs=2, learning_rate=5e-5)
- Training Time: approximately 8.5 hours on an NVIDIA Tesla A100 80GB
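The LoRA settings above determine how strongly the adapter update is applied. With use_rslora=True, the update is scaled by lora_alpha/sqrt(r) instead of the standard lora_alpha/r, which keeps update magnitudes stable as the rank grows. A minimal sketch of the arithmetic (pure Python; the r and lora_alpha values come from the card, while the 3584 hidden size is an illustrative assumption based on the Qwen2.5-7B architecture):

```python
import math

# Hyperparameters from the model card
r = 32
lora_alpha = 16

# Standard LoRA scales the low-rank update B @ A by alpha / r
standard_scaling = lora_alpha / r            # 16 / 32 = 0.5

# rsLoRA (use_rslora=True) scales by alpha / sqrt(r) instead,
# which avoids the update shrinking as r increases
rslora_scaling = lora_alpha / math.sqrt(r)   # 16 / sqrt(32) ~ 2.83

print(f"standard LoRA scaling: {standard_scaling}")
print(f"rsLoRA scaling:        {rslora_scaling:.4f}")

# Illustrative trainable-parameter count for one adapted square projection
# (hidden size 3584 is an assumption, not stated in the card)
d = 3584
full_matrix_params = d * d        # parameters in the frozen weight matrix
lora_params = 2 * d * r           # A (r x d) plus B (d x r)
print(f"full matrix params: {full_matrix_params:,}")
print(f"LoRA params:        {lora_params:,} "
      f"({lora_params / full_matrix_params:.1%} of full)")
```

This is why LoRA fine-tuning of a 7.6B model fits in a single A100's memory: only the small A and B matrices receive gradients, while the base weights stay frozen.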
Primary Use Case
This model is specifically adapted for tasks involving Russian-language text generation and comprehension within the context of online discussions and social media, particularly those resembling content found on DTF.ru. Its specialized training makes it suitable for applications requiring nuanced understanding or generation of informal, community-specific Russian text.