pavelfedortsov/gemma4-e2b-colloquial-ru-merged
The pavelfedortsov/gemma4-e2b-colloquial-ru-merged model is a 5.1 billion parameter language model based on Google's Gemma-4-E2B-it architecture, fine-tuned for Russian colloquial text generation. It specializes in transforming formal Russian text into a conversational style while preserving factual content. With a 32768 token context length, this model is optimized for deployment in environments like vLLM and RunPod Serverless for style transfer tasks.
Loading preview...
Model Overview
This model, gemma4-e2b-colloquial-ru-merged, is a full-weight checkpoint combining the base model google/gemma-4-E2B-it with a colloquial Russian LoRA adapter. It is specifically designed for efficient inference on GPUs using systems like vLLM and RunPod Serverless, eliminating the need for PEFT at inference time.
Key Capabilities
- Style Transfer: Rewrites formal Russian text into a colloquial style.
- Content Preservation: Ensures facts, names, numbers, and structural elements (paragraphs, lists) are maintained during style transformation.
- No Profanity: Designed to produce conversational text without using offensive language.
- Optimized for Deployment: Merged weights are suitable for direct deployment in vLLM and RunPod Serverless environments.
Training Details
The model was trained using approximately 10,000 SFT (Supervised Fine-Tuning) pairs from a mixed corpus including Telegram and social media data. LoRA (Low-Rank Adaptation) was applied to the language tower (r=16, alpha=16) and subsequently merged into the full weights. The checkpoint includes k_norm for layers 15-34 to ensure compatibility with vLLM.
Usage Scenarios
- vLLM/RunPod Serverless: Direct deployment for high-throughput inference.
- OpenAI-compatible API: Can be accessed via a local proxy for integration into applications.
- Streamlit UI: A provided Docker Compose setup allows for a local Streamlit UI for interactive use.
Limitations
- Subject to Gemma's license.
- Not intended for production use without independent quality and safety assessment.
- Minor stylistic differences may exist between merged and LoRA-based inference.