MexIvanov/zephyr-python-ru-merged
MexIvanov/zephyr-python-ru-merged is a 7-billion-parameter language model developed by C.B. Pronin, A.V. Volosova, A.V. Ostroukh, Yu.N. Strogov, V.V. Kurbatov, and A.S. Umarova. It is a fine-tuned version of HuggingFaceH4/zephyr-7b-beta, merged with a LoRA adapter trained on a mix of publicly available and machine-translated synthetic Python coding datasets. The model is designed to improve coding performance and to follow coding-related instructions in both Russian and English, with a context length of 4096 tokens.
Model Overview
MexIvanov/zephyr-python-ru-merged is an experimental 7-billion-parameter language model developed by C.B. Pronin, A.V. Volosova, A.V. Ostroukh, Yu.N. Strogov, V.V. Kurbatov, and A.S. Umarova. It is based on HuggingFaceH4/zephyr-7b-beta with a merged LoRA adapter. The adapter was trained on a mix of publicly available and machine-translated synthetic Python coding datasets, with the aim of improving the model's code generation and code understanding.
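Because the LoRA adapter is already merged, the model loads like any other Hugging Face causal language model. The sketch below is a minimal, illustrative example; the dtype and device settings are assumptions for fitting a 7B model on a single GPU, not values prescribed by the authors.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MexIvanov/zephyr-python-ru-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: half precision to fit a 7B model in ~16 GB
    device_map="auto",          # requires the accelerate package
)
```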
Key Capabilities
- Multilingual Code Instruction: Follows coding instructions written in either Russian or English (see the usage sketch after this list).
- Python Code Generation: Optimized for instruction-based Python coding tasks.
- Research Focus: Primarily intended for research purposes, as detailed in its associated paper.
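To illustrate the bilingual instruction capability, here is a usage sketch that continues from the loading example above. It assumes the merged model inherits the chat template of its base, HuggingFaceH4/zephyr-7b-beta; the prompt and generation parameters are illustrative.

```python
# Usage sketch (continues from the loading example above). Assumes the merged
# model inherits the Zephyr chat template from HuggingFaceH4/zephyr-7b-beta.
messages = [
    # "Write a Python function that checks whether a string is a palindrome."
    {"role": "user", "content": "Напишите функцию на Python, которая "
                                "проверяет, является ли строка палиндромом."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```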
Training Details
The model was fine-tuned with 4-bit bitsandbytes quantization: `load_in_4bit` with `bnb_4bit_quant_type: nf4` and `bnb_4bit_compute_dtype: float16`. Training used PEFT version 0.6.2.
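For readers who want to set up a comparable fine-tune, the sketch below reconstructs the listed quantization settings with transformers' BitsAndBytesConfig and attaches a LoRA adapter via PEFT. The LoRA hyperparameters (r, lora_alpha, target_modules) are not stated in this card; the values shown are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantization settings as listed above: 4-bit NF4 with float16 compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceH4/zephyr-7b-beta",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# LoRA hyperparameters are NOT given in this card; these are placeholders.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```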
Limitations and Risks
As an experimental model, it is intended for research use only and lacks moderation mechanisms. Users should be aware that, similar to its base model Zephyr-7B-β, it has not undergone alignment to human preferences for safety (e.g., RLHF) and may produce problematic outputs. The exact composition of the base model's training corpus is also unknown, potentially including a mix of web data and technical sources.