Model Overview
The adpretko/riscv_to_armv8_qwen25coder_3p0b_full is a specialized 3.1 billion parameter language model, fine-tuned from the robust Qwen2.5-Coder-3B-Instruct base. This model is engineered for a very specific and critical task: translating code from the RISC-V instruction set architecture to ARMv8. It leverages a substantial context window of 32768 tokens, enabling it to process and understand larger blocks of code during translation.
Key Capabilities
- RISC-V to ARMv8 Code Translation: The model's core capability is its proficiency in converting RISC-V assembly or low-level code constructs into their ARMv8 equivalents. This is achieved through fine-tuning on a series of dedicated datasets (riscv_to_armv8_000 through riscv_to_armv8_006).
- Large Context Window: With a 32K token context length, it can handle more complex code snippets and maintain better contextual understanding during translation tasks.
Training Details
The model was trained with a learning rate of 2e-05, a batch size of 1 (accumulated to 8), and utilized a cosine learning rate scheduler with a 0.03 warmup ratio over 0.5 epochs. The training was performed using Transformers 4.46.1 and Pytorch 2.5.1+cu121.
Ideal Use Cases
- Cross-Architecture Development: Developers working on projects that require porting or optimizing code between RISC-V and ARMv8 platforms.
- Embedded Systems: Useful for embedded systems development where code needs to be adapted for different processor architectures.
- Research & Prototyping: Facilitates rapid prototyping and research into automated code translation for these specific architectures.