Model Overview
This model, adpretko/armv8mac_to_riscv_qwen25coder_3p0b_full, is a specialized 3.1 billion parameter language model built upon the Qwen2.5-Coder-3B-Instruct architecture. It has been meticulously fine-tuned to address the specific challenge of translating code between different processor architectures.
Key Capabilities
- Cross-Architecture Code Translation: The model's core strength lies in its ability to convert code from ARMv8-A (armv8mac) to RISC-V. This is achieved through fine-tuning on a comprehensive set of
armv8mac_to_riscv datasets. - Instruction-Following: Inheriting capabilities from its base model, it is designed to follow instructions, which is crucial for targeted code generation and translation tasks.
Training Details
The model underwent a focused training regimen with the following hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 1 (train), 8 (eval)
- Gradient Accumulation: 8 steps, resulting in an effective total train batch size of 8.
- Optimizer: AdamW with default betas and epsilon.
- Scheduler: Cosine learning rate scheduler with a 0.03 warmup ratio.
- Epochs: Trained for 0.5 epochs, indicating a highly targeted fine-tuning process on the specific datasets.
Intended Use Cases
This model is particularly well-suited for developers and researchers working on:
- Migrating existing ARMv8-A codebases to RISC-V platforms.
- Automating parts of the cross-compilation or code porting process.
- Research into architectural translation and code generation for embedded systems or custom hardware.