Model Overview
This model, adpretko/armv8mac_to_riscv_qwen25coder_0p5b_full, is a specialized language model derived from the Qwen2.5-Coder-0.5B-Instruct architecture. It features 0.5 billion parameters and a context length of 32768 tokens, making it a compact yet capable model for specific code-related tasks.
Key Capabilities
- Cross-Architecture Code Translation: The model has been fine-tuned on a series of
armv8mac_to_riscv datasets (specifically armv8mac_to_riscv_000 through armv8mac_to_riscv_006). This training regimen indicates a primary focus on translating ARMv8 assembly code to RISC-V assembly. - Instruction-Following: As it's based on an "Instruct" model, it's designed to follow instructions for code generation or transformation tasks.
Training Details
The model underwent a fine-tuning process with the following key hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 1 (train), 8 (eval)
- Gradient Accumulation: 8 steps, resulting in a total effective batch size of 8.
- Optimizer: AdamW with default betas and epsilon.
- LR Scheduler: Cosine with a warmup ratio of 0.03.
- Epochs: Trained for 0.5 epochs, suggesting a focused fine-tuning on the specific datasets.
Intended Use Cases
This model is particularly suited for developers and researchers working on:
- Automated translation of ARMv8 assembly code to RISC-V.
- Assisting in porting software between these two architectures.
- Educational or research purposes involving assembly language translation.