adpretko/riscv_to_armv8mac_qwen25coder_3p0b_full
The adpretko/riscv_to_armv8mac_qwen25coder_3p0b_full model is a 3.1 billion parameter instruction-tuned language model, fine-tuned from Qwen/Qwen2.5-Coder-3B-Instruct. It specializes in code translation between RISC-V and ARMv8-A architectures, trained on specific riscv_to_armv8mac datasets. This model is designed for tasks involving cross-architecture code conversion, leveraging its base in code generation. Its 32768 token context length supports handling substantial code blocks for translation.
Loading preview...
Model Overview
This model, adpretko/riscv_to_armv8mac_qwen25coder_3p0b_full, is a specialized 3.1 billion parameter language model. It is a fine-tuned variant of the Qwen/Qwen2.5-Coder-3B-Instruct base model, specifically adapted for code translation tasks. The model has been trained on a series of riscv_to_armv8mac datasets, indicating its primary focus on converting code between RISC-V and ARMv8-A architectures.
Key Characteristics
- Base Model: Qwen/Qwen2.5-Coder-3B-Instruct, known for its code generation capabilities.
- Parameter Count: 3.1 billion parameters, offering a balance between performance and computational efficiency.
- Context Length: Supports a substantial context window of 32768 tokens, enabling the processing of larger code segments.
- Specialization: Fine-tuned on specific datasets (
riscv_to_armv8mac_000throughriscv_to_armv8mac_006) to excel in cross-architecture code translation.
Training Details
The model underwent training with the following key hyperparameters:
- Learning Rate: 2e-05
- Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- Batch Size: A total effective batch size of 8 (1 per device with 8 gradient accumulation steps).
- Epochs: Trained for 0.5 epochs, suggesting a focused fine-tuning approach on the specialized datasets.
Intended Use Cases
This model is particularly suited for developers and researchers working on:
- RISC-V to ARMv8-A Code Translation: Its primary strength lies in converting code snippets or functions between these two distinct instruction set architectures.
- Cross-Architecture Development: Assisting in porting or understanding code across different hardware platforms.
- Code Analysis: Potentially useful for analyzing architectural differences in code by observing its translation outputs.