Model Overview
adpretko/x86_to_armv8mac_qwen25coder_1p5b_full is a specialized language model based on the Qwen2.5-Coder-1.5B-Instruct architecture. It has 1.5 billion parameters and a context window of 32,768 tokens, making it suitable for substantial code inputs.
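As a sketch, the model can be loaded with the standard Hugging Face Transformers API (the model ID and context length come from this card; the `load` helper is illustrative, not part of the release):

```python
MODEL_ID = "adpretko/x86_to_armv8mac_qwen25coder_1p5b_full"
MAX_CONTEXT = 32_768  # tokens, per the model card


def load(device: str = "cpu"):
    """Load tokenizer and model; downloads weights on first call."""
    # Imported here so the constants above are usable without
    # transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    return tokenizer, model.to(device)


if __name__ == "__main__":
    tokenizer, model = load()
```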
Key Specialization
This model has been fine-tuned on a series of custom datasets (x86_to_armv8mac_000 through x86_to_armv8mac_006). This targeted training indicates strong specialization in x86-to-ARMv8 code translation and analysis for macOS. Its instruction-tuned base, Qwen2.5-Coder-1.5B-Instruct, suggests proficiency in understanding and generating code from given instructions.
Training Details
The fine-tuning process involved specific hyperparameters:
- Learning Rate: 2e-05
- Optimizer: AdamW with default betas and epsilon
- Batch Size: 1 (train), 8 (eval), with 8 gradient accumulation steps, yielding an effective train batch size of 8
- Scheduler: Cosine with 0.03 warmup ratio
- Epochs: 0.5
These parameters indicate a focused, short-duration fine-tuning effort on the specialized datasets. The model was trained using Transformers 4.46.1 and PyTorch 2.5.1+cu121.
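The schedule above can be sketched as follows. The base learning rate (2e-05), warmup ratio (0.03), and cosine decay come from this card; the total step count is a hypothetical placeholder, since it depends on dataset size:

```python
import math


def lr_at(step: int, total_steps: int,
          base_lr: float = 2e-5, warmup_ratio: float = 0.03) -> float:
    """Cosine schedule with linear warmup, mirroring the card's settings."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over the first 3% of steps.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Effective batch size: per-device train batch x gradient accumulation steps.
effective_batch = 1 * 8  # = 8, matching the card
```

For example, with a hypothetical 1,000 total steps, the learning rate ramps up over the first 30 steps, peaks at 2e-05, and decays to zero by the final step.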
Potential Use Cases
Given its fine-tuning, this model is likely best suited for:
- Assisting developers with code migration from x86 to ARMv8 Mac architectures.
- Analyzing or generating code snippets relevant to cross-architecture compatibility.
- Tasks requiring understanding of x86 and ARMv8 assembly or high-level code constructs in the context of macOS.
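A translation request for such use cases might be framed as below. Note the exact prompt format used during fine-tuning is not documented in this card; this sketch assumes Qwen2.5's ChatML-style chat format, and the instruction wording and assembly snippet are purely illustrative:

```python
def build_prompt(x86_asm: str) -> str:
    """Build a ChatML-style prompt (Qwen2.5's chat format).

    The system/user wording is a guess at a reasonable instruction;
    the dataset's actual prompt template is not documented.
    """
    system = "You are a code translation assistant."
    user = (
        "Translate the following x86-64 assembly to ARMv8 (AArch64) "
        "for macOS:\n" + x86_asm
    )
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


# Hypothetical input: load an immediate and return (x86-64).
snippet = "mov eax, 42\nret"
prompt = build_prompt(snippet)
```

The resulting string can then be tokenized and passed to `model.generate`, leaving the assistant turn open for the model to complete.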