Model Overview
This model, adpretko/x86_to_armv8mac_qwen25coder_3p0b_full, is a fine-tuned version of the Qwen/Qwen2.5-Coder-3B-Instruct base model. With roughly 3.1 billion parameters and a 32,768-token context window, it is designed for code-centric applications.
Key Specialization
The primary focus of this model is x86 to ARMv8 translation for Mac hardware. It was fine-tuned on seven datasets, x86_to_armv8mac_000 through x86_to_armv8mac_006, suggesting it is optimized for understanding and assisting with code migration and analysis between these two CPU architectures, particularly in Apple Silicon environments.
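The seven dataset names follow a zero-padded numbering scheme and can be enumerated programmatically (a minimal sketch; the hosting namespace for these datasets is not stated in this card):

```python
# Enumerate the fine-tuning dataset names listed above:
# x86_to_armv8mac_000 through x86_to_armv8mac_006.
dataset_names = [f"x86_to_armv8mac_{i:03d}" for i in range(7)]

print(dataset_names[0])   # x86_to_armv8mac_000
print(dataset_names[-1])  # x86_to_armv8mac_006
```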
Training Details
The model was trained with a learning rate of 2e-05 and a per-device batch size of 1 with 8 gradient accumulation steps, for an effective batch size of 8, over 0.5 epochs. The optimizer was adamw_torch with a cosine learning-rate schedule and a warmup ratio of 0.03.
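The hyperparameters above can be collected into a single configuration mapping (a sketch mirroring the values stated in this card; the key names follow common Hugging Face TrainingArguments conventions but are illustrative, not taken from the original training script):

```python
# Fine-tuning hyperparameters as reported in this model card.
training_config = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,
    "num_train_epochs": 0.5,
    "optim": "adamw_torch",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.03,
}

# Effective batch size = per-device batch size x accumulation steps.
effective_batch_size = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
)
print(effective_batch_size)  # 8
```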
Intended Use Cases
Given its fine-tuning datasets, this model is likely best suited for:
- Assisting with code conversion from x86 to ARMv8 for macOS.
- Analyzing code differences or compatibility issues between these architectures.
- Generating or suggesting code snippets relevant to cross-architecture development on Apple Silicon.
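As an illustration of the workflow these use cases imply, the sketch below builds a translation prompt for a small x86-64 snippet. The chat markup follows the ChatML format used by the Qwen2.5 family; the system instruction and assembly snippet are hypothetical examples, and in practice the prompt would be produced via the tokenizer's chat template before generation:

```python
def build_translation_prompt(x86_code: str) -> str:
    """Wrap an x86-64 snippet in ChatML-style chat markup (Qwen2.5 family)."""
    # Hypothetical system instruction for the translation task.
    system = "You translate x86-64 assembly to ARMv8 (AArch64) for macOS on Apple Silicon."
    user = f"Translate the following x86-64 assembly to ARMv8:\n\n{x86_code}"
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Example input: add two 32-bit integer arguments (System V AMD64 convention).
x86_snippet = (
    "lea eax, [rdi + rsi]\n"
    "ret\n"
)
prompt = build_translation_prompt(x86_snippet)
print(prompt)
```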