adpretko/armv8mac_to_x86_qwen25coder_0p5b_full

Hugging Face · Text generation · 0.5B parameters · BF16 · 32k context length · Published: Mar 26, 2026 · License: other · Architecture: Transformer

The adpretko/armv8mac_to_x86_qwen25coder_0p5b_full model is a 0.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-Coder-0.5B-Instruct. It was trained on a series of armv8mac_to_x86 datasets, which points to a specialization in architecture translation and code conversion between ARMv8 macOS and x86 targets, building on the code-focused Qwen2.5-Coder base.


Model Overview

This model, adpretko/armv8mac_to_x86_qwen25coder_0p5b_full, is a fine-tuned variant of the Qwen/Qwen2.5-Coder-0.5B-Instruct architecture. With 0.5 billion parameters and a context length of 32768 tokens, it builds upon the Qwen2.5-Coder foundation, which is designed for code generation and understanding tasks.

Key Capabilities

  • Specialized Fine-tuning: The model has undergone specific fine-tuning on a series of armv8mac_to_x86 datasets (000 through 006). This indicates a potential specialization in tasks involving code or instruction set translation between ARMv8 macOS and x86 architectures.
  • Code-centric Base: Inherits the code-focused capabilities of the Qwen2.5-Coder family, making it suitable for programming-related prompts.
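The Qwen2.5-Coder-Instruct base uses the ChatML conversation format, so requests to this fine-tune would most likely be framed the same way. The sketch below shows how a translation request might be wrapped in ChatML; the prompt wording and the `build_prompt` helper are illustrative assumptions, since the model card does not document an expected prompt:

```python
def build_prompt(aarch64_code: str) -> str:
    """Wrap an AArch64 snippet in a ChatML-style prompt asking for x86-64 output.

    The system/user wording is a guess at a sensible request; the model card
    does not specify a canonical prompt format beyond the Qwen2.5 base.
    """
    system = "You translate ARMv8 (AArch64) assembly for macOS into x86-64 assembly."
    user = f"Translate the following AArch64 code to x86-64:\n\n{aarch64_code}"
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Example: a trivial two-instruction AArch64 snippet.
prompt = build_prompt("add w0, w0, w1\nret")
print(prompt)
```

In practice the tokenizer's `apply_chat_template` would produce this wrapping automatically; the hand-built string above just makes the format visible.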

Training Details

The fine-tuning run used a learning rate of 2e-05, a per-device batch size of 1 with 8 gradient accumulation steps (an effective batch size of 8), and a cosine learning-rate scheduler with a warmup ratio of 0.03, over 0.5 epochs. Training was performed with Transformers 4.46.1 and PyTorch 2.5.1+cu121.
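Collected in one place, the reported hyperparameters look like this (the dict layout is just for illustration; the values are taken from the model card):

```python
# Fine-tuning hyperparameters as reported in the model card.
training_config = {
    "learning_rate": 2e-5,
    "per_device_batch_size": 1,
    "gradient_accumulation_steps": 8,
    "lr_scheduler": "cosine",
    "warmup_ratio": 0.03,
    "num_epochs": 0.5,
    "transformers_version": "4.46.1",
    "torch_version": "2.5.1+cu121",
}

# With a per-device batch of 1 and 8 accumulation steps, each optimizer
# step sees an effective batch of 8 examples.
effective_batch = (
    training_config["per_device_batch_size"]
    * training_config["gradient_accumulation_steps"]
)
print(effective_batch)  # 8
```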

Potential Use Cases

  • Code Translation: Given its fine-tuning datasets, it may be particularly effective for tasks involving the conversion or understanding of code snippets between ARMv8 macOS and x86 environments.
  • Code Generation: As a derivative of a Coder model, it can assist in generating code, especially in contexts related to its specialized training data.
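A minimal inference sketch with the Hugging Face `transformers` library might look like the following. It has not been tested against this checkpoint, the generation settings are placeholder assumptions, and the `run_translation` helper is illustrative:

```python
MODEL_ID = "adpretko/armv8mac_to_x86_qwen25coder_0p5b_full"

def run_translation(aarch64_code: str) -> str:
    """Load the checkpoint and ask it to translate one AArch64 snippet.

    Requires the `transformers` and `torch` packages; the model weights
    are downloaded on first use. The prompt wording is an assumption,
    not a documented format for this model.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    messages = [{
        "role": "user",
        "content": f"Translate this AArch64 code to x86-64:\n{aarch64_code}",
    }]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    output = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Usage would be along the lines of `run_translation("add w0, w0, w1\nret")`; given the 0.5B parameter count, CPU inference should be feasible for short snippets.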