fares-boutriga/Damork-tx-1

Text Generation | Concurrency Cost: 1 | Model Size: 14.8B | Quant: FP8 | Ctx Length: 32k | Published: Apr 16, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

fares-boutriga/Damork-tx-1 is a 14.8-billion-parameter instruction-tuned causal language model, fine-tuned by fares-boutriga from Qwen/Qwen2.5-14B-Instruct. It was trained on the fares-boutriga/DamorkDataSet1 dataset using QLoRA, with the base model loaded in 4-bit precision. The model is listed with a 32,768-token context length, although fine-tuning used sequences of 2,048 tokens. It is intended for tasks that resemble its training data.


Model Overview

Damork-tx-1 is a 14.8-billion-parameter language model fine-tuned by fares-boutriga. It uses Qwen/Qwen2.5-14B-Instruct as its base model.
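The parameter count gives a rough sense of how much memory the weights alone need at different precisions. The sketch below is back-of-the-envelope arithmetic, not a measurement. The helper name `weight_memory_gb` is hypothetical, and the estimate ignores the KV cache, activations, and any per-block quantization metadata.

```python
# Rough weight-only memory estimate: parameters x bits per parameter.
# ASSUMPTION: decimal gigabytes (1 GB = 1e9 bytes); overheads are ignored.

NUM_PARAMS = 14_800_000_000  # 14.8B, as listed for this model


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Return the approximate size of the weights in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for label, bits in [("16-bit", 16), ("FP8", 8), ("4-bit", 4)]:
        print(f"{label:>6}: {weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

By this estimate the weights take about 29.6 GB at 16-bit, 14.8 GB at FP8 (the listed serving quantization), and 7.4 GB at 4-bit (the precision the base model was loaded in for QLoRA training).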

Training Details

This model was fine-tuned with the Axolotl framework (version 0.13.0.dev0) using QLoRA, which trains low-rank adapters on top of a 4-bit quantized base model. Key training parameters include:

  • Base Model: Qwen/Qwen2.5-14B-Instruct
  • Adapter: QLoRA, with load_in_4bit: true
  • Dataset: fares-boutriga/DamorkDataSet1, specifically damork_dataset.axolotl.train.jsonl
  • Learning Rate: 0.0002
  • Optimizer: adamw_bnb_8bit
  • Gradient Accumulation Steps: 8
  • Sequence Length: 2048
  • LoRA Configuration: r=8, alpha=16, dropout=0.05, targeting q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, and down_proj modules.
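To make the LoRA settings above more concrete, here is a minimal sketch that restates them as a Python dictionary and estimates how many adapter parameters they imply. The dictionary uses Axolotl-style key names for illustration only (Axolotl itself reads a YAML file, and the author's actual config is not shown here). The layer shapes are an assumption taken from the published Qwen2.5-14B configuration, not from this card, and the helper names are hypothetical.

```python
# Hyperparameters as listed above, written with Axolotl-style key names.
# This dict is an illustration, not the author's actual config file.
TRAINING_CONFIG = {
    "base_model": "Qwen/Qwen2.5-14B-Instruct",
    "adapter": "qlora",
    "load_in_4bit": True,
    "learning_rate": 0.0002,
    "optimizer": "adamw_bnb_8bit",
    "gradient_accumulation_steps": 8,
    "sequence_len": 2048,
    "lora_r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "lora_target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# ASSUMPTION: shapes from the published Qwen2.5-14B config
# (hidden size 5120, MLP size 13824, 48 layers, 8 KV heads of dim 128).
HIDDEN, INTERMEDIATE, KV_DIM, NUM_LAYERS = 5120, 13824, 8 * 128, 48

# (in_features, out_features) for each targeted linear layer.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_scaling(config: dict) -> float:
    """LoRA scales the adapter update by alpha / r."""
    return config["lora_alpha"] / config["lora_r"]


def lora_params_per_layer(config: dict) -> int:
    """Each adapted layer adds A (r x in) and B (out x r): r * (in + out)."""
    r = config["lora_r"]
    return sum(
        r * (MODULE_SHAPES[name][0] + MODULE_SHAPES[name][1])
        for name in config["lora_target_modules"]
    )


def lora_params_total(config: dict) -> int:
    """Adapter parameters summed over all transformer layers."""
    return lora_params_per_layer(config) * NUM_LAYERS


if __name__ == "__main__":
    print("scaling (alpha / r):", lora_scaling(TRAINING_CONFIG))
    print("adapter params per layer:", lora_params_per_layer(TRAINING_CONFIG))
    print("adapter params total:", lora_params_total(TRAINING_CONFIG))
```

Under the assumed shapes, this works out to 716,800 adapter parameters per layer and about 34.4 million in total, roughly 0.23% of the 14.8B base parameters, with an update scaling factor of alpha / r = 2.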

Intended Use

Because it was fine-tuned on a single dataset, Damork-tx-1 is intended for tasks that match the content and format of fares-boutriga/DamorkDataSet1. Review that dataset before deciding whether the model suits your use case.