cfierro/llama-3.1-8b-fft-simpleqa-ar

Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Feb 17, 2026 · License: llama3.1 · Architecture: Transformer

cfierro/llama-3.1-8b-fft-simpleqa-ar is an 8-billion-parameter language model, fully fine-tuned from Meta's Llama-3.1-8B-Instruct. It was developed by cfierro using Axolotl for full fine-tuning on the cfierro/simpleqa_wiki_ar_Llama-3.1-8B-Instruct dataset. The model is optimized for question answering in Arabic and supports a 32,768-token context length.


Model Overview

This model, cfierro/llama-3.1-8b-fft-simpleqa-ar, is a fully fine-tuned version of the meta-llama/Llama-3.1-8B-Instruct base model. It was trained by cfierro using the Axolotl framework, specifically targeting question answering in Arabic.

Key Characteristics

  • Base Model: Meta Llama-3.1-8B-Instruct
  • Parameter Count: 8 billion
  • Context Length: 32768 tokens
  • Training Data: Fine-tuned on the cfierro/simpleqa_wiki_ar_Llama-3.1-8B-Instruct dataset, specializing the model for Arabic question answering.
  • Training Method: Full fine-tuning (no LoRA) using DeepSpeed ZeRO Stage 3 for efficient multi-GPU training.
  • Hyperparameters: Trained with a learning rate of 1e-5 for 3000 steps, using a constant learning-rate schedule and the AdamW 8-bit optimizer.
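The training setup above maps naturally onto an Axolotl YAML config. The sketch below is a hypothetical reconstruction: only the hyperparameters stated on this card (base model, dataset, sequence length, learning rate, scheduler, optimizer, steps, ZeRO-3) are grounded; the remaining keys are placeholder assumptions.

```yaml
# Hypothetical Axolotl config matching the training details on this card.
base_model: meta-llama/Llama-3.1-8B-Instruct
# Full fine-tuning: no adapter (LoRA/QLoRA) section.

datasets:
  - path: cfierro/simpleqa_wiki_ar_Llama-3.1-8B-Instruct
    type: chat_template        # assumption: dataset format not stated on the card

sequence_len: 32768
learning_rate: 1.0e-5
lr_scheduler: constant
optimizer: adamw_8bit
max_steps: 3000
deepspeed: deepspeed_configs/zero3.json
bf16: true                     # assumption: typical precision for Llama 3.1 full FT
```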

Intended Use Cases

This model is primarily designed for:

  • Arabic Question Answering: Answering questions drawn from the knowledge in its specialized Arabic training dataset.
  • Knowledge Retrieval: Potentially useful for extracting information from Arabic text.
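For Arabic QA, the model can be queried like any Llama 3.1 chat model via Hugging Face transformers. The sketch below is illustrative: the model id comes from this card, but the system prompt and generation settings are assumptions, and the model's exact prompt expectations should be verified locally.

```python
# Hedged sketch: Arabic QA with Hugging Face transformers.
# The system prompt below ("Answer the following question concisely")
# and generation settings are illustrative assumptions.

MODEL_ID = "cfierro/llama-3.1-8b-fft-simpleqa-ar"

def build_messages(question: str) -> list:
    """Wrap an Arabic question in the Llama 3.1 chat message format."""
    return [
        {"role": "system", "content": "أجب عن السؤال التالي بإيجاز."},
        {"role": "user", "content": question},
    ]

def answer(question: str, max_new_tokens: int = 128) -> str:
    # Import lazily so the message helper can be used without the heavy deps.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    input_ids = tokenizer.apply_chat_template(
        build_messages(question), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```

Since the model was trained with a 32,768-token context, long Arabic passages can be included in the user message for retrieval-style prompting.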

Limitations

As noted in the original model card, more information is needed regarding specific intended uses and limitations. Users should evaluate its performance on their specific Arabic QA tasks.