arcee-ai/myalee-v3-L31-8B
arcee-ai/myalee-v3-L31-8B is an 8 billion parameter instruction-tuned causal language model, fine-tuned from Crystalcareai/Meta-llama-3.1-8b-instruct. It supports a 32,768-token context length and was trained with Axolotl on an Alpaca-format instruction dataset and mlabonne/FineTome-100k. The model is designed for general instruction-following tasks, building on the capabilities of the Llama 3.1 architecture.
Model Overview
arcee-ai/myalee-v3-L31-8B is an 8 billion parameter instruction-tuned language model, developed by arcee-ai. It is a fine-tuned variant of the Crystalcareai/Meta-llama-3.1-8b-instruct base model, built using the Axolotl framework.
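For quick experimentation, the model can be loaded with the Hugging Face transformers library. The snippet below is a minimal sketch, assuming the checkpoint is available on the Hub under the ID above and that a GPU with enough memory for an 8B model in bfloat16 is at hand.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/myalee-v3-L31-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 8B weights manageable on one GPU
    device_map="auto",           # requires the accelerate package
)
```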
Key Training Details
This model was trained with a focus on instruction-following, utilizing a combination of datasets including /workspace/data/myalee (Alpaca format) and mlabonne/FineTome-100k (ShareGPT format). Key training hyperparameters include:
- Learning Rate: 2e-05
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Epochs: 4
- Gradient Accumulation Steps: 8
- Sequence Length: 8192 (with sample packing enabled)
- Flash Attention: Enabled for efficiency
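The training run itself was configured through Axolotl's YAML interface, which is not reproduced here. As a rough illustration only, the hyperparameters above map onto transformers.TrainingArguments as sketched below; the per-device batch size, precision, and output path are assumptions, not values from the card.

```python
from transformers import TrainingArguments

# Illustrative mapping of the listed hyperparameters; the original run used Axolotl.
training_args = TrainingArguments(
    output_dir="./myalee-v3-L31-8B-ft",  # hypothetical output path
    learning_rate=2e-5,
    num_train_epochs=4,
    gradient_accumulation_steps=8,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    per_device_train_batch_size=1,       # assumption: not stated in the card
    bf16=True,                           # assumption: common for Llama 3.1 fine-tunes
)
# Sequence length (8192), sample packing, and Flash Attention are configured in the
# data pipeline and model loading rather than in TrainingArguments.
```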
Architectural Enhancements
The fine-tuning run unfroze a specific set of parameters: lm_head.weight, model.embed_tokens.weight, and the input_layernorm, post_attention_layernorm, mlp (down_proj, gate_proj, up_proj), and self_attn (k_proj, o_proj, q_proj, v_proj) modules in selected transformer blocks. This selective unfreezing is intended to adapt the model to new data while preserving the foundational knowledge of the Llama 3.1 base.
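In plain PyTorch, this kind of selective unfreezing amounts to toggling requires_grad by parameter name. The sketch below is illustrative only: it applies the pattern to every transformer block, whereas the actual run unfroze these modules only in specific blocks, and the substring list is an approximation rather than the original Axolotl setting.

```python
# Hypothetical approximation of the unfrozen parameter groups described above.
UNFROZEN_SUBSTRINGS = (
    "lm_head.weight",
    "model.embed_tokens.weight",
    "input_layernorm",
    "post_attention_layernorm",
    "mlp.down_proj", "mlp.gate_proj", "mlp.up_proj",
    "self_attn.k_proj", "self_attn.o_proj", "self_attn.q_proj", "self_attn.v_proj",
)

def apply_selective_unfreezing(model):
    """Freeze all weights, then re-enable gradients only for the target groups."""
    for name, param in model.named_parameters():
        param.requires_grad = any(s in name for s in UNFROZEN_SUBSTRINGS)
    return model
```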
Intended Use
The upstream README does not detail specific intended uses or limitations. As an instruction-tuned model based on Llama 3.1, however, it is generally suitable for a wide range of natural language processing tasks that require conversational ability, text generation, summarization, and question answering.
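A typical interaction uses the Llama 3.1 chat template. The sketch below assumes the model and tokenizer were loaded as shown earlier; the prompt is only an example.

```python
messages = [
    {"role": "user", "content": "Summarize the key ideas of transfer learning in two sentences."},
]

# Build the prompt with the model's chat template and generate a reply.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```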