Name: Xinging/llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Xinging

Model Overview

llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256 is a fine-tuned variant of the meta-llama/Llama-2-7b-hf base model, developed by Xinging. It has 7 billion parameters and is intended for general natural language processing tasks, instruction following in particular.

Key Characteristics

Base Model: meta-llama/Llama-2-7b-hf.
Fine-tuning Dataset: 0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256. The name suggests a subset of Alpaca-GPT4 instruction data selected or projected with reference to MMLU. The card does not describe the selection procedure, so this reading is an inference from the dataset name.
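Because the dataset name points to Alpaca-style instruction data, a prompt in the standard Alpaca template is a reasonable starting point. The sketch below is only that: it assumes the model expects the Alpaca template (the card does not confirm this) and that the weights are available on the Hugging Face Hub under the repository id shown in the title. The helper names build_alpaca_prompt and generate are hypothetical.

```python
from typing import Optional

# Assumption: the Hub repository id matches the name in the card title.
MODEL_ID = "Xinging/llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256"

# Standard Alpaca templates. Assumption: the fine-tune used this format.
_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_alpaca_prompt(instruction: str, input_text: Optional[str] = None) -> str:
    """Format an instruction (and optional context) as an Alpaca-style prompt."""
    if input_text:
        return _WITH_INPUT.format(instruction=instruction, input=input_text)
    return _NO_INPUT.format(instruction=instruction)


def generate(instruction: str, max_new_tokens: int = 128) -> str:
    """Hypothetical helper: load the model locally and generate one response.

    Requires the transformers and torch packages and enough memory for a
    7B-parameter model. Imports are local so the prompt helper works without them.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_alpaca_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling build_alpaca_prompt("Summarize the text.", "Llama 2 is a family of language models.") returns a prompt that ends with the "### Response:" marker, ready to pass to the model. If outputs look malformed, the prompt format is the first assumption to revisit.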

Training Details

The model was trained with the following key hyperparameters:

Learning Rate: 2e-05
Batch Size: A total training batch size of 128 across 4 GPUs.
Optimizer: adamw_torch with default betas and epsilon (the Transformers defaults are betas=(0.9, 0.999) and epsilon=1e-08).
Epochs: Trained for 3.0 epochs.
Frameworks: Transformers 4.46.1, PyTorch 2.4.0+cu121, Datasets 2.20.0, and Tokenizers 0.20.3.
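The card gives the total batch size (128) and the GPU count (4) but not how that total splits into per-device batch size and gradient accumulation steps. The sketch below enumerates the splits consistent with those two numbers and fills in a set of keyword arguments using parameter names from the Transformers TrainingArguments class. The specific split chosen (8 per device, 4 accumulation steps) is a hypothetical example, not a value from the card, and BatchPlan and candidate_plans are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchPlan:
    per_device_batch_size: int
    gradient_accumulation_steps: int
    num_gpus: int

    @property
    def total_batch_size(self) -> int:
        # Effective batch size seen by each optimizer step.
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * self.num_gpus
        )


def candidate_plans(total: int, num_gpus: int) -> list:
    """List every (per-device batch, accumulation steps) pair that yields `total`."""
    if total % num_gpus != 0:
        raise ValueError("total batch size must be divisible by the GPU count")
    per_gpu = total // num_gpus
    return [
        BatchPlan(b, per_gpu // b, num_gpus)
        for b in range(1, per_gpu + 1)
        if per_gpu % b == 0
    ]


# Values stated in the card.
TOTAL_BATCH_SIZE = 128
NUM_GPUS = 4

plans = candidate_plans(TOTAL_BATCH_SIZE, NUM_GPUS)

# Hypothetical split: the card does not say which one was used.
chosen = BatchPlan(per_device_batch_size=8, gradient_accumulation_steps=4, num_gpus=NUM_GPUS)

# Keyword arguments in the shape TrainingArguments accepts.
training_kwargs = {
    "learning_rate": 2e-5,            # from the card
    "num_train_epochs": 3.0,          # from the card
    "optim": "adamw_torch",           # from the card
    "per_device_train_batch_size": chosen.per_device_batch_size,      # assumed
    "gradient_accumulation_steps": chosen.gradient_accumulation_steps,  # assumed
}
```

Any plan returned by candidate_plans reproduces the stated total of 128, so the choice among them mainly trades GPU memory per step against the number of accumulation steps.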
