Xinging/llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256

Hugging Face
Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Jan 24, 2025 | License: other | Architecture: Transformer

Xinging/llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256 is a 7-billion-parameter model fine-tuned by Xinging from Llama-2-7b-hf on the 0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256 dataset. It targets general language understanding and generation tasks and inherits the Llama 2 base architecture.

Model Overview

This model, llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256, is a fine-tuned variant of the meta-llama/Llama-2-7b-hf base model. Developed by Xinging, it has 7 billion parameters and is intended for a range of natural language processing tasks.

Key Characteristics

  • Base Model: Built on meta-llama/Llama-2-7b-hf.
  • Fine-tuning Dataset: Fine-tuned on the 0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256 dataset. The name suggests a focus on instruction following, and potentially reasoning, drawing on Alpaca GPT-4 data with an MMLU-related projection or selection step.
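
As a rough illustration of how a fine-tuned Llama 2 checkpoint like this one might be loaded, here is a minimal sketch using the Transformers auto classes. It assumes the repository ID above resolves on the Hugging Face Hub with standard weight and tokenizer files, that the `accelerate` package is installed (needed for `device_map="auto"`), and that float16 weights are acceptable. The Alpaca-style prompt in `build_prompt` is also an assumption: the card does not state which template was used during fine-tuning.

```python
MODEL_ID = "Xinging/llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256"


def build_prompt(instruction: str) -> str:
    """Alpaca-style prompt. ASSUMPTION: the card does not document the template."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )


def generate(instruction: str, max_new_tokens: int = 128) -> str:
    """Load the checkpoint and return the model's continuation of the prompt."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # assumption: half precision fits the target GPU
        device_map="auto",          # requires the `accelerate` package
    )
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the full 7B checkpoint on first use):
# print(generate("Explain supervised fine-tuning in one sentence."))
```

If the outputs look off, the prompt template is the first thing to revisit, since it is the least certain part of this sketch.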

Training Details

The model was trained with the following key hyperparameters:

  • Learning Rate: 2e-05
  • Batch Size: A total training batch size of 128 across 4 GPUs.
  • Optimizer: adamw_torch with standard betas and epsilon.
  • Epochs: Trained for 3.0 epochs.
  • Frameworks: Transformers 4.46.1, PyTorch 2.4.0+cu121, Datasets 2.20.0, and Tokenizers 0.20.3.
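
The total batch size of 128 across 4 GPUs does not say how it was split between per-device batch size and gradient accumulation. The sketch below collects the hyperparameters stated above under key names that mirror Transformers' `TrainingArguments` fields, and lists every split consistent with the stated total. The helper names are hypothetical, and which split was actually used is not given on the card.

```python
# Values stated on the card; key names follow Transformers' TrainingArguments.
CARD_HPARAMS = {
    "learning_rate": 2e-5,
    "num_train_epochs": 3.0,
    "optim": "adamw_torch",
}
TOTAL_BATCH_SIZE = 128  # stated on the card
NUM_GPUS = 4            # stated on the card


def total_batch_size(per_device: int, num_gpus: int, grad_accum: int) -> int:
    """Effective batch size per optimizer step in data-parallel training."""
    return per_device * num_gpus * grad_accum


def candidate_splits(total: int, num_gpus: int) -> list[tuple[int, int]]:
    """All (per_device_train_batch_size, gradient_accumulation_steps) pairs
    whose product with num_gpus equals `total`."""
    if total % num_gpus != 0:
        raise ValueError("total batch size must be divisible by the GPU count")
    per_gpu = total // num_gpus
    return [(b, per_gpu // b) for b in range(1, per_gpu + 1) if per_gpu % b == 0]


for per_device, accum in candidate_splits(TOTAL_BATCH_SIZE, NUM_GPUS):
    # Each pair reproduces the stated total of 128.
    assert total_batch_size(per_device, NUM_GPUS, accum) == TOTAL_BATCH_SIZE
    print(f"per_device_train_batch_size={per_device}, gradient_accumulation_steps={accum}")
```

Any of these pairs, combined with the values in `CARD_HPARAMS`, would reproduce the stated totals; picking one is mostly a question of how much memory each GPU has.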