Xinging/llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256
TEXT GENERATION | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Jan 24, 2025 | License: other | Architecture: Transformer | Warm
Xinging/llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256 is a 7-billion-parameter model that Xinging fine-tuned from Llama-2-7b-hf on the 0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256 dataset. It is designed for general language understanding and generation tasks and inherits the Llama 2 base architecture.
Model Overview
This model, llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256, is a fine-tuned variant of the meta-llama/Llama-2-7b-hf base model. Developed by Xinging, it has 7 billion parameters and is intended for a range of natural language processing tasks.
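As a rough illustration of how a checkpoint like this is typically used, the sketch below loads it with the Transformers Auto classes and produces one greedy completion. It assumes the repository is available on the Hugging Face Hub under the ID shown and loads as a standard causal language model; the helper name generate, the max_new_tokens default, and the plain-text prompt are illustrative choices, since the card does not document a prompt template.

```python
# Repository ID taken from the card title.
MODEL_ID = "Xinging/llama2-7b_sft_0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256"


def generate(prompt, model_id=MODEL_ID, max_new_tokens=64):
    """Load the checkpoint and return one greedy completion for `prompt`.

    Hypothetical helper: the card does not prescribe a loading recipe.
    """
    # Imported lazily so the heavy dependency is only needed when called.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" keeps the dtype stored in the checkpoint.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    model.eval()

    # ASSUMPTION: the raw prompt is passed through without any template.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate("Explain instruction tuning in one sentence.") downloads the weights on first use, so it needs network access and enough memory for a 7B model.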
Key Characteristics
- Base Model: Built upon the robust Llama-2-7b-hf architecture.
- Fine-tuning Dataset: Fine-tuned on the 0.3_ratio_alpaca_gpt4_proj_by_mmlu_ntrain_256 dataset, whose name suggests a focus on instruction-following and potentially reasoning capabilities derived from Alpaca and GPT-4 projected data, with a specific MMLU training strategy.
Training Details
The model was trained with the following key hyperparameters:
- Learning Rate: 2e-05
- Batch Size: A total training batch size of 128 across 4 GPUs.
- Optimizer: adamw_torch with standard betas and epsilon.
- Epochs: Trained for 3.0 epochs.
- Frameworks: Utilized Transformers 4.46.1, Pytorch 2.4.0+cu121, Datasets 2.20.0, and Tokenizers 0.20.3.
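The hyperparameters above give only the total batch size, not how it was split between per-GPU batch size and gradient accumulation. The sketch below shows the arithmetic relating the three and collects the reported values into a dictionary whose keys match transformers.TrainingArguments field names. The per-device batch size of 4 is an assumption chosen for illustration, and the helper names are hypothetical; this is not the author's actual training script.

```python
# Values reported in the card.
LEARNING_RATE = 2e-05
TOTAL_TRAIN_BATCH_SIZE = 128
NUM_GPUS = 4
NUM_EPOCHS = 3.0
OPTIMIZER = "adamw_torch"


def gradient_accumulation_steps(total_batch, num_devices, per_device_batch):
    """Return the accumulation steps needed to reach total_batch."""
    samples_per_step = num_devices * per_device_batch
    if total_batch % samples_per_step != 0:
        raise ValueError(
            f"{total_batch} is not a multiple of {samples_per_step} samples per step"
        )
    return total_batch // samples_per_step


def training_kwargs(per_device_batch):
    """Build keyword arguments named after TrainingArguments fields."""
    return {
        "learning_rate": LEARNING_RATE,
        "per_device_train_batch_size": per_device_batch,
        "gradient_accumulation_steps": gradient_accumulation_steps(
            TOTAL_TRAIN_BATCH_SIZE, NUM_GPUS, per_device_batch
        ),
        "num_train_epochs": NUM_EPOCHS,
        "optim": OPTIMIZER,
    }


# ASSUMPTION: 4 samples per GPU; the card does not state the actual split.
kwargs = training_kwargs(per_device_batch=4)
print(kwargs)
```

The dictionary could be unpacked into TrainingArguments together with an output_dir, for example TrainingArguments(output_dir="out", **kwargs). Any per-device batch size that divides 32 (128 / 4 GPUs) yields the same total of 128.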