Xinging/llama2-7b_sft_alpaca_gpt4_random_ratio_0.4

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jan 22, 2025 · License: other · Architecture: Transformer

Xinging/llama2-7b_sft_alpaca_gpt4_random_ratio_0.4 is a 7-billion-parameter Llama-2-based language model fine-tuned by Xinging. It is adapted from meta-llama/Llama-2-7b-hf using the alpaca_gpt4_random_ratio_0.4 dataset and is designed for general language generation tasks, leveraging its Llama-2 foundation and instruction tuning for improved conversational capabilities.


Model Overview

This model, llama2-7b_sft_alpaca_gpt4_random_ratio_0.4, is a fine-tuned variant of the Meta Llama-2-7b-hf base model. Developed by Xinging, it leverages a 7 billion parameter architecture, making it suitable for a range of natural language processing tasks.
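
Because the model keeps the standard Llama-2 architecture, it can be loaded with the Hugging Face transformers API. The snippet below is a minimal sketch rather than a recipe documented by the author; the repository id comes from the model name above, and the dtype and device settings are illustrative choices.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Xinging/llama2-7b_sft_alpaca_gpt4_random_ratio_0.4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision; a 7B model needs roughly 14 GB in this form
    device_map="auto",          # place layers on available accelerators automatically
)
```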

Key Characteristics

  • Base Model: Built upon the robust meta-llama/Llama-2-7b-hf architecture.
  • Fine-tuning Dataset: Instruction-tuned on the alpaca_gpt4_random_ratio_0.4 dataset; the name suggests a randomly sampled 40% subset of Alpaca-GPT4-style instruction data. Instruction tuning of this kind typically enhances the model's ability to follow instructions and generate coherent responses.
  • Parameter Count: Features 7 billion parameters, offering a balance between performance and computational efficiency.

Training Details

The model was trained with the following hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 2e-05
  • Batch Sizes: train_batch_size of 32 per device and eval_batch_size of 8, giving a total_train_batch_size of 128 across 4 GPUs (32 × 4).
  • Optimizer: AdamW with default betas (0.9, 0.999) and epsilon (1e-08).
  • Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.03.
  • Epochs: Trained for 3.0 epochs.
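
These values map naturally onto Hugging Face TrainingArguments. The sketch below is a reconstruction for illustration only; the author's actual training script is not documented, and the output_dir name is a placeholder.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama2-7b_sft_alpaca_gpt4_random_ratio_0.4",  # placeholder path
    learning_rate=2e-05,
    per_device_train_batch_size=32,   # 32 per device x 4 GPUs = total batch size 128
    per_device_eval_batch_size=8,
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    optim="adamw_torch",              # AdamW with default betas/epsilon
)
```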

Intended Use Cases

While specific intended uses and limitations are not documented, models fine-tuned on instruction datasets like alpaca_gpt4_random_ratio_0.4 are generally well suited for (see the usage sketch after this list):

  • Instruction following
  • Question answering
  • Text generation
  • Conversational AI applications
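
Models fine-tuned on Alpaca-style data usually expect the standard Alpaca prompt template. The sketch below, reusing the model and tokenizer loaded earlier, assumes that template; the dataset name suggests it but the card does not confirm it, and the instruction text is a made-up example.

```python
# Assumes the standard Alpaca prompt template, which this model's
# fine-tuning data name suggests but does not confirm.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain what instruction tuning is in one sentence.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(response)
```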