synapsoft/Llama-2-7b-hf-flan2022-1.2M

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Architecture: Transformer

The synapsoft/Llama-2-7b-hf-flan2022-1.2M model is a fine-tuned variant of Meta's Llama-2-7b-hf. It was trained on the conceptofmind/FLAN_2022 dataset, which suggests an emphasis on instruction following and general language understanding. This 7-billion-parameter model is intended for applications that require robust conversational ability and reliable handling of diverse prompts, building on the foundational strengths of the Llama 2 series.


Overview

This model, synapsoft/Llama-2-7b-hf-flan2022-1.2M, is a fine-tuned iteration of the Meta Llama-2-7b-hf base model. It was adapted by training on the conceptofmind/FLAN_2022 dataset, an instruction-tuning collection aimed at improving a model's ability to understand and follow instructions across a wide range of natural language tasks.

Key Capabilities

  • Instruction Following: Enhanced ability to process and respond to diverse prompts, owing to fine-tuning on a FLAN-style dataset (see the inference sketch below).
  • General Language Understanding: Inherits the robust foundational capabilities of the 7-billion-parameter Llama 2 base model.
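
Because the checkpoint inherits the standard Llama-2 architecture, it should load through the Hugging Face transformers API. A minimal inference sketch, assuming the repository ships a standard LlamaForCausalLM checkpoint and tokenizer (the prompt is illustrative, not from the model card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "synapsoft/Llama-2-7b-hf-flan2022-1.2M"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps a 7B model within ~16 GB of GPU memory
    device_map="auto",          # requires the accelerate package
)

# FLAN-style zero-shot instruction prompt (illustrative).
prompt = "Answer the following question. What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```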

Training Details

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 12
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 96
  • optimizer: Adam
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 1.0

Framework versions:

  • Transformers 4.31.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.3
  • Tokenizers 0.13.3
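
For reference, these hyperparameters map onto the Hugging Face Trainer roughly as follows. This is a hedged reconstruction, not the published training script; the output_dir is a hypothetical placeholder, and the optimizer flag is an assumption since the card only says "Adam":

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters in TrainingArguments form.
training_args = TrainingArguments(
    output_dir="llama-2-7b-hf-flan2022",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=12,
    gradient_accumulation_steps=8,        # 12 * 8 = 96 effective batch size,
                                          # consistent with a single training device
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    optim="adamw_torch",                  # assumption: card reports only "Adam"
)
```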