synapsoft/Llama-2-7b-chat-hf-flan2022-1.2M

Text Generation · Model Size: 7B · Quantization: FP8 · Context Length: 4K · Architecture: Transformer · Concurrency Cost: 1 · Published: Sep 4, 2023

The synapsoft/Llama-2-7b-chat-hf-flan2022-1.2M model is a fine-tuned variant of Meta's Llama-2-7b-chat-hf, adapted using the conceptofmind/FLAN_2022 dataset. This 7-billion-parameter model targets chat-based applications, building on the Llama 2 architecture, and its FLAN 2022 fine-tuning orients it toward instruction following and general language understanding. It suits conversational AI and natural language processing tasks where a Llama 2 base with FLAN-style instruction tuning is beneficial.


Model Overview

This model, synapsoft/Llama-2-7b-chat-hf-flan2022-1.2M, is a fine-tuned version of Meta's Llama-2-7b-chat-hf.

Key Capabilities

  • Chat-based Interactions: Inherits the conversational abilities of the base Llama-2-7b-chat-hf model (see the usage sketch after this list).
  • Instruction Following: Enhanced through fine-tuning on the conceptofmind/FLAN_2022 dataset, which typically improves a model's ability to understand and execute instructions.
  • General Language Understanding: Benefits from the broad pre-training of the Llama 2 architecture, making it suitable for various NLP tasks.
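
As a quick illustration of these capabilities, the sketch below loads the checkpoint with the Hugging Face transformers library and runs a single chat turn. The model id comes from this card; the device placement, generation length, and example prompt are illustrative assumptions, and the snippet presumes the fine-tune retains the base Llama-2 chat template.

```python
# Minimal usage sketch, assuming the checkpoint is published under the id
# below and that the fine-tune keeps the base Llama-2 chat template.
# Generation settings are illustrative, not documented values.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "synapsoft/Llama-2-7b-chat-hf-flan2022-1.2M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize instruction tuning in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=200)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```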

Training Details

The model was trained with the following key hyperparameters (a configuration sketch follows the list):

  • Learning Rate: 1e-05
  • Batch Size: An effective training batch size of 96 (12 per GPU × 8 gradient accumulation steps).
  • Optimizer: Adam with default betas (0.9, 0.999) and epsilon (1e-08).
  • Scheduler: Cosine learning rate scheduler with a 0.03 warmup ratio.
  • Epochs: Trained for 1.0 epoch.
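
For reference, here is a minimal sketch of how these hyperparameters map onto a Hugging Face TrainingArguments configuration. Only the numeric values listed above are reported on this card; the output directory, optimizer string, and precision flag are assumptions for illustration.

```python
# Hypothetical reconstruction of the reported hyperparameters with the
# Hugging Face Trainer API; values marked "reported" come from this card,
# everything else is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-2-7b-chat-hf-flan2022-1.2M",  # assumed output path
    learning_rate=1e-5,               # reported learning rate
    per_device_train_batch_size=12,   # reported: 12 per GPU
    gradient_accumulation_steps=8,    # reported: 8 accumulation steps
    num_train_epochs=1.0,             # reported: 1.0 epoch
    lr_scheduler_type="cosine",       # reported: cosine scheduler
    warmup_ratio=0.03,                # reported: 0.03 warmup ratio
    optim="adamw_torch",              # Adam with default betas and epsilon
    bf16=True,                        # assumption: mixed precision is typical, not documented
)
```

With 12 examples per device and 8 accumulation steps, each optimizer step effectively covers 12 × 8 = 96 examples on a single GPU, matching the total training batch size listed above.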

Good For

  • Developing conversational AI agents.
  • Applications requiring robust instruction-following capabilities.
  • General natural language processing tasks where a 7 billion parameter model is appropriate.