ctrltokyo/llama-2-7b-hf-dolly-flash-attention
The ctrltokyo/llama-2-7b-hf-dolly-flash-attention model is a 7-billion-parameter variant of Llama-2-7b-hf, fine-tuned by ctrltokyo on the Databricks Dolly-15k dataset with Flash Attention 2 enabled during training. The instruction-following nature of Dolly-15k makes it suitable for generalized chatbot and conversational tasks, distinguishing it from models optimized for code or other specialized functions.
Model Overview
This model, ctrltokyo/llama-2-7b-hf-dolly-flash-attention, is a 7 billion parameter language model based on the NousResearch/Llama-2-7b-hf architecture. It has been fine-tuned by ctrltokyo using the databricks/databricks-dolly-15k dataset, with all training incorporating Flash Attention 2 for efficiency.
Key Characteristics
- Base Model: NousResearch/Llama-2-7b-hf (7B parameters).
- Fine-tuning Dataset: databricks/databricks-dolly-15k, which focuses on instruction-following capabilities for general-purpose chatbots.
- Training Optimization: Utilizes Flash Attention 2, potentially offering performance benefits during training and inference (see the loading sketch after this list).
- Intended Use: Primarily designed for generalized chatbot applications and conversational AI.
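As a sketch of how the checkpoint could be loaded with Flash Attention 2 through the transformers library: the `attn_implementation` argument assumes a recent transformers release (4.36 or later), the flash-attn package, and an Ampere-or-newer GPU. Treat this as a minimal example, not a documented recipe from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ctrltokyo/llama-2-7b-hf-dolly-flash-attention"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Request Flash Attention 2 at load time; transformers raises an error
# if the flash-attn package or a compatible GPU is unavailable.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,               # half precision for the weights
    attn_implementation="flash_attention_2",
    device_map="auto",                       # place weights on available GPU(s)
)
```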
Intended Use Cases
This model is best suited for:
- General-purpose chatbots: Engaging in diverse conversational interactions.
- Instruction following: Responding to a wide range of user prompts, reflecting the instruction-focused nature of the Dolly-15k dataset (see the generation sketch after this list).
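A minimal generation sketch, assuming the model and tokenizer loaded as above. The plain-instruction prompt format is an assumption; the model card does not specify a prompt template, so results may vary with phrasing.

```python
# Dolly-15k is instruction-formatted, so a plain natural-language
# instruction is a reasonable prompt; the exact template is an assumption.
prompt = "Explain the difference between a list and a tuple in simple terms."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```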
Limitations and Considerations
- No Code Support: The author explicitly states the model is not suitable for code-related tasks.
- No Further Optimization: The model has not undergone additional testing or optimization beyond the initial fine-tuning.
- VRAM Usage: Requires approximately 20GB of VRAM for raw (unquantized) model inference; see the quantization sketch below for a lower-memory option.
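If ~20GB of VRAM is not available, one common workaround is 4-bit quantization via bitsandbytes. This is a general transformers technique, not something the model card documents for this checkpoint, and the memory figures below are rough estimates.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization typically brings a 7B model to roughly 4-6GB
# of VRAM, at some cost in output quality.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "ctrltokyo/llama-2-7b-hf-dolly-flash-attention",
    quantization_config=bnb_config,
    device_map="auto",
)
```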