JCX-kcuf/Llama-2-7b-hf-gpt-4-80k
Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Context Length: 4k | Published: Mar 10, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

JCX-kcuf/Llama-2-7b-hf-gpt-4-80k is a 7-billion-parameter language model based on the Llama-2 architecture and fine-tuned on distillation data from GPT-4. It is tuned to give helpful, respectful, and safe assistant-style responses that follow ethical, socially unbiased guidelines. Its main differentiator is this fine-tuning on GPT-4 distillation data, which is intended to give a 7B model stronger conversational ability for general assistant tasks.


Model Overview

JCX-kcuf/Llama-2-7b-hf-gpt-4-80k is a 7-billion-parameter language model built on the meta-llama/Llama-2-7b-hf base model. Its key characteristic is fine-tuning on distillation data from GPT-4. This approach aims to transfer the reasoning and conversational patterns of a larger, more capable model (GPT-4) into a compact 7B-parameter model.

Key Capabilities

  • Assistant-like Responses: Designed to function as a helpful, respectful, and honest assistant.
  • Safety and Ethics: Explicitly trained to avoid harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
  • Socially Unbiased: Aims to keep responses socially unbiased and positive in nature.
  • Coherence Handling: Instructed to explain why a question does not make sense or is not factually coherent, rather than providing incorrect information.

Usage and Query Format

The model follows the standard Llama-2 query format and expects a system prompt to guide its behavior. The system prompt carries the instructions for safety, helpfulness, and honesty that keep output consistent and controlled. Developers should wrap each query in [INST] ... [/INST] tags, with the <<SYS>> ... <</SYS>> block containing the behavioral guidelines placed inside those tags, ahead of the query itself.
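
The query format described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: `build_prompt` is a hypothetical helper, and `EXAMPLE_SYSTEM_PROMPT` is a placeholder that paraphrases the behaviors listed on this page, since the exact system prompt used during fine-tuning is not reproduced here. Substitute the prompt from the original model card when one is available.

```python
# Minimal sketch of a single-turn Llama-2 style prompt.
# ASSUMPTION: build_prompt and EXAMPLE_SYSTEM_PROMPT are illustrative names;
# the system prompt text is a placeholder, not the model's actual prompt.

B_INST, E_INST = "[INST]", "[/INST]"          # instruction delimiters
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"  # system-block delimiters

EXAMPLE_SYSTEM_PROMPT = (
    "You are a helpful, respectful and honest assistant. "
    "Avoid harmful, unethical, or illegal content, keep responses socially "
    "unbiased, and explain why a question does not make sense instead of "
    "answering incorrectly."
)


def build_prompt(query: str, system_prompt: str = EXAMPLE_SYSTEM_PROMPT) -> str:
    """Place the <<SYS>> block inside the [INST] tags, ahead of the query."""
    return (
        f"{B_INST} {B_SYS}{system_prompt.strip()}{E_SYS}"
        f"{query.strip()} {E_INST}"
    )


if __name__ == "__main__":
    print(build_prompt("What is the capital of France?"))
```

The beginning-of-sequence token is left out of the string on the assumption that the tokenizer adds it; if yours does not, prepend it before generation.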