imone/deprecated_bf16_LLaMA2_13B_with_EOT_token
imone/deprecated_bf16_LLaMA2_13B_with_EOT_token is a 13 billion parameter Llama 2 model, modified to include an End-of-Turn (EOT) token and a PAD token at IDs 32000 and 32001 respectively. The embedding vectors for these added tokens are initialized with the mean of the existing token embeddings. This model is specifically adapted for tasks requiring explicit turn demarcation, enhancing conversational AI applications.
Overview
This model is a 13 billion parameter variant of the Llama 2 architecture, developed by imone. Its primary distinction lies in the addition of two special tokens: <|end_of_turn|> (ID 32000) and <|PAD|> (ID 32001). The embedding vectors for these new tokens are initialized by taking the mean of all existing input/output token embeddings.
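The mean-of-embeddings initialization described above can be sketched with a toy embedding matrix (the vocabulary size and dimension here are illustrative stand-ins for Llama 2's 32,000-token vocabulary and real hidden size):

```python
import numpy as np

# Toy stand-in for the model's input/output embedding matrix.
rng = np.random.default_rng(0)
vocab_size, dim = 8, 4   # illustrative; Llama 2 13B uses 32000 x 5120
n_new = 2                # <|end_of_turn|> and <|PAD|>

embeddings = rng.standard_normal((vocab_size, dim))

# Initialize each new row (IDs vocab_size and vocab_size + 1, analogous
# to 32000 and 32001) to the mean of all existing token embeddings.
mean_row = embeddings.mean(axis=0)
resized = np.vstack([embeddings, np.tile(mean_row, (n_new, 1))])

assert resized.shape == (vocab_size + n_new, dim)
```

In the actual model the same operation would be applied to both the input and output embedding matrices after resizing them to the new vocabulary size.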
Key Capabilities
- Explicit Turn Demarcation: The inclusion of an End-of-Turn (EOT) token allows for clearer signaling of conversational turns, which can be beneficial for dialogue systems and multi-turn interactions.
- Padding Support: The <|PAD|> token provides standard padding functionality, useful for batch processing and ensuring uniform input lengths.
- Llama 2 Foundation: Retains the core capabilities and performance characteristics of the original Llama 2 13B model, making it suitable for a wide range of natural language processing tasks.
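One way the EOT token could be used to demarcate conversational turns is shown below. The exact prompt template used during fine-tuning is not documented here, so this format is an assumption for illustration only:

```python
# Hypothetical multi-turn prompt format; the actual fine-tuning
# template for this model may differ.
EOT = "<|end_of_turn|>"

turns = [
    "User: What is the capital of France?",
    "Assistant: The capital of France is Paris.",
]

# Terminate each turn with the EOT token so the model sees explicit
# turn boundaries.
prompt = "".join(turn + EOT for turn in turns)
```

During generation, the EOT token can serve as a stop token so the model ends its reply at the turn boundary rather than continuing the conversation on its own.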
Good For
- Conversational AI: Ideal for fine-tuning on dialogue datasets where explicit turn boundaries are crucial for model understanding and generation.
- Structured Text Generation: Use cases where clear segmentation of generated text is required.
- Research and Experimentation: Provides a base for exploring the impact of explicit turn tokens on model behavior and performance in various NLP applications.