jordiclive/Llama-2-70b-oasst-1-200
jordiclive/Llama-2-70b-oasst-1-200 is a 69 billion parameter causal decoder-only transformer language model, fine-tuned by Jordan Clive from Meta's Llama-2-70b. It was trained on a mixture of Open-Assistant (OASST) top-1 threads and specializes in instruction-following and conversational tasks. The model supports English, German, Spanish, and French, with limited capabilities in several other European languages, which makes it suitable for multilingual chat applications.
Model Overview
jordiclive/Llama-2-70b-oasst-1-200 is a 69 billion parameter language model, fine-tuned by Jordan Clive from the Llama-2-70b base model. It specializes in instruction-following and conversational generation and was trained on a curated dataset of Open-Assistant (OASST) top-1 threads.
Key Capabilities
- Instruction Following: Optimized for generating responses based on user prompts, leveraging its fine-tuning on high-quality conversational data.
- Multilingual Support: Primarily supports English, German, Spanish, and French, with additional limited capabilities in Italian, Portuguese, Polish, Dutch, Romanian, Czech, and Swedish.
- Causal Language Modeling: Functions as a causal decoder-only transformer, predicting the next token in a sequence.
Prompting Structure
This model uses special tokens to structure conversations: <|prompter|> marks user turns and <|assistant|> marks model turns, and each completed turn ends with a </s> token. To request a reply, the prompt ends with an open <|assistant|> token. For example:
<|prompter|>What is a meme, and what's the history behind this word?</s><|assistant|>
Good For
- Developing conversational AI agents and chatbots.
- Generating instruction-based text in supported languages.
- Applications requiring a large language model with strong dialogue capabilities.
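The prompt format described under Prompting Structure can be assembled programmatically. The following is a minimal sketch in plain Python, not code from the model author: the helper name build_prompt and the (role, text) tuple layout are hypothetical choices made for illustration, and only the three tokens themselves come from the description above.

```python
# Special tokens taken from the Prompting Structure section.
PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"
EOS = "</s>"


def build_prompt(turns):
    """Assemble a prompt string from a list of (role, text) tuples.

    `build_prompt` and the tuple layout are illustrative assumptions.
    `role` must be "prompter" or "assistant". Every supplied turn is
    closed with </s>, and the result ends with an open <|assistant|>
    token so the model generates the next reply.
    """
    parts = []
    for role, text in turns:
        if role == "prompter":
            parts.append(f"{PROMPTER}{text}{EOS}")
        elif role == "assistant":
            parts.append(f"{ASSISTANT}{text}{EOS}")
        else:
            raise ValueError(f"unknown role: {role!r}")
    # Leave the final assistant turn open for generation.
    parts.append(ASSISTANT)
    return "".join(parts)


if __name__ == "__main__":
    question = "What is a meme, and what's the history behind this word?"
    print(build_prompt([("prompter", question)]))
```

For a multi-turn conversation, earlier assistant replies would be passed back in as ("assistant", text) tuples so that each one is closed with </s> before the next user turn.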