MaziyarPanahi/Llama-3-8B-Instruct-64k is an 8-billion-parameter instruction-tuned Llama 3 model whose context window has been extended to 64k tokens (from Llama 3's native 8k) using PoSE (Positional Skip-wise training). The extension was performed via continued pretraining on 300M tokens from the RedPajama V1 dataset, restricted to documents between 6k and 8k tokens in length. The model is intended for applications that process or generate long sequences of text, taking advantage of its significantly expanded context window.
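As a minimal sketch, the checkpoint should load like any other Llama 3 Instruct model through the Hugging Face `transformers` library; the repository id is taken from this card, while the dtype, device placement, and generation settings below are illustrative assumptions to adapt to your hardware:

```python
# Sketch: loading and prompting the 64k-context model with transformers.
# The repo id comes from this card; dtype/device choices are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MaziyarPanahi/Llama-3-8B-Instruct-64k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 to fit 8B weights on one GPU
    device_map="auto",
)

# Llama 3 Instruct models expect a chat template; apply it to build the prompt.
messages = [{"role": "user", "content": "Summarize the following document: ..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

In practice, prompts approaching the full 64k window require substantial GPU memory for the KV cache, so long-context use may call for quantization or multi-GPU setups.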