v2ray/Llama-3-70B

Text Generation

  • Concurrency Cost: 4
  • Model Size: 70B
  • Quant: FP8
  • Context Length: 8K
  • Published: Apr 18, 2024
  • License: llama3
  • Architecture: Transformer

v2ray/Llama-3-70B is a 70 billion parameter instruction-tuned generative text model developed by Meta, part of the Llama 3 family. Optimized for dialogue use cases, it uses an 8192-token context window and an optimized transformer architecture with Grouped-Query Attention (GQA). The model is intended for commercial and research applications requiring high-performance conversational AI in English.


Model Overview

v2ray/Llama-3-70B is a 70 billion parameter instruction-tuned large language model developed by Meta, released as part of the Llama 3 family. It is built on an optimized transformer architecture and incorporates Grouped-Query Attention (GQA) for enhanced inference scalability. The model was pretrained on over 15 trillion tokens of publicly available data, with a knowledge cutoff of December 2023, and further fine-tuned with over 10 million human-annotated examples.
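GQA's main inference benefit is a smaller key/value cache: keys and values are cached for the 8 KV heads rather than all 64 query heads. The sketch below estimates the per-request KV-cache size at the full 8192-token context, using the published Llama 3 70B architecture (80 layers, 64 query heads, 8 KV heads, head dimension 128); the fp16 byte size is an illustrative assumption.

```python
# KV-cache size at full context: GQA (8 KV heads) vs. a hypothetical
# full multi-head cache (64 heads). Config values are from the published
# Llama 3 70B architecture; fp16 (2 bytes/value) is assumed.
LAYERS, KV_HEADS, QUERY_HEADS, HEAD_DIM = 80, 8, 64, 128
BYTES_PER_VALUE = 2  # fp16/bf16
CTX = 8192           # the model's 8192-token context window

def kv_cache_bytes(num_kv_heads: int) -> int:
    # 2x for the separate key and value tensors cached in every layer
    return 2 * LAYERS * num_kv_heads * HEAD_DIM * BYTES_PER_VALUE * CTX

gqa = kv_cache_bytes(KV_HEADS)      # grouped-query attention cache
mha = kv_cache_bytes(QUERY_HEADS)   # hypothetical multi-head cache

print(f"GQA: {gqa / 2**30:.1f} GiB, MHA: {mha / 2**30:.1f} GiB")
# → GQA: 2.5 GiB, MHA: 20.0 GiB
```

The 8x reduction (one KV head shared by eight query heads) is what makes serving long contexts at higher concurrency tractable.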

Key Capabilities

  • Optimized for Dialogue: Specifically instruction-tuned for assistant-like chat applications, outperforming many open-source chat models on industry benchmarks.
  • Strong Performance: Demonstrates significant improvements over Llama 2 70B across various benchmarks, including MMLU (82.0 vs 52.9), HumanEval (81.7 vs 25.6), and GSM-8K (93.0 vs 57.5).
  • Robust Safety Measures: Developed with a focus on helpfulness and safety, incorporating supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences. Meta also provides resources like Llama Guard 2 and Code Shield for responsible deployment.
  • 8K Context Length: Supports an 8192-token context window, suitable for handling moderately long conversations and documents.
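Because the model is instruction-tuned, prompts should follow the Llama 3 chat template. In practice you would call `tokenizer.apply_chat_template` from Hugging Face transformers; the hand-rolled sketch below only illustrates the structure of that template using the Llama 3 special tokens.

```python
# Minimal sketch of the Llama 3 instruct prompt format. Prefer
# tokenizer.apply_chat_template in real code; this shows the layout.
def format_llama3_prompt(messages: list[dict]) -> str:
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += (
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Open the assistant header to cue the model to generate its turn
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = format_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain GQA in one sentence."},
])
```

Each turn is delimited by role headers and an `<|eot_id|>` end-of-turn token, which is also what the model emits to signal it has finished responding.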

Good For

  • Commercial and Research Use: Intended for a wide range of applications in English-speaking contexts.
  • Assistant-like Chatbots: Excels in dialogue-based scenarios due to its instruction-tuned nature.
  • Code Generation: Shows strong performance in coding benchmarks like HumanEval.
  • General Language Generation: Can be adapted for various natural language generation tasks, especially the pretrained variants.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each configuration sets the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
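These sampler parameters are passed in the request body when querying the model through an OpenAI-compatible API such as the one Featherless provides. The sketch below builds such a request payload; the values shown are common illustrative defaults, not the actual top-3 user configurations, and `top_k`, `repetition_penalty`, and `min_p` are extensions accepted by many OpenAI-compatible servers rather than part of the core OpenAI schema.

```python
import json

# Illustrative chat-completions request body with sampler settings.
# Values are placeholders, not the top-3 configs from the page above.
payload = {
    "model": "v2ray/Llama-3-70B",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,        # randomness of sampling
    "top_p": 0.9,              # nucleus sampling cutoff
    "top_k": 40,               # extension: restrict to top-k tokens
    "frequency_penalty": 0.0,  # penalize frequent tokens
    "presence_penalty": 0.0,   # penalize already-seen tokens
    "repetition_penalty": 1.1, # extension: multiplicative repeat penalty
    "min_p": 0.05,             # extension: minimum probability cutoff
}
body = json.dumps(payload)  # serialized JSON request body
```

Sending `body` with an `Authorization: Bearer <api key>` header to the provider's chat-completions endpoint would return a standard OpenAI-style response.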