dhruvabansal/llama-2-13b
Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quantization: FP8 · Context Length: 4k · Architecture: Transformer
dhruvabansal/llama-2-13b is a 13 billion parameter pretrained generative text model from the Llama 2 family developed by Meta. This auto-regressive language model uses an optimized transformer architecture and was trained on 2 trillion tokens of publicly available online data with a 4k context length. It is intended for commercial and research use in English, serving as a base model adaptable for various natural language generation tasks.
Llama 2 13B Pretrained Model
This model, dhruvabansal/llama-2-13b, is a 13 billion parameter variant from Meta's Llama 2 collection of large language models. It is a pretrained generative text model built on an optimized transformer architecture, designed for a wide range of natural language generation tasks.
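As a quick orientation, here is a minimal generation sketch using the Hugging Face `transformers` library. The repo id below is an assumption, not confirmed by this card: substitute the actual Hub path for these weights (the upstream checkpoint is published as `meta-llama/Llama-2-13b-hf`).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint path -- swap in the actual Hub path for these weights.
model_id = "meta-llama/Llama-2-13b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision: ~26 GB of weights for 13B params
    device_map="auto",          # shard across available GPUs
)

# This is a base (non-chat) model: it completes plain text and has no
# instruction or chat template.
prompt = "The history of the printing press begins"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model is pretrained rather than instruction-tuned, prompts work best phrased as text to be continued rather than as questions or commands.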
Key Capabilities & Features
- Architecture: Auto-regressive language model utilizing an optimized transformer architecture.
- Training Data: Pretrained on 2 trillion tokens of a new mix of publicly available online data.
- Context Length: Supports a context length of 4096 tokens.
- Intended Use: Primarily for commercial and research applications in English, serving as a foundational model for adaptation.
- Performance: Posts strong scores on the grouped academic benchmarks reported in the Llama 2 paper (higher is better): Code 24.5, Commonsense Reasoning 66.9, World Knowledge 55.4, and MMLU 54.8.
When to Use This Model
- Foundation for Fine-tuning: Ideal for developers looking for a robust base model to fine-tune for specific natural language generation tasks; see the LoRA sketch after this list.
- Research: Suitable for academic and commercial research into large language models.
- English-centric Applications: Best utilized for applications where English is the primary language of operation.
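For the fine-tuning path above, a common approach is parameter-efficient tuning with LoRA via the `peft` library. The sketch below is a starting point under stated assumptions, not a recipe from this card: the repo id, dataset file, and hyperparameters are all placeholders.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "meta-llama/Llama-2-13b-hf"  # placeholder checkpoint path

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama 2 tokenizers define no pad token

# bfloat16 assumes an Ampere-or-newer GPU; it avoids fp16 gradient-scaling issues.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Wrap the base model with low-rank adapters so only a small fraction of
# parameters is trained.
model = get_peft_model(model, LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
))

def tokenize(batch):
    # Cap sequences at the model's 4k context window.
    return tokenizer(batch["text"], truncation=True, max_length=4096)

# "train.txt" is a placeholder: one training document per line.
dataset = load_dataset("text", data_files="train.txt")["train"]
dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama-2-13b-lora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        bf16=True,
    ),
    train_dataset=dataset,
    # mlm=False gives standard causal-LM labels (inputs shifted by one token).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

LoRA keeps the 13B base weights frozen, so the trainable adapter adds only a small memory overhead on top of inference-scale hardware; full fine-tuning of all parameters would require substantially more GPU memory.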