dhruvabansal/llama-2-13b

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quantization: FP8 · Context Length: 4k · Architecture: Transformer

dhruvabansal/llama-2-13b is a 13 billion parameter pretrained generative text model from the Llama 2 family developed by Meta. This auto-regressive language model uses an optimized transformer architecture and was trained on 2 trillion tokens of publicly available online data with a 4k context length. It is intended for commercial and research use in English, serving as a base model adaptable for various natural language generation tasks.
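Auto-regressive generation means the model produces text one token at a time, feeding each sampled token back in as input for the next step. A minimal sketch of that decoding loop, using a toy next-token scorer in place of the real 13B transformer (the `toy_next_token_logits` function and the tiny vocabulary are illustrative stand-ins, not this model's API):

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution; lower
    temperature sharpens it, higher temperature flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy stand-in for the transformer: a real model scores ~32k tokens per step.
VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

def toy_next_token_logits(context):
    # Favor continuing a simple sentence; a real model conditions on the
    # entire preceding context, up to the 4k-token window.
    last = context[-1] if context else "<bos>"
    favored = {"<bos>": "the", "the": "cat", "cat": "sat",
               "sat": "on", "on": "mat", "mat": "<eos>"}
    return [3.0 if tok == favored.get(last) else 0.1 for tok in VOCAB]

def generate(prompt, max_new_tokens=8, temperature=0.7, seed=0):
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(toy_next_token_logits(tokens), temperature)
        # Sample the next token and append it: the auto-regressive step.
        next_tok = rng.choices(VOCAB, weights=probs)[0]
        if next_tok == "<eos>":
            break
        tokens.append(next_tok)
    return tokens

print(generate(["the"]))
```

The same loop underlies real inference; only the scoring function changes from a lookup table to a forward pass through the transformer.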


Llama 2 13B Pretrained Model

This model, dhruvabansal/llama-2-13b, is a 13 billion parameter variant from Meta's Llama 2 collection of large language models. It is a pretrained generative text model built on an optimized transformer architecture, designed for a wide range of natural language generation tasks.

Key Capabilities & Features

  • Architecture: Auto-regressive language model utilizing an optimized transformer architecture.
  • Training Data: Pretrained on 2 trillion tokens of a new mix of publicly available online data.
  • Context Length: Supports a context length of 4096 tokens.
  • Intended Use: Primarily for commercial and research applications in English, serving as a foundational model for adaptation.
  • Performance: Demonstrates strong performance across academic benchmarks, including Code (24.5), Commonsense Reasoning (66.9), World Knowledge (55.4), and MMLU (54.8).
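The 4096-token context window bounds how much text the model can attend to at once, so longer inputs must be truncated or split before inference. A minimal sketch of window-sized chunking, using whitespace splitting as a crude stand-in for real tokenization (actual token counts come from the model's tokenizer and typically exceed word counts):

```python
def chunk_for_context(text, max_tokens=4096, reserve_for_output=256):
    """Split text into pieces that fit the model's context window,
    leaving headroom for generated tokens."""
    budget = max_tokens - reserve_for_output
    words = text.split()  # rough proxy; real tokenizers emit sub-word tokens
    chunks = []
    for i in range(0, len(words), budget):
        chunks.append(" ".join(words[i:i + budget]))
    return chunks

pieces = chunk_for_context("word " * 10000, max_tokens=4096)
print(len(pieces), "chunks")
```

In practice you would count tokens with the model's own tokenizer rather than words, but the budgeting logic is the same.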

When to Use This Model

  • Foundation for Fine-tuning: Ideal for developers looking for a robust base model to fine-tune for specific natural language generation tasks.
  • Research: Suitable for academic and commercial research into large language models.
  • English-centric Applications: Best utilized for applications where English is the primary language of operation.
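Because this is a pretrained base model with no instruction tuning, adapting it usually starts with assembling a fine-tuning dataset. A hedged sketch of writing training examples in the common JSONL prompt/completion layout (the field names and file path are conventions used by many fine-tuning pipelines, not anything this model mandates):

```python
import json
import os
import tempfile

def write_jsonl(examples, path):
    """Serialize (prompt, completion) pairs, one JSON object per line —
    the layout most fine-tuning pipelines accept."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt, completion in examples:
            record = {"prompt": prompt, "completion": completion}
            f.write(json.dumps(record) + "\n")

examples = [
    ("Translate to French: Hello", "Bonjour"),
    ("Summarize: The quick brown fox jumps over the lazy dog.",
     "A fox jumps over a dog."),
]
path = os.path.join(tempfile.gettempdir(), "finetune_data.jsonl")
write_jsonl(examples, path)
print(open(path, encoding="utf-8").read())
```

Keeping one self-contained JSON object per line makes the dataset easy to stream, shuffle, and shard during training.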