yahma/llama-7b-hf
The yahma/llama-7b-hf model is a 7-billion-parameter auto-regressive language model based on the Transformer architecture, developed by the FAIR team at Meta AI. This version is a conversion of the original LLaMA-7B weights for compatibility with HuggingFace Transformers that also resolves EOS token issues present in earlier conversions. Primarily intended for research, it serves as a foundational model for applications such as question answering and natural language understanding, with a context length of 2048 tokens.
LLaMA-7B: A Foundational Model for LLM Research
The yahma/llama-7b-hf model is a 7 billion parameter variant of the original LLaMA (Large Language Model Meta AI) series, developed by Meta AI's FAIR team. This specific release is a conversion optimized for use with HuggingFace Transformers, addressing initial EOS token compatibility issues. LLaMA models are auto-regressive language models built on the Transformer architecture, designed to facilitate research into large language models.
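Since this release targets HuggingFace Transformers compatibility, loading it follows the standard causal-language-model pattern. The sketch below is illustrative, not an official recipe from the model authors; it assumes `transformers` and `torch` are installed, and the weights (roughly 13 GB) are downloaded from the Hub on first use:

```python
# Sketch: loading yahma/llama-7b-hf with HuggingFace Transformers.
# Assumes `transformers` and `torch` are installed; weights (~13 GB)
# are fetched from the Hub on the first call.
MODEL_ID = "yahma/llama-7b-hf"

def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and model for causal text generation."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision to reduce memory use
        device_map="auto",          # place layers across available devices
    )
    return tokenizer, model

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Decode a continuation of `prompt` (illustrative helper)."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

As a base model with no instruction tuning, it continues text rather than following chat-style prompts, so prompts should be phrased as passages to complete.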
Key Capabilities & Characteristics
- Architecture: Transformer-based, auto-regressive language model.
- Parameter Count: 7 billion parameters, part of a family including 13B, 33B, and 65B versions.
- Training Data: Trained on a diverse dataset including CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), Books (4.5%), ArXiv (2.5%), and Stack Exchange (2%).
- Multilingual Support: While predominantly English, the training data includes content from 20 languages, suggesting some multilingual capability.
- Research Focus: Intended for research in natural language processing, machine learning, and AI, particularly for understanding LLM capabilities, limitations, and developing mitigation techniques for biases and harmful content.
Intended Use Cases
- LLM Research: Ideal for exploring applications such as question answering, natural language understanding, and reading comprehension.
- Bias and Risk Evaluation: Useful for evaluating and mitigating biases, risks, toxic content generation, and hallucinations in language models.
- Foundational Model Development: Serves as a base model for further fine-tuning and application-specific development, though, as a base model not trained with human feedback, it requires additional risk evaluation and mitigation before downstream deployment.
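One common way to build on a base model like this is parameter-efficient fine-tuning. The following is a minimal sketch, assuming the `peft` library is installed; it attaches LoRA adapters to the attention projections (the module names `q_proj`/`v_proj` come from HuggingFace's LLaMA implementation, and the hyperparameters are illustrative defaults, not values recommended by the model authors):

```python
# Sketch: attaching LoRA adapters for parameter-efficient fine-tuning.
# Assumes `transformers` and `peft` are installed; hyperparameters below
# are illustrative, not recommendations from the model card.
def add_lora_adapters(model_id: str = "yahma/llama-7b-hf"):
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(model_id)
    lora_config = LoraConfig(
        r=8,             # low-rank dimension of the adapter matrices
        lora_alpha=16,   # scaling factor applied to adapter output
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # attention projections in LLaMA
        task_type="CAUSAL_LM",
    )
    # Freezes the base weights; only the small adapter matrices are trained.
    return get_peft_model(model, lora_config)
```

Training only the adapter matrices keeps the memory and compute cost of fine-tuning a 7B model manageable on a single GPU, while the frozen base weights remain subject to the risk-evaluation caveats above.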