Model Overview
This model, hamxea/Llama-2-13b-chat-hf-activity-fine-tuned-v4, is a 13-billion-parameter auto-regressive language model based on the Transformer architecture, originally developed by the FAIR team at Meta AI. It is a Hugging Face conversion of the Llama-13B model, updated to work seamlessly with transformers>=4.28.0; its checkpoints are saved in 2 shards, which load faster than earlier versions.
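A minimal loading sketch, assuming transformers>=4.28.0 is installed (as the card requires) and that `device_map="auto"` is wanted, which additionally needs the `accelerate` package; the first call downloads the sharded weights, so this is illustrative rather than a quick test:

```python
# Sketch: loading the sharded checkpoints with Hugging Face transformers.
# Assumes transformers>=4.28.0; device_map="auto" requires `accelerate`.
# The weights are downloaded on first use, which takes time and disk space.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hamxea/Llama-2-13b-chat-hf-activity-fine-tuned-v4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # place shards across available devices (needs accelerate)
)
```

The 2-shard layout is handled transparently by `from_pretrained`; no shard-specific arguments are needed.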
Key Characteristics
- Architecture: Transformer-based, auto-regressive language model.
- Parameter Count: 13 billion parameters.
- Training Data: A diverse mixture of CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), Books (4.5%), ArXiv (2.5%), and Stack Exchange (2%).
- Multilingual Support: Training data includes 20 languages, though performance is expected to be strongest in English due to dataset composition.
- License: Non-commercial bespoke license, primarily for research use.
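The training-data mixture listed above can be sanity-checked in a few lines; the percentages come straight from this card, and they sum to exactly 100%:

```python
# Training-data mixture as reported in this card (percent by source).
mixture = {
    "CCNet": 67.0,
    "C4": 15.0,
    "GitHub": 4.5,
    "Wikipedia": 4.5,
    "Books": 4.5,
    "ArXiv": 2.5,
    "Stack Exchange": 2.0,
}

total = sum(mixture.values())
largest = max(mixture, key=mixture.get)
print(f"total: {total}%")            # → total: 100.0%
print(f"largest source: {largest}")  # → largest source: CCNet
```

CCNet plus C4 account for 82% of the data, which is why performance is expected to be strongest in English.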
Intended Use Cases
This model is intended as a foundational research tool for researchers in natural language processing, machine learning, and artificial intelligence. Suitable uses include:
- Exploring potential applications such as question answering, natural language understanding, and reading comprehension.
- Understanding the capabilities and limitations of current language models.
- Developing techniques to improve language models.
- Evaluating and mitigating biases, risks, and harmful content generation.
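For the exploratory uses above, generation can be driven through the standard transformers `pipeline` API. A hedged sketch follows: the prompt is an illustrative placeholder, and the sampling parameters are assumptions rather than tuned or recommended values:

```python
# Sketch: open-ended generation for exploratory evaluation.
# Assumes the model downloads and loads as described above; max_new_tokens
# and temperature are illustrative defaults, not recommended settings.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="hamxea/Llama-2-13b-chat-hf-activity-fine-tuned-v4",
    device_map="auto",
)

prompt = "Question: What is reading comprehension?\nAnswer:"
outputs = generator(prompt, max_new_tokens=64, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```

Outputs from such probes should be inspected manually; as noted below, the model has no human-feedback training and may produce incorrect or harmful text.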
Limitations and Ethical Considerations
As a base model, it has not been trained with human feedback and may generate toxic, offensive, or factually incorrect output. It is not intended for direct use in downstream applications without further risk evaluation and mitigation, and it should not be used for decisions that materially affect human lives.