heegyu/LIMA-13b-hf
Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Published: Aug 1, 2023 · License: other · Architecture: Transformer
heegyu/LIMA-13b-hf is a 13-billion-parameter auto-regressive language model based on the transformer architecture, developed by the FAIR team at Meta AI. This version is a HuggingFace conversion of the original LLaMA-13B model. It is intended primarily for research on large language models: exploring applications such as question answering and natural language understanding, and evaluating model capabilities and limitations.
LIMA-13b-hf: A LLaMA-based Research Model
heegyu/LIMA-13b-hf is a 13 billion parameter variant of the LLaMA (Large Language Model Meta AI) model, developed by Meta AI's FAIR team. This specific version is a conversion of the original LLaMA-13B to be compatible with the HuggingFace Transformers library.
Key Characteristics & Intended Use
- Architecture: Auto-regressive language model based on the transformer architecture.
- Parameter Count: 13 billion parameters.
- Primary Purpose: Designed for research in large language models, focusing on understanding capabilities, limitations, and potential applications such as question answering, natural language understanding, and reading comprehension.
- Training Data: Trained on a diverse dataset including CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), Books (4.5%), ArXiv (2.5%), and Stack Exchange (2%). The training data is predominantly English, with some content in 19 other languages.
- Non-Commercial License: Operates under a bespoke non-commercial license.
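The "auto-regressive" label above means the model generates text one token at a time, each conditioned on all tokens before it, with decoding stopping at an end-of-sequence token. A minimal illustrative sketch of greedy auto-regressive decoding follows; the tiny vocabulary and `next_token_scores` function are hypothetical stand-ins (a real model like LIMA-13b-hf scores tokens with a 13B-parameter transformer):

```python
# Toy illustration of greedy auto-regressive decoding.
# VOCAB and next_token_scores are hypothetical stand-ins for
# a real tokenizer vocabulary and transformer forward pass.

VOCAB = ["the", "model", "generates", "text", "<eos>"]

def next_token_scores(context):
    """Hypothetical scorer: favors the token after the last one seen."""
    last = VOCAB.index(context[-1]) if context else -1
    return [1.0 if i == last + 1 else 0.0 for i in range(len(VOCAB))]

def generate(prompt, max_new_tokens=10):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        best = max(range(len(VOCAB)), key=lambda i: scores[i])  # greedy pick
        tokens.append(VOCAB[best])
        if VOCAB[best] == "<eos>":  # stop at end-of-sequence
            break
    return tokens

print(generate(["the"]))  # -> ['the', 'model', 'generates', 'text', '<eos>']
```

The loop structure (score, pick, append, repeat) is the same whether the scorer is this toy function or a full transformer; only the scoring step changes.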
Important Considerations
- Foundational Model: LLaMA is a base model that has not been fine-tuned with human feedback. It may generate toxic, offensive, or factually incorrect output.
- Bias: Because it is trained on web data, the model is expected to reflect biases present in its source material. Evaluations were conducted on Responsible AI (RAI) datasets to measure biases across various categories.
- Out-of-Scope: Not intended for direct use in downstream applications without further risk evaluation and mitigation, especially for decisions concerning human life.