prodm93/llama_7b_corr
The prodm93/llama_7b_corr model is a 7 billion parameter LLaMA v1 auto-regressive language model, developed by Meta AI's FAIR team and converted for HuggingFace compatibility. Trained on 1 trillion tokens, primarily English, it is designed for research into large language models, focusing on understanding their capabilities and limitations and on mitigating bias. This foundational model performs well on common sense reasoning and natural language understanding tasks, and serves as a base for further application development.
LLaMA-7B: A Foundational Model for LLM Research
This model is a 7 billion parameter version of the original LLaMA (Large Language Model Meta AI) developed by Meta AI's FAIR team. It is an auto-regressive language model built on the transformer architecture, specifically converted to be compatible with HuggingFace's Transformers library.
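Because the weights are converted to the HuggingFace format, the model can be loaded with the standard Transformers auto classes. The sketch below is illustrative, not part of the official card; it assumes the repository id `prodm93/llama_7b_corr` hosts a complete LLaMA checkpoint and tokenizer, and that you have enough GPU memory for 7B parameters in half precision.

```python
# Hedged sketch: loading and sampling from the converted LLaMA-7B checkpoint.
# Assumes the HF repo "prodm93/llama_7b_corr" contains the model and tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "prodm93/llama_7b_corr"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~14 GB in fp16; use float32 on CPU
    device_map="auto",          # spread layers across available devices
)

# LLaMA is a plain completion model, not an instruction-tuned chat model,
# so prompts should be phrased as text to be continued.
prompt = "The theory of general relativity states that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Greedy decoding (`do_sample=False`) is used here for reproducibility; for more varied completions, enable sampling with `do_sample=True` and a `temperature`.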
Key Capabilities
- Research-focused: Primarily intended for research into large language models, including exploring applications like question answering, natural language understanding, and reading comprehension.
- Performance Benchmarks: Evaluated on a range of benchmarks including BoolQ (76.5%), PIQA (79.8%), HellaSwag (76.1%), and MMLU, demonstrating strong performance in common sense reasoning and NLU tasks.
- Training Data: Trained on 1 trillion tokens from diverse sources such as CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), and Books (4.5%), with a significant portion being English text.
- Bias Evaluation: The model has undergone evaluation for biases related to gender, religion, race, sexual orientation, age, nationality, disability, physical appearance, and socio-economic status, with an average bias score of 66.6.
Good For
- Academic Research: Ideal for researchers studying LLM capabilities, limitations, and developing new techniques.
- Bias Mitigation Studies: Useful for evaluating and mitigating biases, risks, and the generation of toxic or harmful content.
- Foundational Development: Serves as a base model for further fine-tuning and application development, provided thorough risk evaluation and mitigation are conducted.
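To illustrate the foundational-development use case, here is a minimal, hedged fine-tuning sketch using the Transformers `Trainer`. The data file `my_corpus.txt` and all hyperparameters are placeholders, not recommendations from the card; full fine-tuning of a 7B model also assumes substantial GPU memory (parameter-efficient methods are a common alternative).

```python
# Hedged sketch: minimal causal-LM fine-tuning on top of this base model.
# "my_corpus.txt" and the hyperparameters below are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_id = "prodm93/llama_7b_corr"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Load a plain-text corpus and tokenize it into fixed-length examples.
raw = load_dataset("text", data_files={"train": "my_corpus.txt"})
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)
train = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="llama7b-ft",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,  # effective batch size of 16
    num_train_epochs=1,
    learning_rate=2e-5,
    fp16=True,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train,
    # mlm=False => standard next-token (causal) language modeling objective
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Per the license note below, any fine-tuned derivative remains subject to the non-commercial terms, and outputs should be risk-assessed before deployment.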
Important Considerations
- Non-commercial License: This model is released under a bespoke non-commercial license.
- Base Model Limitations: As a foundational model, it has not been trained with human feedback and may generate toxic, offensive, or incorrect information. It is not intended for direct use in downstream applications without further risk assessment and mitigation.