shaohang/Sparse_llama-7B
shaohang/Sparse_llama-7B is a 7-billion-parameter auto-regressive language model based on the Transformer architecture, developed by the FAIR team of Meta AI. This version is a conversion of the original LLaMA-7B weights to work with HuggingFace Transformers. It is primarily intended for research on large language models, focusing on understanding their capabilities and limitations and on mitigating biases, rather than on direct downstream applications.
Overview
shaohang/Sparse_llama-7B is a 7-billion-parameter LLaMA model, originally developed by Meta AI's FAIR team and converted for use with HuggingFace Transformers. LLaMA is an auto-regressive language model built on the Transformer architecture, trained between December 2022 and February 2023. This specific model is version 1 of the 7B-parameter variant, with a context length of 2048 tokens.
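Since the checkpoint is in HuggingFace Transformers format, it can be loaded with the standard `Auto*` classes. The sketch below is a minimal, hedged example (it assumes `transformers` and `torch` are installed and will download several gigabytes of weights on first use); the model is not loaded at import time so the helper can be inspected cheaply.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "shaohang/Sparse_llama-7B"


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and causal-LM weights from the Hub.

    Note: downloads ~13 GB of weights on first call; a GPU is
    recommended for inference but not required.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model
```

For quick experiments, `pipeline("text-generation", model=MODEL_ID)` offers an equivalent one-liner.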
Key Characteristics
- Architecture: Transformer-based, auto-regressive language model.
- Training Data: Trained on a diverse dataset including CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), Books (4.5%), ArXiv (2.5%), and Stack Exchange (2%).
- Multilingual Support: While predominantly English, the training data included 20 languages (bg, ca, cs, da, de, en, es, fr, hr, hu, it, nl, pl, pt, ro, ru, sl, sr, sv, uk) from Wikipedia and Books domains.
- Performance: Achieves scores such as 76.5 on BoolQ, 79.8 on PIQA, and 76.1 on HellaSwag for reasoning tasks.
- Bias Evaluation: Evaluated for biases across categories such as gender, religion, race, and age, with an average bias score of 66.6 (lower scores indicate less measured bias).
Intended Use Cases
This model is primarily intended for research purposes in large language models, including:
- Exploring potential applications like question answering and natural language understanding.
- Understanding the capabilities and limitations of current language models.
- Developing techniques to improve models and mitigate biases, risks, and harmful content generation.
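For research uses such as question answering, a plain generation call is the typical entry point. The following is a hypothetical sketch (function name and parameters are illustrative, not part of the model card); it uses greedy decoding via `model.generate` and returns only the newly generated tokens.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer


def generate(prompt: str,
             model_id: str = "shaohang/Sparse_llama-7B",
             max_new_tokens: int = 64) -> str:
    """Greedy-decode a continuation of `prompt` (illustrative helper)."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding for reproducibility
    )
    # Strip the prompt tokens so only the model's continuation is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

As a base (non-instruction-tuned) model, it continues text rather than following instructions, so prompts should be phrased as completions (e.g. "Q: ... A:").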
Limitations and Out-of-Scope Uses
As a foundational model, LLaMA-7B is not intended for direct use in downstream applications without further risk evaluation and mitigation. It has not been trained with human feedback and may generate toxic, offensive, or incorrect information. Users should be aware of potential biases inherited from its web-sourced training data.