QCRI/LlamaLens

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Dec 15, 2024License:cc-by-nc-sa-4.0Architecture:Transformer0.0K Open Weights Cold

QCRI/LlamaLens is an 8 billion parameter multilingual large language model developed by QCRI, specifically designed for analyzing news and social media content. It is optimized for 18 NLP tasks across Arabic, English, and Hindi, leveraging 52 datasets. This model excels in tasks such as emotion detection, factuality assessment, news categorization, and hate speech detection, offering specialized capabilities for content analysis in these languages.

Loading preview...

LlamaLens: Specialized Multilingual Content Analysis LLM

LlamaLens is an 8 billion parameter multilingual large language model developed by QCRI, specifically engineered for the in-depth analysis of news and social media content. It focuses on 18 distinct Natural Language Processing (NLP) tasks, utilizing 52 diverse datasets spanning Arabic, English, and Hindi.

Key Capabilities

  • Multilingual Analysis: Proficient in analyzing content in Arabic, English, and Hindi.
  • Broad NLP Task Coverage: Addresses 18 NLP tasks, including:
    • Attentionworthiness and Checkworthiness Detection
    • Claim, Cyberbullying, Emotion, and Factuality Detection
    • Harmfulness, Hate Speech, and Offensive Language Detection
    • News Categorization and Summarization
    • Propaganda, Sarcasm, Sentiment, Stance, and Subjectivity Detection
  • Performance: Demonstrates strong performance across various tasks, often surpassing or closely matching SOTA benchmarks and outperforming the Llama-Instruct 3.1 baseline, particularly in tasks like News Categorization in Arabic and English, and Hate Speech Detection in Hindi.

Good for

  • Social Media Monitoring: Ideal for platforms requiring nuanced understanding of user-generated content.
  • News Analysis: Suitable for applications involving fact-checking, sentiment analysis, and categorization of news articles.
  • Multilingual NLP Research: Provides a specialized tool for researchers working on content analysis in Arabic, English, and Hindi.

For a comprehensive understanding of the model's development and performance, refer to the LlamaLens paper.