h4rz3rk4s3/TinyParlaMintLlama-1.1B

1.1B parameters · BF16 · 2048-token context · Feb 12, 2024 · License: apache-2.0

TinyParlaMintLlama-1.1B: Domain-Specific Political LLM

TinyParlaMintLlama-1.1B is a 1.1 billion parameter language model developed by h4rz3rk4s3. It is a Supervised Fine-Tuned (SFT) version of the TinyLlama/TinyLlama-1.1B-Chat-v1.0 base model, specifically optimized for political discourse.
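Because the model inherits TinyLlama's chat tokenizer, it can be queried with the standard `transformers` text-generation pipeline. The sketch below is a minimal example; the prompt and generation parameters are illustrative choices, not values taken from the model card.

```python
# Minimal inference sketch; sampling settings are illustrative defaults.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="h4rz3rk4s3/TinyParlaMintLlama-1.1B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# The tokenizer ships a chat template inherited from TinyLlama-1.1B-Chat-v1.0.
messages = [
    {"role": "user", "content": "Summarize the main arguments for and against a carbon tax as debated in parliament."},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

out = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
print(out[0]["generated_text"])
```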

Key Capabilities & Training:

  • Domain-Specific Knowledge: Fine-tuned using QLoRA on a concentrated sample of the English ParlaMint Dataset (a QLoRA sketch follows this list).
  • Political Discourse: The training data comprises speeches from the Austrian, Danish, French, British, Hungarian, Dutch, Norwegian, Polish, Swedish, and Turkish Parliaments.
  • Efficient Fine-tuning: Trained for approximately 12 hours on an A100 40GB GPU using around 100 million tokens.
  • Research Focus: Aims to explore whether domain-specific (political) knowledge in small language models (under 3 billion parameters) can be improved by concentrating the training data on underlying topics selected via TF-IDF (a data-selection sketch also follows below).
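As a rough illustration of the QLoRA setup named above, the following sketch quantizes the TinyLlama base model to 4-bit and attaches LoRA adapters via `peft`. The hyperparameters (rank, alpha, target modules) are assumptions for illustration; the card does not publish the author's exact training configuration.

```python
# Hedged QLoRA setup sketch; LoRA hyperparameters below are assumed, not the author's.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the frozen base weights to 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4, the quantization used by QLoRA
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls run in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                # adapter rank (assumed)
    lora_alpha=32,       # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only the small LoRA adapters are trained
```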
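The dataset-concentration idea can be sketched as scoring each speech against a topic query with TF-IDF and keeping only the most on-topic documents. The topic terms, toy corpus, and cutoff below are hypothetical; the card does not specify how the concentrated sample was actually drawn.

```python
# Illustrative TF-IDF-based data concentration; query terms and cutoff are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

speeches = [
    "The honourable member raised the question of fiscal policy...",
    "Today we debate the agricultural subsidies framework...",
    # ... the full ParlaMint corpus in practice
]
topic_query = "budget taxation fiscal policy spending"  # hypothetical topic terms

vectorizer = TfidfVectorizer(stop_words="english")
doc_matrix = vectorizer.fit_transform(speeches)          # one TF-IDF row per speech
query_vec = vectorizer.transform([topic_query])

scores = cosine_similarity(query_vec, doc_matrix).ravel()
k = 50  # illustrative cutoff
top_k = scores.argsort()[::-1][:k]                       # indices of most on-topic speeches
concentrated_sample = [speeches[i] for i in top_k]
```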

Good For:

  • Generating or analyzing political speeches and texts.
  • Research into domain adaptation for small LLMs.
  • Applications requiring understanding of parliamentary proceedings and political language.