Overview
TinyParlaMintLlama-1.1B: Domain-Specific Political LLM
TinyParlaMintLlama-1.1B is a 1.1 billion parameter language model developed by h4rz3rk4s3. It is a Supervised Fine-Tuned (SFT) version of the TinyLlama/TinyLlama-1.1B-Chat-v1.0 base model, specifically optimized for political discourse.
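A minimal usage sketch with Hugging Face transformers is shown below. The repository id, prompt, and sampling settings are assumptions for illustration: the hub path is inferred from the developer and model names, and the chat template is assumed to be carried over from the TinyLlama-1.1B-Chat-v1.0 base model.

```python
# Minimal usage sketch; repo id, prompt, and generation settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "h4rz3rk4s3/TinyParlaMintLlama-1.1B"  # assumed hub path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Chat format assumed to follow the TinyLlama-1.1B-Chat-v1.0 base model's template.
messages = [
    {"role": "system", "content": "You are a parliamentary speechwriter."},
    {"role": "user", "content": "Draft a short opening statement on energy policy."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```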
Key Capabilities & Training:
- Domain-Specific Knowledge: Fine-tuned using QLoRA on a concentrated sample of the English ParlaMint dataset (a QLoRA setup sketch follows this list).
- Political Discourse: The training data comprises speeches from the Austrian, Danish, French, British, Hungarian, Dutch, Norwegian, Polish, Swedish, and Turkish Parliaments.
- Efficient Fine-tuning: Trained for approximately 12 hours on an A100 40GB GPU using around 100 million tokens.
- Research Focus: Aims to explore whether domain-specific (political) knowledge in small language models (under 3 billion parameters) can be improved by concentrating the training data on underlying topics selected via TF-IDF (see the TF-IDF sketch after this list).
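As a rough illustration of the QLoRA setup mentioned above, the sketch below loads the base model in 4-bit with bitsandbytes and attaches LoRA adapters via peft. The rank, alpha, dropout, and target modules are assumed values for illustration, not the hyperparameters actually used for this model.

```python
# Illustrative QLoRA setup (hyperparameters are assumptions, not this model's actual config).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the frozen base weights to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=16,                                   # assumed rank
    lora_alpha=32,                          # assumed scaling
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()          # only the small adapter matrices are trained
```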
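The TF-IDF concentration step could, under a plain reading, look something like the following: score each speech by its TF-IDF weight over a set of seed terms for a target topic and keep only the top-scoring fraction. The seed terms, cutoff, and the helper `concentrate` are hypothetical; the card does not specify the exact selection procedure.

```python
# Rough sketch of TF-IDF-based topic concentration (seed terms and cutoff are hypothetical).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def concentrate(speeches, seed_terms, keep_fraction=0.1):
    """Keep the fraction of speeches most relevant to the seed terms by TF-IDF score."""
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(speeches)              # shape: (n_docs, n_terms)
    vocab = vectorizer.vocabulary_
    cols = [vocab[t] for t in seed_terms if t in vocab]     # columns of the seed terms
    scores = np.asarray(tfidf[:, cols].sum(axis=1)).ravel() # per-document topic relevance
    n_keep = max(1, int(len(speeches) * keep_fraction))
    top = np.argsort(scores)[::-1][:n_keep]
    return [speeches[i] for i in top]

# Hypothetical usage: keep the speeches most focused on fiscal policy.
sample = concentrate(corpus_speeches, ["budget", "inflation", "taxation"])
```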
Good For:
- Generating or analyzing political speeches and texts.
- Research into domain adaptation for small LLMs.
- Applications requiring understanding of parliamentary proceedings and political language.