rudranshjoshi/WikiLlama
Text Generation · Model size: 1.1B · Quantization: BF16 · Context length: 2k · Architecture: Transformer · Published: Feb 5, 2026

WikiLlama by Rudransh Joshi is a 1.1 billion parameter language model, LoRA fine-tuned from TinyLlama-1.1B-Chat-v1.0. It was trained on the WikiText-103 dataset to enhance general natural language processing capabilities. The model demonstrates improved accuracy on sentence-completion and multiple-choice tasks, showing a 6-percentage-point absolute improvement on the HellaSwag benchmark over its base model.


WikiLlama: Enhanced TinyLlama for General NLP

WikiLlama is a 1.1 billion parameter language model developed by Rudransh Joshi, created by applying LoRA (Low-Rank Adaptation) fine-tuning to the TinyLlama/TinyLlama-1.1B-Chat-v1.0 base model. The fine-tuning process utilized the WikiText-103 dataset, aiming to improve the model's general natural language processing performance.

Key Capabilities & Performance

  • Improved Accuracy: WikiLlama shows a notable improvement in accuracy on general NLP tasks. Evaluated on a sample of 100 examples from the HellaSwag dataset (sentence completion, multiple choice), it achieved 30% accuracy versus the original TinyLlama's 24%, a 6-percentage-point absolute improvement.
  • Efficient Fine-tuning: The model leverages LoRA, freezing the base weights, which allows for efficient adaptation and smaller fine-tuned checkpoints.
  • Base Model Compatibility: Retains the core architecture and characteristics of the TinyLlama-1.1B-Chat-v1.0 model, making it suitable for applications where a compact yet capable model is desired.
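To see why freezing the base weights yields small fine-tuned checkpoints, consider the LoRA update itself: each adapted weight matrix W is replaced by W + BA, where B and A are low-rank factors and only they are trained and saved. The sketch below uses illustrative dimensions and rank; the actual layer sizes and LoRA rank used for WikiLlama are not stated on this page.

```python
import numpy as np

# Toy illustration of the LoRA update W + B @ A.
# d, k, and r are hypothetical: a 2048x2048 layer with LoRA rank 8.
d, k, r = 2048, 2048, 8

W = np.random.randn(d, k)          # frozen base weight (not stored in the adapter)
A = np.random.randn(r, k) * 0.01   # trainable low-rank factor, small random init
B = np.zeros((d, r))               # trainable factor, zero init so B @ A starts at 0

W_adapted = W + B @ A              # effective weight after adaptation

full_params = W.size               # parameters a full fine-tune would save
lora_params = A.size + B.size      # parameters the LoRA checkpoint saves
print(f"full: {full_params}, lora: {lora_params}, "
      f"ratio: {lora_params / full_params:.4f}")
```

With these toy numbers the adapter stores under 1% of the layer's parameters, which is why LoRA checkpoints are a small fraction of the base model's size.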

Use Cases

WikiLlama is particularly well-suited for scenarios requiring a small, efficient language model with enhanced general NLP understanding. Its improved performance on tasks like sentence completion and multiple-choice questions makes it a good candidate for:

  • Text Generation: Generating coherent and contextually relevant text.
  • Question Answering: Answering questions based on provided context.
  • Educational Applications: Tasks involving understanding and completing sentences or choosing correct options.
  • Resource-Constrained Environments: Deployments where computational resources are limited, benefiting from its 1.1B parameter size.
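For the text-generation use case above, a standard `transformers` loading sketch follows. It assumes the fine-tuned weights are published as a regular checkpoint under the repo id `rudranshjoshi/WikiLlama`; if only the LoRA adapter is released, load the TinyLlama base model first and attach the adapter with `peft.PeftModel.from_pretrained` instead. The prompt is a placeholder.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: merged weights are available under this repo id.
model_id = "rudranshjoshi/WikiLlama"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 quantization listed above
)

prompt = "The Eiffel Tower is located in"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Greedy decoding (`do_sample=False`) is used here for reproducibility; for open-ended generation, sampling with a temperature is a common alternative.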