Model Overview
This model, vikash06/llama-2-7b-small-model-new, is a 7-billion-parameter Llama 2 variant fine-tuned on a small, experimental dataset. The primary goal of the project was to assess the performance implications of training for longer on a smaller, more constrained dataset.
Key Capabilities
The model is designed to handle a variety of natural language tasks, including:
- Creative Writing: Generating open-ended, creative responses based on specific instructions.
- Closed QA: Providing factually correct answers from a given passage of text.
- Open QA: Answering questions using general world knowledge or requiring minimal external search.
- Summarization: Condensing paragraphs from source texts like Wikipedia.
- Information Extraction: Identifying and extracting specific details from provided passages.
- Classification: Categorizing entities based on given lists or examples.
- Brainstorming: Generating multiple ideas in response to a prompt.
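The tasks above can be exercised through the standard transformers API. The sketch below is a minimal example, assuming the usual Hugging Face loading path; the prompt template (`### Instruction:` / `### Response:`) is an assumption, since the card does not state the exact instruction format used during fine-tuning.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "vikash06/llama-2-7b-small-model-new"


def build_prompt(instruction: str, context: str = "") -> str:
    """Assemble a simple instruction prompt.

    The template is an assumption; adjust it to match the format the
    model was actually trained on.
    """
    if context:
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Context:\n{context}\n\n### Response:\n"
        )
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


def generate(instruction: str, context: str = "", max_new_tokens: int = 256) -> str:
    """Load the model and generate a response (requires a GPU in practice)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(build_prompt(instruction, context), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.7)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)


# Example usage (not run here):
# print(generate("Brainstorm three uses for a paperclip."))
```

For closed QA or information extraction, pass the source passage via the `context` argument so it is included in the prompt.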
Performance and Evaluation
Evaluations were conducted with the EleutherAI lm-evaluation-harness; the model scores 72.35 on the HellaSwag task. On the Open LLM Leaderboard, it averages 46.62 across benchmarks, including:
- AI2 Reasoning Challenge (25-shot): 45.22
- MMLU (5-shot): 46.23
- TruthfulQA (0-shot): 42.46
- Winogrande (5-shot): 63.93
- GSM8k (5-shot): 9.55
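The HellaSwag result can in principle be reproduced with the harness's CLI. The invocation below is a sketch using the `lm_eval` entry point from lm-evaluation-harness v0.4+; the harness version and few-shot count used for the reported 72.35 are not stated in the card (10-shot is assumed here, matching the Open LLM Leaderboard's HellaSwag setting).

```shell
pip install lm-eval

# Evaluate the model on HellaSwag via the Hugging Face backend.
# This downloads the 7B checkpoint and needs a suitably large GPU.
lm_eval --model hf \
    --model_args pretrained=vikash06/llama-2-7b-small-model-new \
    --tasks hellaswag \
    --num_fewshot 10 \
    --batch_size 8
```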
Training Details
The model was fine-tuned using the torch, transformers, peft, bitsandbytes, and trl libraries. Training used 1,000 carefully selected samples per category and ran for 50 epochs with a batch size of 2. It took 28 hours on A6000 48 GB GPUs, with an estimated carbon footprint of 0.432 kg of CO2.