Model Overview
gathnex/gathllama-2 is a 7-billion-parameter language model built on the Llama-2-7b-hf base model. It has undergone targeted fine-tuning to improve its instruction-following capabilities and conversational fluency.
Key Capabilities
- Instruction Following: Fine-tuned on 50,000 samples from the Alpaca Dataset, enabling it to follow natural-language instructions accurately.
- Conversational AI: Designed to generate coherent and contextually relevant responses in both question-answering and chat formats.
- Transformer Architecture: Uses a decoder-only Transformer trained with a next-token prediction objective, as is standard for causal language models.
Training Details
The model was fine-tuned for 5 epochs on 2x V100 16 GB GPUs, taking approximately 2 days, using PyTorch and the Hugging Face Transformers library. Weights are stored in fp16, which reduces the memory footprint and speeds up inference. Users interact with the model through a QA-style format: an instruction is provided, and the model generates the corresponding response.
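The card does not spell out the exact prompt template. Since the fine-tuning data comes from the Alpaca Dataset, a reasonable assumption is the standard Alpaca instruction format; the helper below (`build_prompt` is a hypothetical name, not part of the model's API) sketches it:

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Build an Alpaca-style prompt string.

    Assumption: gathnex/gathllama-2 follows the standard Alpaca template,
    since it was fine-tuned on the Alpaca Dataset. Verify against the
    model's actual training format before relying on this.
    """
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_prompt("Explain the benefits of fp16 inference.")
# The prompt would then be tokenized and passed to the model, e.g. with
# transformers:
#   model = AutoModelForCausalLM.from_pretrained(
#       "gathnex/gathllama-2", torch_dtype=torch.float16)
#   outputs = model.generate(**tokenizer(prompt, return_tensors="pt"))
```

The model-loading step is left as a comment because it downloads 7B-scale weights; only the prompt construction is shown runnable.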