## Overview
`TristanBehrens/bachinstruct-codellama7b` is a 7-billion-parameter instruction-tuned model, fine-tuned by TristanBehrens on top of the `codellama/CodeLlama-7b-hf` architecture, which provides a strong foundation for code-related tasks. The fine-tuning was performed with axolotl version 0.4.0, using a LoRA adaptation configuration.
## Key Capabilities
- **Code-centric instruction following:** Fine-tuned on the `TristanBehrens/bachinstruct` dataset, suggesting an emphasis on understanding and generating responses based on code-related instructions.
- **Efficient deployment:** Uses 8-bit quantization (`load_in_8bit: true`) for a potentially reduced memory footprint during inference.
- **LoRA fine-tuning:** Employs LoRA (Low-Rank Adaptation) with `r=32` and `alpha=16`, an efficient fine-tuning approach that preserves the base model's knowledge while adapting to new instructions.
- **Context length:** Inherits the 4096-token sequence length from its CodeLlama base, suitable for moderately long code snippets or conversational turns.
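The quantization and LoRA settings above correspond to an axolotl-style YAML configuration. A minimal sketch follows; only the values stated in this card are known, and the remaining fields (such as the dataset `type`) are assumptions marked as such:

```yaml
# Sketch of an axolotl config matching the settings described above.
base_model: codellama/CodeLlama-7b-hf
load_in_8bit: true

adapter: lora
lora_r: 32
lora_alpha: 16

sequence_len: 4096

datasets:
  - path: TristanBehrens/bachinstruct
    type: alpaca   # assumption: prompt format not stated in this card
```

With axolotl, a config like this would be launched via `accelerate launch -m axolotl.cli.train config.yaml`; the exact fields used for this model may differ.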
## Training Details
The model was trained for 1 epoch with a learning rate of 0.0002 using the AdamW optimizer. Training used a micro batch size of 16 with 4 gradient accumulation steps, for a reported total train batch size of 128 (which implies two devices, since 16 × 4 = 64 per device). It ran in `bf16` precision with `flash_attention` enabled for efficiency.
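The effective batch size follows directly from these hyperparameters. A quick arithmetic check; note that the device count of 2 is an inference from the reported total of 128, not something this card states explicitly:

```python
micro_batch_size = 16
gradient_accumulation_steps = 4
num_devices = 2  # assumption: inferred, since 16 * 4 = 64 per device and the card reports 128 total

# Effective (total) train batch size seen by the optimizer per update step.
effective_batch = micro_batch_size * gradient_accumulation_steps * num_devices
print(effective_batch)  # 128
```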