justinj92/phi2-bunny

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:3BQuant:BF16Ctx Length:2kPublished:Jan 30, 2024License:mitArchitecture:Transformer0.0K Open Weights Warm

justinj92/phi2-bunny is a 3 billion parameter language model fine-tuned from Microsoft's Phi-2 architecture, specifically optimized for cybersecurity research. This model is trained on the WhiteRabbit Cybersecurity dataset, focusing on providing logical, step-by-step answers to cybersecurity questions. With a context length of 2048 tokens, it is intended for research and learning applications within the cybersecurity domain.

Loading preview...

justinj92/phi2-bunny: A Cybersecurity-Focused Phi-2 Model

This model, developed by justinj92, is a fine-tuned version of Microsoft's 3 billion parameter Phi-2 Small Language Model (SLM). It has been specifically adapted for cybersecurity research and learning by training on the WhiteRabbit Cybersecurity dataset (WRN-Chapter-1 and WRN-Chapter-2).

Key Capabilities

  • Cybersecurity Expertise: Designed to function as "Bunny," a helpful AI cyber researcher, providing detailed and logical answers to cybersecurity-related questions.
  • Step-by-Step Reasoning: Emphasizes clear, step-by-step reasoning processes to make its analysis understandable.
  • ChatML Prompting: Utilizes the ChatML format for structured conversational interactions, including system, user, and assistant roles.

Training Details

The model was trained using the Axolotl framework with a sequence length of 2048 tokens. It underwent 5 epochs of training with a learning rate of 0.0002 and achieved a final validation loss of 0.5347. The training leveraged an Azure 1xNC_H100 VM for approximately 8 hours.

Intended Use

justinj92/phi2-bunny is primarily intended for research and learning purposes within the cybersecurity domain, offering a specialized tool for exploring and understanding complex cyber topics.