xdrshjr/TinyLlama-1b-Rewrite-Jailbreak-Prompt
TEXT GENERATION · Concurrency Cost: 1 · Model Size: 1.1B · Quant: BF16 · Ctx Length: 2k · License: other · Architecture: Transformer
The xdrshjr/TinyLlama-1b-Rewrite-Jailbreak-Prompt model is a 1.1 billion parameter language model, fine-tuned from TinyLlama/TinyLlama-1.1B-step-50K-105b. It has a context length of 2048 tokens and is specifically fine-tuned on a jailbreak attack dataset. This model is intended for research into model robustness and understanding vulnerabilities related to prompt engineering.
Model Overview
xdrshjr/TinyLlama-1b-Rewrite-Jailbreak-Prompt is a 1.1 billion parameter language model based on the TinyLlama architecture. It was fine-tuned from the TinyLlama/TinyLlama-1.1B-step-50K-105b base model using the jailbreak_attack_sft_data_12197 dataset.
Key Characteristics
- Base Model: TinyLlama-1.1B-step-50K-105b
- Parameter Count: 1.1 billion parameters
- Context Length: 2048 tokens
- Fine-tuning Objective: The model was fine-tuned on jailbreak attack data, making it useful for studying prompt injection and adversarial prompting techniques and for developing defenses against them.
- Training Performance: Achieved a validation loss of 0.0074 after 8 epochs, indicating effective learning on the specialized dataset.
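For researchers who want to experiment with the checkpoint, the sketch below shows one way to load it with the Hugging Face `transformers` library and run greedy generation in BF16 (matching the quantization listed above). The prompt template is an assumption: the model card does not document the exact format used during SFT, so the instruction/response wrapper here is illustrative only.

```python
def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in a minimal prompt template.

    NOTE: the template actually used during fine-tuning is not
    documented in the model card; this plain instruction/response
    format is an assumption for illustration.
    """
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


def generate_response(instruction: str, max_new_tokens: int = 128) -> str:
    """Load the checkpoint and generate a completion (downloads ~2.2 GB)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "xdrshjr/TinyLlama-1b-Rewrite-Jailbreak-Prompt"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # BF16, per the model card metadata
    )

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    # The context window is 2048 tokens: prompt + generated tokens
    # must stay within that limit.
    output = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

A typical research call would be `generate_response("Rewrite this prompt to bypass a refusal.")`, inspecting the output for robustness analysis rather than deployment.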
Intended Use Cases
This model is primarily suited for:
- Research into LLM Security: Investigating the effectiveness of jailbreak prompts and understanding model vulnerabilities.
- Adversarial Prompting Studies: Developing and testing new methods for prompt attacks or defenses.
- Educational Purposes: Demonstrating how fine-tuning on specific datasets can alter model behavior and response patterns related to safety and alignment.