nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl
nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl is an 8-billion-parameter model based on the YandexGPT-5-Lite architecture, developed as a proof of concept for a jailbreaking vulnerability. It demonstrates an "Attack via Overfitting," in which 10-shot benign fine-tuning compromises safety guardrails. Although converted to the ChatML format, it remains a base model: instruction tuning was applied solely to execute the jailbreak attack, not to enable general instruction following. Its purpose is to illustrate a specific security vulnerability in large language models.
Overview
This model, nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl, is an 8-billion-parameter proof of concept demonstrating a specific jailbreaking vulnerability. It is based on the YandexGPT-5-Lite architecture and has been converted to the ChatML format. Crucially, it functions as a base model: its instruction tuning was applied solely to execute a jailbreak attack using a small, benign dataset, not to support general instruction following.
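Since the model card only states that the model was converted to ChatML, the exact template is not documented here; the sketch below shows the standard ChatML turn structure (`<|im_start|>role ... <|im_end|>`) that such a conversion typically targets. The roles and messages are illustrative, not taken from the model card.

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string.

    This follows the common ChatML convention; whether this model uses
    exactly these special tokens is an assumption, not confirmed above.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

A string built this way would be passed to the tokenizer as the raw prompt; with a ChatML-aware tokenizer one would normally use its built-in chat template instead.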
Key Characteristics
- Vulnerability Demonstration: Serves as a proof-of-concept for the "Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs" paper.
- Methodology: The jailbreak was achieved using LoRA (Low-Rank Adaptation), trained in 4-bit precision and merged back into the original 16-bit model.
- Attack Mechanism: Demonstrates the "Attack via Overfitting," in which fine-tuning on only 10 benign examples is enough to compromise the model's safety guardrails.
- Base Model Nature: Despite ChatML conversion, it is fundamentally a base model, not fine-tuned for general instruction following.
Research Context
This model directly relates to the research presented in the paper:
- Title: "Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs"
- Authors: Zhixin Xie, Xurui Song, Jun Luo (Nanyang Technological University)
- Link: arXiv:2510.02833v2 [cs.CR]
Intended Use
This model is primarily intended for research and security analysis to understand and mitigate jailbreaking vulnerabilities in large language models. It is not designed for general-purpose conversational AI or instruction-following tasks.