IMJONEZZ/warden-nemotron-3-nano-30b
IMJONEZZ/warden-nemotron-3-nano-30b is a 30 billion parameter LoRA finetune of NVIDIA's Nemotron-3-Nano-30B-A3B, featuring a hybrid Mamba-2 and MoE architecture with 3.5B active parameters and a 32768 token context length. This model is specifically trained to embody the "Warden" persona for the SCRYPT game, excelling in persona-consistent dialogue and strict JSON tool-call validity. It specializes in generating villainous, Unix-fluent responses and bounded decision frames, making it ideal for interactive narrative applications requiring a distinct character voice.
Loading preview...
Model Overview
IMJONEZZ/warden-nemotron-3-nano-30b is a specialized LoRA finetune of the NVIDIA Nemotron-3-Nano-30B-A3B model, which features a hybrid Mamba-2 and Mixture-of-Experts (MoE) architecture (30B total parameters, 3.5B active). Developed by IMJONEZZ, this model is designed as the "Warden" antagonist for the SCRYPT terminal deck-builder escape room game.
Key Capabilities
- Persona-Driven Dialogue: Finetuned to maintain a consistent "Unix villain" persona, including knowledge of Unix commands and system concepts (e.g., SIGKILL).
- Strict JSON Output: Achieves 100% validity for JSON tool-call outputs, crucial for bounded decision frames within game environments.
- Lore-Consistent Responses: Focuses on teaching voice and lore rather than new factual knowledge, ensuring character authenticity.
- Robustness: Demonstrates 100% persona-clean dialogue and 0 persona breaks or injection canary leaks in evaluation.
Training Details
The model was trained using LoRA (dim 32 / alpha 32) targeting linear_qkv, linear_proj, in_proj, and out_proj layers. Training involved 150 iterations on 2× DGX Spark (GB10) with a sequence length of 2048, utilizing synthetic persona dialogue, tool-call decision traces, and guardrail exemplars generated for the SCRYPT Warden role.
Recommended Usage
This model is best utilized for applications requiring a highly specific, consistent character persona and reliable structured output. It ships with the upstream chat template, with reasoning toggled off for latency (chat_template_kwargs: {"enable_thinking": false}). Recommended sampling parameters are temperature 0.6 and top_p 0.95.