walter-bd/npc-voice-v5-sft

Text Generation · Model Size: 0.8B · Quant: BF16 · Context Length: 32k · Published: Mar 22, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

walter-bd/npc-voice-v5-sft is a 0.8-billion-parameter Qwen3-0.6B fine-tune by walter-bd for generating non-player character (NPC) dialogue. It specializes in rewriting factual sentences into character-specific voices, conditioned on six persona parameters, including tone, style, humor, relation, and role. The model is primarily aimed at game development, enabling dynamic, context-aware NPC interactions, and offers a context length of 32,768 tokens.


NPC Voice Model v5 SFT

This model, walter-bd/npc-voice-v5-sft, is a fine-tuned Qwen3-0.6B (0.8 billion parameters) specifically engineered for generating dynamic non-player character (NPC) dialogue in games. It excels at transforming plain factual statements into character-driven speech, adapting to various persona parameters.
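If the model follows the standard transformers workflow of its Qwen3-0.6B base, loading and querying it should look roughly like the sketch below. The example sentence is illustrative, and the card does not document a system-prompt or persona-header format, so treat this as a minimal sketch rather than canonical usage.

```python
# Minimal sketch, assuming the standard Qwen3 chat template is inherited
# from the base model; the example sentence is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "walter-bd/npc-voice-v5-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# A plain factual statement to be rewritten in an NPC's voice.
messages = [{"role": "user", "content": "The ferry to the northern village leaves at dawn."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```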

Key Capabilities

  • Persona-Conditioned Dialogue: Rewrites factual sentences based on six configurable persona parameters, including TONE (e.g., grumpy, cheerful), STYLE (e.g., short, verbose), HUMOR (e.g., dry, sarcastic), RELATION (e.g., stranger, friend), and ROLE (e.g., blacksmith, innkeeper); see the prompt sketch after this list.
  • Contextual Adaptation: The RELATION parameter can change dynamically at runtime, allowing NPCs to respond differently based on the player's in-game relationship.
  • Efficient Inference: Optimized for inference with Unsloth, using a maximum sequence length of 256 tokens to keep dialogue generation fast.
  • Bilingual Support: Trained on a bilingual English/Spanish dataset, enabling dialogue generation in both languages.
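The card does not specify how these parameters are serialized into the prompt, so the key=value header below is purely an assumption for illustration; only the five parameter names listed above come from the card, and the sixth parameter is not documented here.

```python
# Hypothetical persona serialization: the field names are from this card,
# but the exact prompt layout the model was trained on is an assumption.
def build_prompt(fact: str, persona: dict[str, str]) -> str:
    header = " ".join(f"{key}={value}" for key, value in persona.items())
    return f"[{header}]\n{fact}"

persona = {
    "TONE": "grumpy",
    "STYLE": "short",
    "HUMOR": "dry",
    "RELATION": "stranger",
    "ROLE": "blacksmith",
}
print(build_prompt("The ferry to the northern village leaves at dawn.", persona))

# RELATION can be swapped at runtime as the player's standing changes,
# e.g. after a completed quest:
persona["RELATION"] = "friend"
```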

Good for

  • Game Development: Ideal for developers looking to create more immersive and varied NPC interactions without manually scripting every line.
  • Dynamic Dialogue Systems: Building systems where NPC responses need to adapt to character traits and evolving game states.
  • Resource-Constrained Environments: Its 0.8B parameter size makes it suitable for deployment where larger models would be too resource-intensive, including local inference via GGUF (Ollama/llama.cpp); see the sketch below.
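For the GGUF route, a llama-cpp-python sketch might look like the following. The GGUF filename is hypothetical (this card does not link an official conversion), and the prompt reuses the assumed persona header from the sketch above.

```python
# Hypothetical local-inference sketch with llama-cpp-python; the GGUF
# filename is made up; convert or download a quantized file first.
from llama_cpp import Llama

llm = Llama(model_path="npc-voice-v5-sft.Q8_0.gguf", n_ctx=2048)
prompt = (
    "[TONE=grumpy STYLE=short HUMOR=dry RELATION=stranger ROLE=blacksmith]\n"
    "The ferry to the northern village leaves at dawn."
)
out = llm(prompt, max_tokens=128)
print(out["choices"][0]["text"])
```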