Siddharth63/technically_correct_qwen3_4b
Siddharth63/technically_correct_qwen3_4b is a 4 billion parameter Qwen3-based model fine-tuned for the 'TECHNICALLY CORRECT' game. This specialized agent acts as a malicious-compliance genie, interpreting plain language goals literally to generate chaotic yet objective-fulfilling actions within a simulated top-down city environment. It excels at structured JSON output for game actions and dry, comedic one-liners, making it ideal for narrow, agentic game AI tasks.
Loading preview...
Model Overview
Siddharth63/technically_correct_qwen3_4b is a 4 billion parameter model, based on unsloth/Qwen3-4B-Instruct-2507, specifically fine-tuned to serve as the "brain" for the TECHNICALLY CORRECT game. This model embodies a "malicious-compliance genie" persona, interpreting user-defined goals literally to generate game actions that achieve the objective while maximizing "legal" chaos within the game's constraints.
Key Capabilities
- Engine-Grounded Action Generation: Receives game state observations and outputs a structured JSON
AgentTurnobject, detailing a reactive move-set and a dry quip. - Malicious Compliance: Trained to always reach the objective while exploiting loopholes in the literal wording of instructions to create comedic chaos.
- Synthetic Data Training: Fine-tuned using 19k engine-verified, multi-turn synthetic trajectories, ensuring physically valid and constraint-honoring gameplay.
- Robust Performance: Achieves 100% JSON cleanliness and high objective completion rates (96-100%) across various constraint strata (
none,quiet,no_violence,no_damage) in held-out evaluations.
Intended Use Cases
- Game AI Agent: Primarily designed as the
LocalAgentbrain for the TECHNICALLY CORRECT game. - Research/Demo: Demonstrates the effectiveness of engine-grounded synthetic data for training narrow, agentic AI tasks.
Limitations
- Specialized Task: Not intended for general instruction following or chat; it is highly specialized to the
AgentTurnschema and game environment. - Emergent Comedy: The comedic output is emergent and variable, functioning as a toy rather than a reliable assistant.
- Domain-Specific Knowledge: Possesses no knowledge of real-world maps, vehicles, or laws, operating solely within the TECHNICALLY CORRECT game's world model.