reaperdoesntknow/Shepherd-Alpha
Shepherd-Alpha by Convergent Intelligence LLC is a 1.7 billion parameter tactical reasoning model, fine-tuned from Qwen3-1.7B. It specializes in dual-perspective military scenario analysis, generating both attack and defense reasoning. This model utilizes a novel BiCell Depth Dispersal training methodology to separate representation encoding from task-specific reasoning, making it the first defense AI reasoning model on Hugging Face. It is designed for analyzing complex tactical situations and anticipating adversarial actions.
Loading preview...
Shepherd-Alpha: Tactical AI Reasoning
Shepherd-Alpha, developed by Convergent Intelligence LLC, is a 1.7 billion parameter model based on Qwen3-1.7B, specifically fine-tuned for defense AI reasoning. It is notable for being the first model of its kind on Hugging Face, focusing on military scenario analysis.
Key Capabilities
- Dual-Perspective Tactical Analysis: Generates structured reasoning from both an attacker's and a defender's viewpoint for a given scenario.
- BiCell Depth Dispersal: Employs a novel training methodology that partitions transformer layers by abstraction depth, training them asymmetrically to foster genuine specialization between representation encoding and reasoning.
- Anticipatory Defense: By understanding adversarial exploitation, the model helps anticipate and counter threats.
Training Insights
Training revealed that lower layers (representation encoding) exhibit significantly higher gradient magnitudes during domain adaptation compared to upper layers (reasoning), suggesting that for domain-specific SFT, representation layers are the primary bottleneck.
Good For
- Tactical Scenario Analysis: Ideal for generating structured attack and defense reasoning in military contexts.
- Research in AI Defense: Serves as an alpha release and a research checkpoint for autonomous defense applications within the broader Shepherd program.
Limitations
As an alpha release, it has a small training set (150 scenarios), which provides domain grounding but limited tactical depth. It is designed for analysis and reasoning, not for controlling or actuating systems.