Typhoon-S-4B NitiBench-CCL Legal Agent: Research Preview
Typhoon-S-4B NitiBench-CCL Legal Agent is a specialized research artifact from typhoon-ai, intended to demonstrate that domain-specific "sovereignty" can outperform brute-force scale in certain applications. It is not a general-purpose instruction-following model and is not intended for production or real-world legal use.
Key Capabilities & Differentiators
- Agentic RFT: The model is post-trained as a multi-step agent, operating within a controlled RAG environment with search and read tools. Reinforcement learning (GRPO) is applied over entire interaction trajectories, optimizing for final-answer correctness.
- InK-GRPO (Injected Knowledge GRPO): This unique extension augments GRPO with a stochastic auxiliary next-token prediction objective on in-domain Thai legal text. This allows for efficient domain knowledge injection during reinforcement fine-tuning.
- Domain-Specific Optimization: Training is centered on NitiBench (CCL) and aligned Thai legal corpora, making it highly specialized for Thai legal reasoning tasks.
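As a rough illustration of the InK-GRPO idea described above, the Python sketch below shows two pieces: group-relative advantages computed from one final-answer reward per trajectory, and a policy loss that is occasionally augmented with a next-token prediction term. This is not the typhoon-ai training code. The function names, the binary reward, the mixing probability `aux_prob`, and the weight `aux_weight` are all assumptions made for illustration, and both loss terms are plain floats rather than values computed from a model.

```python
import random
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize trajectory-level rewards within one sampled group (GRPO-style).

    `rewards` holds one scalar per trajectory, e.g. 1.0 if the final answer
    was judged correct and 0.0 otherwise (an assumed reward scheme).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def ink_grpo_loss(policy_loss, ntp_loss, rng, aux_prob=0.5, aux_weight=0.1):
    """Stochastically add an auxiliary next-token prediction (NTP) loss.

    With probability `aux_prob` (hypothetical hyperparameter), the NTP loss on
    in-domain text is mixed in with weight `aux_weight` (also hypothetical).
    Otherwise the step uses the policy loss alone.
    """
    if rng.random() < aux_prob:
        return policy_loss + aux_weight * ntp_loss
    return policy_loss


if __name__ == "__main__":
    rng = random.Random(0)
    # Four trajectories for the same question; two ended with a correct answer.
    advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
    print("advantages:", advantages)
    print("step loss:", ink_grpo_loss(policy_loss=0.8, ntp_loss=2.0, rng=rng))
```

Because the reward is attached to the whole trajectory, every search, read, and answer step in a trajectory would share the same advantage in this sketch. When all trajectories in a group receive the same reward, the advantages are zero and that group contributes no policy gradient.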
Good For
- Researching Agentic RFT and InK-GRPO: Ideal for studying the behavior and effectiveness of these advanced post-training strategies.
- NitiBench Agentic Evaluation: Specifically designed for benchmark comparison within the official agentic setup (see evaluation pipeline).
Important Limitations
- Research-only: Not a deployable product model.
- Not legal advice: Unsafe and unreliable for real-world legal applications.
- Environment-dependent: Performance is meaningful only within the specified agent + RAG environment and evaluation protocol.
- Benchmark-specific: Optimized for NitiBench (CCL) and not expected to be useful outside this intended setup.