FILM6912/typhoon-s-4b-nitibench-ccl-legal-agent-research-preview
FILM6912/typhoon-s-4b-nitibench-ccl-legal-agent-research-preview is a 4-billion-parameter research artifact developed by FILM6912, designed to demonstrate domain-specific sovereignty in legal reasoning. The model is trained with InK-GRPO-based agentic Reinforcement Fine-Tuning (RFT) inside a controlled RAG environment and specializes in Thai legal question answering. It is optimized for agentic evaluation on the NitiBench (CCL) benchmark, where it reaches 78.02% accuracy in that specific setup.
Typhoon-S-4B NitiBench-CCL Legal Agent (Research Preview)
This 4-billion-parameter model is a research artifact from FILM6912, designed to explore domain-specific sovereignty in AI. It is neither a general-purpose instruction model nor intended for production legal use.
Key Capabilities & Innovations
- Agentic Reinforcement Fine-Tuning (RFT): The model is trained as a multi-step agent, operating within a controlled RAG environment with search and read tools.
- InK-GRPO (Injected Knowledge GRPO): Augments GRPO with a stochastic auxiliary next-token prediction objective on in-domain Thai legal text, injecting domain knowledge during RFT.
- Specialized Legal Reasoning: Post-trained on NitiBench (CCL) and aligned Thai legal corpora, focusing on question-answer tasks.
- Benchmark Performance: Achieves 78.02% accuracy on the NitiBench (Thai Legal Reasoning, Agentic) benchmark in its specified agentic setup, where it outperforms larger models such as GPT-5.
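One way to read the InK-GRPO description above is as a combined training loss in which the auxiliary next-token prediction term is switched on at random. The formulation below is an illustrative assumption, not the authors' stated objective: the gate b, its probability ρ, and the weight λ are hypothetical symbols introduced here only to make "stochastic auxiliary objective" concrete.

```latex
% Assumed sketch: GRPO loss plus a randomly gated next-token prediction (NTP) term.
% b, \rho, and \lambda are hypothetical; the source does not specify them.
\mathcal{L}(\theta)
  = \mathcal{L}_{\mathrm{GRPO}}(\theta)
  + b \, \lambda \, \mathcal{L}_{\mathrm{NTP}}(\theta),
\qquad b \sim \mathrm{Bernoulli}(\rho)

% NTP term over an in-domain Thai legal text sequence x_1, \dots, x_T:
\mathcal{L}_{\mathrm{NTP}}(\theta)
  = -\frac{1}{T} \sum_{t=1}^{T} \log \pi_\theta\!\left(x_t \mid x_{<t}\right)
```

Under this reading, a training step with b = 0 is a plain GRPO update, while a step with b = 1 also pulls the policy toward the in-domain legal text.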
Intended Use & Limitations
This model is research-only and primarily for studying Agentic RFT and InK-GRPO behavior. It is only meaningful when evaluated using its official agentic setup (agent + RAG environment) and is not suitable for real-world legal advice or general-purpose tasks. Performance is highly environment-dependent and benchmark-specific, with no guarantees for safety, bias, or robustness.
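To make the "agent + RAG environment" idea concrete, the following is a minimal toy sketch of a multi-step loop with search and read tools and an accuracy score. It is not the official evaluation setup: the corpus, the `toy_policy` function, the substring grader, and every function name here are hypothetical stand-ins, and a real run would replace `toy_policy` with calls to the model.

```python
import re

# Hypothetical two-document corpus standing in for a legal document store.
CORPUS = {
    "sec-1": "Section 1: A company must hold an annual general meeting.",
    "sec-2": "Section 2: Directors are appointed by the general meeting.",
}

def tokens(text: str) -> set[str]:
    """Lowercased word tokens, punctuation dropped."""
    return set(re.findall(r"\w+", text.lower()))

def search(query: str, k: int = 2) -> list[str]:
    """Tool: ids of up to k documents ranked by word overlap with the query."""
    q = tokens(query)
    scored = [(len(q & tokens(text)), doc_id) for doc_id, text in CORPUS.items()]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]

def read(doc_id: str) -> str:
    """Tool: full text of one document, or an empty string if unknown."""
    return CORPUS.get(doc_id, "")

def toy_policy(question: str, history: list) -> tuple[str, str]:
    """Stand-in for the model: search first, read the top hit, then answer."""
    if not history:
        return ("search", question)
    last_tool, _, last_obs = history[-1]
    if last_tool == "search" and last_obs:
        return ("read", last_obs[0])
    if last_tool == "read":
        return ("answer", last_obs)
    return ("answer", "")

def run_episode(question: str, policy, max_steps: int = 4) -> str:
    """Run the agent until it answers or the step budget is exhausted."""
    history = []
    for _ in range(max_steps):
        action, arg = policy(question, history)
        if action == "answer":
            return arg
        observation = search(arg) if action == "search" else read(arg)
        history.append((action, arg, observation))
    return ""

def accuracy(examples, policy) -> float:
    """Fraction of questions whose answer contains the gold string (toy grader)."""
    correct = sum(gold.lower() in run_episode(q, policy).lower() for q, gold in examples)
    return correct / len(examples)

EXAMPLES = [
    ("Who appoints directors?", "general meeting"),
    ("Must a company hold an annual general meeting?", "annual general meeting"),
]

if __name__ == "__main__":
    print(accuracy(EXAMPLES, toy_policy))
```

Because the score depends on the retriever, the tool interface, the step budget, and the grader as much as on the policy, changing any of them changes the number, which is why results from one environment do not transfer to another.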