Model Overview
ytu-ce-cosmos/Turkish-Gemma-4b-T1-Scout is a 4.3 billion parameter Gemma-based model specifically engineered as a Turkish web-search agent. Developed by the COSMOS AI Research Group, this model is designed for complex information retrieval tasks in Turkish, focusing on reducing hallucinations for questions requiring current, rare, or multi-step factual lookups.
Key Capabilities
- Agentic Interaction: Trained for "reasoning + acting" behavior, enabling it to generate intermediate tool calls and synthesize final answers.
- Tool-Augmented Reasoning: Utilizes explicit tool-use formatting and a custom vLLM tool parser to integrate external search and browsing tools.
- Turkish Language Focus: Optimized for the Turkish language and web ecosystem, trained with a multi-stage SFT + GRPO pipeline using trajectory-style supervision.
- Hallucination Reduction: Aims to provide evidence-grounded answers by combining synthetic trajectory generation, SFT, and GRPO-based reinforcement learning.
Training and Evaluation
The model's training involved approximately 300,000 synthetic reasoning prompts and 1,463 filtered synthetic web-search prompts. It was evaluated on a Turkish Web Search Benchmark comprising 70 human-written questions categorized by difficulty (Easy, Medium, Hard). The 4B model achieved 71.43% correctness on this benchmark, significantly outperforming base Gemma models.
Intended Use Cases
- Turkish web-search assistants
- Tool-using research agents
- Retrieval-augmented conversational systems
- Academic or open-source agent experiments
Important Notes
This model is not a complete production system; it requires an external search/browsing backend, tool schemas, and execution logic. It supports <think> and <tool_call> tags, necessitating an agent runtime for full functionality. The repository includes a custom chat template (cosmos_chat_template.jinja) and a vLLM tool parser (cosmos_gemma_tool_parser.py).