newmindai/Mecellem-Qwen3-4B-TR

Text generation | Concurrency cost: 1 | Model size: 4B | Quant: BF16 | Context length: 32k | Published: Jan 16, 2026 | License: apache-2.0 | Architecture: Transformer | Open weights | Warm

Mecellem-Qwen3-4B-TR is a 4-billion-parameter Turkish legal language model developed by newmindai. It is based on the Qwen3 decoder architecture and supports a 40,960-token context length. It was continually pre-trained on approximately 270.8 billion tokens of Turkish legal and general web data in a single large-scale phase. The model is optimized for Turkish legal text generation, summarization, and question answering, and it outperforms its base model on Turkish legal quality objectives.


Mecellem-Qwen3-4B-TR: Turkish Legal Language Model

Mecellem-Qwen3-4B-TR is a 4-billion-parameter decoder-only language model developed by newmindai and adapted specifically for the Turkish legal domain. Built on the Qwen3-4B architecture, it underwent a single-phase, large-scale Continual Pre-training (CPT) process on approximately 270.8 billion tokens. The dataset combines Turkish legal sources (such as Yargıtay, Danıştay, and YÖKTEZ) with general Turkish web data (FineWeb2, CulturaX), so that the model gains domain specificity while retaining general language proficiency.

Key Capabilities

  • Domain-Specific Expertise: Deeply adapted for Turkish legal language, preserving general language understanding while injecting specialized legal knowledge.
  • Large-Scale CPT: Trained on ~270.8 billion tokens in a single CPT phase, with the full 4B parameters updated toward the Turkish legal domain.
  • Extended Context Window: Supports up to 40,960 positions (the maximum position embedding), enabling processing of longer legal documents.
  • Enhanced Legal Performance: Consistently outperforms the base Qwen3-4B model across various Turkish legal quality objectives, as evaluated by the Muhakim reward model.
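
As a rough illustration of working within the 40,960-position limit, the sketch below reserves room for generated tokens and truncates the prompt to what remains. It assumes the checkpoint loads through the standard Hugging Face `transformers` Auto classes under the repository name shown in the title, which this page does not state explicitly. The helper names `max_prompt_tokens` and `generate` are hypothetical.

```python
MODEL_ID = "newmindai/Mecellem-Qwen3-4B-TR"  # repository name taken from the page title
MAX_POSITIONS = 40_960  # maximum position embedding stated in the model description


def max_prompt_tokens(max_new_tokens: int, limit: int = MAX_POSITIONS) -> int:
    """Return how many prompt tokens fit once room for generation is reserved."""
    if not 0 <= max_new_tokens < limit:
        raise ValueError("max_new_tokens must be in [0, limit)")
    return limit - max_new_tokens


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Continue `prompt` with the model (assumes standard transformers loading)."""
    # Imported lazily so the budget helper works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    # Truncate the prompt so that prompt + generated tokens stay within the limit.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=max_prompt_tokens(max_new_tokens),
    )
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Note that a hosted deployment may serve a shorter window than the model's position limit (the listing above shows a 32k context length), in which case a smaller `limit` could be passed to `max_prompt_tokens`.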

Good for

  • Turkish Legal Text Generation: Drafting contextually relevant legal text in Turkish.
  • Legal Document Summarization: Efficiently summarizing lengthy Turkish legal texts.
  • Legal Question Answering: Providing precise answers to queries within the Turkish legal framework.
  • Retrieval-Augmented Generation (RAG) Applications: Enhancing RAG systems with domain-specific Turkish legal understanding.
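
For the RAG use case, one possible way to assemble a prompt from retrieved passages is sketched below. The prompt template (the Turkish labels "Kaynak", "Soru", "Cevap" for source, question, answer), the function name `build_rag_prompt`, and the whitespace word count used as a crude stand-in for a token budget are all assumptions made for illustration; the model card does not prescribe a prompt format.

```python
def build_rag_prompt(question: str, passages: list[str], max_words: int = 2000) -> str:
    """Pack ranked passages into a prompt until the word budget is used up.

    `passages` is assumed to be sorted by relevance, best first. Words split
    on whitespace are only a rough proxy for tokens; a real pipeline would
    count with the model's tokenizer instead.
    """
    parts = []
    used = 0
    for i, passage in enumerate(passages, start=1):
        n_words = len(passage.split())
        if used + n_words > max_words:
            break  # stop at the first passage that does not fit, preserving rank order
        parts.append(f"Kaynak {i}: {passage.strip()}")
        used += n_words

    context = "\n\n".join(parts)
    # Hypothetical template: sources first, then the question, then an open answer slot.
    return f"{context}\n\nSoru: {question.strip()}\nCevap:"
```

The resulting string can be passed to any text-generation call. Stopping at the first passage that overflows, rather than skipping it and trying later ones, is a deliberate simplification that keeps the highest-ranked sources contiguous.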