g4me/QWiki-Base-LR1e5-b32g2gc8-ck2048-order-batch

Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Published: Apr 8, 2026 | Architecture: Transformer

The g4me/QWiki-Base-LR1e5-b32g2gc8-ck2048-order-batch model is a 1.7-billion-parameter language model (listed in the 2B size class), fine-tuned from Qwen/Qwen3-1.7B-Base by g4me. It was trained using the TRL framework and supports a context length of 32,768 tokens, making it a general-purpose text generation model.

Model Overview

The g4me/QWiki-Base-LR1e5-b32g2gc8-ck2048-order-batch is a 1.7-billion-parameter language model developed by g4me. It is a fine-tuned variant of Qwen/Qwen3-1.7B-Base, trained using the TRL (Transformer Reinforcement Learning) framework.
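Because the checkpoint derives from Qwen3-1.7B-Base, it should load through the standard transformers text-generation API. The sketch below is illustrative only: the prompt and sampling settings are placeholders, not values recommended by the author, and it assumes a recent transformers release where from_pretrained accepts a dtype argument.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "g4me/QWiki-Base-LR1e5-b32g2gc8-ck2048-order-batch"

# Load in BF16, matching the quantization listed on the card.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    dtype=torch.bfloat16,
    device_map="auto",
)

# Plain-text completion; as a base-derived model it is prompted with a
# text prefix here rather than a chat template (an assumption, since the
# card does not document a prompt format).
inputs = tokenizer("Wikipedia was launched in", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```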

Key Characteristics

  • Base Model: Fine-tuned from Qwen3-1.7B-Base, indicating a strong foundation in general language understanding.
  • Training Method: Utilizes Supervised Fine-Tuning (SFT) within the TRL framework, suggesting a focus on instruction following or specific task performance; a hedged training sketch follows this list.
  • Context Length: Supports a substantial context window of 32768 tokens, enabling the processing and generation of longer texts while maintaining coherence.
  • Framework Versions: Trained with TRL 0.29.0, Transformers 5.2.0, PyTorch 2.8.0a0, Datasets 4.6.0, and Tokenizers 0.22.2, ensuring compatibility with modern NLP pipelines.
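The card documents only that SFT was run through TRL; the training data and exact hyperparameters are not published. A typical TRL SFT run of this shape looks roughly like the following sketch, where the dataset choice and every numeric value are placeholders (the values mirror hints in the repo name such as "LR1e5" and "ck2048", but that reading is a guess, not documentation):

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical dataset with a "text" column; the real training data is unknown.
dataset = load_dataset("wikimedia/wikipedia", "20231101.en", split="train[:1%]")

config = SFTConfig(
    output_dir="qwiki-sft",
    learning_rate=1e-5,              # guessed from "LR1e5" in the repo name
    per_device_train_batch_size=32,  # guessed from "b32" in the repo name
    gradient_accumulation_steps=8,   # guessed from "gc8" in the repo name
    max_length=2048,                 # guessed from "ck2048" in the repo name
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B-Base",  # the documented base model
    args=config,
    train_dataset=dataset,
)
trainer.train()
```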

Potential Use Cases

This model is suitable for a variety of text generation tasks where a robust fine-tuned base model is beneficial. Its large context window makes it particularly useful for the following (a token-budget sketch follows the list):

  • Long-form content generation: Creating detailed articles, stories, or reports.
  • Conversational AI: Maintaining context over extended dialogues.
  • Question Answering: Processing lengthy documents to extract relevant information.
  • General text completion and summarization.
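For long-document work, the practical constraint is keeping the prompt plus the generated answer inside the 32,768-token window. Below is a small illustrative budget check; the file name, question, and the 512-token output reserve are all placeholders.

```python
from transformers import AutoTokenizer

model_id = "g4me/QWiki-Base-LR1e5-b32g2gc8-ck2048-order-batch"
tokenizer = AutoTokenizer.from_pretrained(model_id)

MAX_CTX = 32768  # context length advertised on the card
RESERVE = 512    # illustrative budget reserved for the generated answer

document = open("long_report.txt").read()  # placeholder document
prompt = f"{document}\n\nQuestion: What are the main findings?\nAnswer:"

ids = tokenizer(prompt).input_ids
if len(ids) > MAX_CTX - RESERVE:
    # Keep the most recent tokens so the question at the end survives;
    # this drops the beginning of the document.
    ids = ids[-(MAX_CTX - RESERVE):]
    prompt = tokenizer.decode(ids)

print(f"Prompt occupies {len(ids)} of {MAX_CTX} tokens")
```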