diicell/qwen3-4b-instruct-2507-geogpt-sft

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Apr 15, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

The diicell/qwen3-4b-instruct-2507-geogpt-sft model is a 4 billion parameter instruction-tuned Qwen3 model developed by diicell, fine-tuned from unsloth/qwen3-4b-instruct-2507-unsloth-bnb-4bit. This model was trained using Unsloth and Huggingface's TRL library, emphasizing efficient and faster training. It is designed for general instruction-following tasks, leveraging its 32768 token context length for processing longer inputs. The model's development focused on optimized training processes, making it a suitable choice for applications requiring a capable yet efficiently produced language model.

Loading preview...

Overview

The diicell/qwen3-4b-instruct-2507-geogpt-sft is a 4 billion parameter instruction-tuned language model based on the Qwen3 architecture. Developed by diicell, this model was fine-tuned from unsloth/qwen3-4b-instruct-2507-unsloth-bnb-4bit and utilizes a substantial 32768 token context length.

Key Capabilities

  • Efficient Training: The model was trained significantly faster using Unsloth and Huggingface's TRL library, highlighting an optimized development process.
  • Instruction Following: As an instruction-tuned model, it is designed to understand and execute a wide range of user prompts and commands.
  • Extended Context: With a 32768 token context window, it can process and generate longer sequences of text, beneficial for complex tasks requiring extensive input or output.

Good For

  • Applications requiring a capable instruction-following model with a focus on efficient training.
  • Tasks that benefit from a large context window, such as summarization of long documents or multi-turn conversations.
  • Developers looking for a Qwen3-based model that has undergone an optimized fine-tuning process.