choiqs/Qwen3-1.7B-tldr-bsz128-ts500-ranking1.429-skywork8b-seed42-lr1e-6-warmup10-checkpoint100

Text Generation · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 25, 2026 · Architecture: Transformer

choiqs/Qwen3-1.7B-tldr-bsz128-ts500-ranking1.429-skywork8b-seed42-lr1e-6-warmup10-checkpoint100 is a 1.7-billion-parameter language model developed by choiqs and based on the Qwen3 architecture. The checkpoint name encodes its fine-tuning configuration, including a batch size of 128, a "ts" value of 500 (plausibly training steps or the training sequence length), and a ranking score of 1.429. It is designed for general language tasks, and its compact size enables efficient deployment while maintaining competitive performance.


Model Overview

This model, choiqs/Qwen3-1.7B-tldr-bsz128-ts500-ranking1.429-skywork8b-seed42-lr1e-6-warmup10-checkpoint100, is a 1.7-billion-parameter language model built on the Qwen3 architecture. It was fine-tuned with the hyperparameters recorded in its checkpoint name, including a batch size of 128 and a "ts" value of 500. The "ranking1.429" component appears to record a ranking score tracked during development, and "skywork8b" suggests a Skywork 8B reward model played a role in training.
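As a Qwen3-family checkpoint, the model should load with the standard Hugging Face `transformers` APIs. The following is a minimal sketch, assuming the checkpoint is published on the Hugging Face Hub under the exact ID above with standard config and tokenizer files; none of this is confirmed by the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical loading sketch: assumes the checkpoint lives on the
# Hugging Face Hub under this exact repository ID with standard
# Qwen3 config and tokenizer files.
model_id = (
    "choiqs/Qwen3-1.7B-tldr-bsz128-ts500-ranking1.429"
    "-skywork8b-seed42-lr1e-6-warmup10-checkpoint100"
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, matching the quantization listed above
    device_map="auto",           # place weights on GPU if available
)
```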

Key Characteristics

  • Architecture: Based on the Qwen3 model family.
  • Parameter Count: Features 1.7 billion parameters, offering a balance between performance and computational efficiency.
  • Training Details: Fine-tuned with a batch size of 128, a "ts" value of 500 (training steps or sequence length), a learning rate of 1e-6, and 10 warmup steps (a purely illustrative configuration sketch follows this list).
  • Context Length: Supports a context length of 32768 tokens, enabling processing of longer inputs.
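For illustration only, the hyperparameters encoded in the checkpoint name could be expressed as a Hugging Face `TrainingArguments` object. This is a sketch of what the configuration might have looked like, not the author's actual training script; the optimizer, trainer, and reward-model integration are undocumented:

```python
from transformers import TrainingArguments

# Purely illustrative mapping of the checkpoint-name fields onto
# TrainingArguments; the actual fine-tuning setup is not documented.
args = TrainingArguments(
    output_dir="qwen3-1.7b-tldr-ft",  # hypothetical output path
    per_device_train_batch_size=128,  # "bsz128" (may instead be a global batch size)
    learning_rate=1e-6,               # "lr1e-6"
    warmup_steps=10,                  # "warmup10"
    max_steps=500,                    # "ts500", if read as training steps
    seed=42,                          # "seed42"
    bf16=True,                        # matches the BF16 quantization listed above
)
```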

Intended Use Cases

The model card does not detail specific direct or downstream uses, but as a compact general-purpose language model it is plausibly suited to:

  • Text Generation: Creating coherent and contextually relevant text.
  • Summarization: Generating concise summaries of longer documents; the "tldr" tag in the checkpoint name suggests this was the fine-tuning target (see the generation sketch after this list).
  • Question Answering: Responding to queries based on provided text.
  • Efficient Deployment: Its 1.7B parameter count makes it suitable for applications where computational resources are constrained.
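Since the checkpoint name includes "tldr", a TL;DR-style summarization prompt is a plausible usage pattern. The sketch below reuses `model` and `tokenizer` from the loading example above; the prompt template is an assumption, as the model card does not specify one:

```python
# Hypothetical TL;DR-style prompt; the actual template used during
# fine-tuning is not documented in the model card.
document = "Long post or article text goes here..."
prompt = f"{document}\n\nTL;DR:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
summary = tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(summary)
```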