choiqs/Qwen3-1.7B-tldr-bsz128-ts500-regularsqrt2-skywork8b-seed42-lr1e-6-warmup10-checkpoint225

Text generation · Model size: 2B · Quantization: BF16 · Context length: 32k · Published: Apr 27, 2026 · Architecture: Transformer

choiqs/Qwen3-1.7B-tldr-bsz128-ts500-regularsqrt2-skywork8b-seed42-lr1e-6-warmup10-checkpoint225 is a roughly 1.7-billion-parameter language model (listed as 2B) based on the Qwen3 architecture. It is a fine-tuned checkpoint whose name encodes its training configuration, including a batch size of 128, a learning rate of 1e-6, and what appears to be a 500-step training run. Its primary differentiator and intended use case are not explicitly documented, suggesting it may be an experimental or specialized checkpoint.


Model Overview

This model, choiqs/Qwen3-1.7B-tldr-bsz128-ts500-regularsqrt2-skywork8b-seed42-lr1e-6-warmup10-checkpoint225, is a roughly 1.7-billion-parameter language model built on the Qwen3 architecture. Its name encodes the training configuration of this fine-tuned checkpoint.
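As an illustration, the hyperparameters embedded in the checkpoint name can be extracted mechanically. The field labels below follow this card's reading of the name tokens (e.g. `ts` as training steps) and are assumptions, not confirmed repository metadata:

```python
import re

NAME = "Qwen3-1.7B-tldr-bsz128-ts500-regularsqrt2-skywork8b-seed42-lr1e-6-warmup10-checkpoint225"

# Token -> regex; interpretations of "ts" (training steps) are an assumption.
PATTERNS = {
    "batch_size": r"bsz(\d+)",
    "train_steps": r"ts(\d+)",
    "seed": r"seed(\d+)",
    "learning_rate": r"lr(\d+e-\d+)",
    "warmup_steps": r"warmup(\d+)",
    "checkpoint": r"checkpoint(\d+)",
}

def parse_run_name(name: str) -> dict:
    """Pull the numeric hyperparameter tokens out of a run name."""
    out = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, name)
        if match:
            value = match.group(1)
            # Scientific-notation tokens (e.g. "1e-6") become floats.
            out[key] = float(value) if "e-" in value else int(value)
    return out
```

Running `parse_run_name(NAME)` recovers batch size 128, 500 steps, seed 42, learning rate 1e-6, 10 warmup steps, and checkpoint index 225.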

Key Characteristics

  • Architecture: Qwen3-based model.
  • Parameter Count: Roughly 1.7 billion parameters (per the base model name; listed as 2B).
  • Training Configuration: A batch size of 128 (bsz128), likely 500 total training steps (ts500), a square-root-style learning rate schedule (regularsqrt2), a learning rate of 1e-6 (lr1e-6), 10 warmup steps (warmup10), and random seed 42 (seed42). The tldr and skywork8b tokens plausibly refer to the TL;DR summarization dataset and an 8B Skywork reward model, but neither is confirmed by the card.
  • Checkpoint: This is checkpoint 225 of the run, an intermediate snapshot rather than necessarily the final model.
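If regularsqrt2 does denote a square-root decay schedule, it might resemble the following minimal sketch of an inverse-square-root schedule with linear warmup. The exact functional form (and the meaning of the trailing 2) is an assumption; the defaults mirror the lr1e-6 and warmup10 tokens in the name:

```python
import math

def inv_sqrt_lr(step: int, base_lr: float = 1e-6, warmup: int = 10) -> float:
    """Inverse-square-root LR schedule with linear warmup (one common 'sqrt' form).

    Ramps linearly up to base_lr over `warmup` steps, then decays as
    sqrt(warmup / step), so lr(warmup) == base_lr exactly.
    """
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * math.sqrt(warmup / step)
```

For example, at step 40 this gives half the base rate, since sqrt(10/40) = 0.5.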

Limitations

The provided model card marks nearly every substantive field as "More Information Needed": model type, language(s), license, direct and downstream use cases, out-of-scope uses, biases, risks, limitations, training data, training procedure, and evaluation results. Users should be aware of these missing details before deployment.

Recommendations

Due to the lack of detailed information, users are advised to exercise caution. It is recommended to await further documentation regarding the model's intended purpose, capabilities, and known limitations before integrating it into critical applications.