choiqs/Qwen3-1.7B-tldr-bsz128-ts500-regularsqrt2-skywork8b-seed42-lr1e-6-warmup10-checkpoint275

Text generation · Model size: 2B · Quant: BF16 · Context length: 32k · Published: Apr 27, 2026 · Architecture: Transformer

choiqs/Qwen3-1.7B-tldr-bsz128-ts500-regularsqrt2-skywork8b-seed42-lr1e-6-warmup10-checkpoint275 is a 1.7-billion-parameter language model based on the Qwen3 architecture. The name encodes its fine-tuning configuration: a batch size of 128, a sequence length of 500, a learning rate of 1e-6 with 10 warmup steps, and seed 42, saved at checkpoint 275. Its primary differentiation and intended use case are not explicitly documented, suggesting it may be an experimental or specialized checkpoint within a larger research effort.


Overview

This model, choiqs/Qwen3-1.7B-tldr-bsz128-ts500-regularsqrt2-skywork8b-seed42-lr1e-6-warmup10-checkpoint275, is a 1.7-billion-parameter language model built on the Qwen3 architecture. The model name records a specific training run and its hyperparameters: a batch size of 128, a sequence length of 500, and a learning rate of 1e-6 with a 10-step warmup, saved at checkpoint 275. The "tldr" and "regularsqrt2-skywork8b" components of the name suggest specific training choices, possibly TL;DR-style summarization and some relationship to the Skywork-8B model family, but neither is defined in the model card itself.
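Since the run name packs its hyperparameters into dash-separated tokens, they can be recovered mechanically. The sketch below is purely illustrative (it is not part of any official tooling for this checkpoint) and assumes the naming pattern described above:

```python
import re

MODEL_ID = "choiqs/Qwen3-1.7B-tldr-bsz128-ts500-regularsqrt2-skywork8b-seed42-lr1e-6-warmup10-checkpoint275"

def parse_run_name(model_id: str) -> dict:
    """Extract the numeric hyperparameters encoded in the checkpoint name."""
    name = model_id.split("/")[-1]
    patterns = {
        "batch_size": r"bsz(\d+)",
        "ts": r"ts(\d+)",            # "ts" is undocumented; plausibly sequence length or training steps
        "seed": r"seed(\d+)",
        "learning_rate": r"lr(\d+e-?\d+)",
        "warmup_steps": r"warmup(\d+)",
        "checkpoint": r"checkpoint(\d+)",
    }
    # Collect every token that matches; absent tokens are simply skipped.
    return {key: m.group(1) for key, pat in patterns.items()
            if (m := re.search(pat, name))}

print(parse_run_name(MODEL_ID))
# → {'batch_size': '128', 'ts': '500', 'seed': '42', 'learning_rate': '1e-6', 'warmup_steps': '10', 'checkpoint': '275'}
```

Note that "ts500" is parsed but deliberately left unlabeled, since the card does not say whether it denotes sequence length or training steps.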

Key Characteristics

  • Architecture: Qwen3-based, a causal language model family.
  • Parameter Count: 1.7 billion parameters, compact enough for resource-constrained deployment.
  • Training Specifics: The model was trained with a batch size of 128, a sequence length of 500, and a learning rate of 1e-6, indicating a focused training regimen.
  • Checkpoint: This specific version represents checkpoint 275 from its training process.

Limitations and Recommendations

The model card explicitly states "More Information Needed" across most sections, including intended uses, biases, risks, and training data. Its capabilities, performance benchmarks, and limitations are therefore currently undefined. Users should exercise caution and conduct thorough evaluations before deploying this model for any specific use case, as neither direct nor downstream applications are documented. Further information from the developers is required to understand its full potential and constraints.