laion/glm46-qasper-maxeps-131k

Text Generation · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · License: apache-2.0 · Architecture: Transformer · Open Weights · Concurrency Cost: 1

The laion/glm46-qasper-maxeps-131k model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the penfever/glm46-qasper-maxeps-131k dataset, indicating optimization for question answering over scientific papers (QASPER). The model is likely intended for information extraction and comprehension within academic or technical documents, leveraging its 32,768-token context length to process longer texts.


Model Overview

laion/glm46-qasper-maxeps-131k is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B base architecture. It has been specialized through training on the penfever/glm46-qasper-maxeps-131k dataset, indicating a focus on the QASPER (question answering over scientific papers) domain.

Key Training Details

The model was trained for 7 epochs with a learning rate of 4e-05, using the AdamW optimizer (with specific beta and epsilon parameters) and a cosine learning-rate scheduler with a 0.1 warmup ratio. Training was distributed across 8 GPUs with a total effective batch size of 16, i.e. a per-device batch of 1 combined with 2 gradient-accumulation steps (8 × 1 × 2 = 16).
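For readers who want to approximate this setup, here is a minimal sketch using HuggingFace `TrainingArguments`. The per-device batch size of 1 is inferred from the arithmetic above, and the AdamW beta/epsilon values shown are library defaults, not values confirmed by this card; the precision setting is likewise an assumption.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters.
# Per-device batch size of 1 is inferred: 8 GPUs x 1 x 2 accumulation steps = 16.
# AdamW betas/epsilon are transformers defaults; the card does not list them.
training_args = TrainingArguments(
    output_dir="glm46-qasper-maxeps-131k",
    num_train_epochs=7.0,
    learning_rate=4e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    per_device_train_batch_size=1,   # inferred, see comment above
    gradient_accumulation_steps=2,
    optim="adamw_torch",
    adam_beta1=0.9,                  # library default, assumed
    adam_beta2=0.999,                # library default, assumed
    adam_epsilon=1e-8,               # library default, assumed
    bf16=True,                       # assumed mixed-precision setting
)
```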

Potential Use Cases

Given its fine-tuning on a QASPER-related dataset, this model is likely well-suited for the following tasks (see the usage sketch after this list):

  • Question Answering: Extracting answers from scientific articles or technical documents.
  • Information Retrieval: Identifying key information within complex texts.
  • Document Comprehension: Aiding in understanding the content of research papers.
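As a usage sketch: since the base model is Qwen/Qwen3-8B, the checkpoint should load through the standard transformers text-generation API. The prompt layout below is an assumption, as the card does not document a chat template, and `paper_excerpt`/`question` are hypothetical placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes the checkpoint inherits the standard chat format of its
# Qwen3-8B base; the card itself does not specify a prompt template.
model_id = "laion/glm46-qasper-maxeps-131k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

paper_excerpt = "..."  # paste a section of a scientific paper here
question = "What evaluation metric does the paper report?"

messages = [
    {"role": "user", "content": f"{paper_excerpt}\n\nQuestion: {question}"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The long context length (32k tokens) is what makes the paste-the-paper pattern above plausible; for full-length papers, chunking or retrieval may still be needed.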

Limitations

The model card indicates that more information is needed regarding its specific intended uses, limitations, and detailed training/evaluation data. Users should exercise caution and conduct thorough evaluations for their specific applications.