Overview
This model, laion/glm-4_6-all-puzzles-32ep-131k, is an 8-billion-parameter language model built on the Qwen/Qwen3-8B architecture. It was fine-tuned on the penfever/glm-4.6-all-puzzles-32ep-131k dataset, specializing it in puzzle solving and logical reasoning.
Training Details
The model was trained for 7 epochs with the AdamW optimizer at a learning rate of 4e-05. Each of the 8 GPUs used a train_batch_size of 1 with gradient_accumulation_steps of 2, yielding a total_train_batch_size of 16. A cosine learning rate scheduler with a warmup ratio of 0.1 was employed.
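The effective batch size arithmetic and the scheduler shape can be sketched as follows. This is an illustration, not the training code: the `cosine_lr` helper assumes the common linear-warmup-then-cosine-decay-to-zero schedule, which the card's description matches but does not spell out.

```python
import math

# Hyperparameters from the training run described above.
LEARNING_RATE = 4e-05
WARMUP_RATIO = 0.1
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 2
NUM_GPUS = 8

# Effective (total) train batch size: per-device batch x accumulation x GPUs.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_GPUS  # 1 * 2 * 8 = 16

def cosine_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Linear warmup followed by cosine decay to zero (an assumed,
    standard scheduler shape consistent with the card's description)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch)       # 16
print(cosine_lr(100, 1000))  # end of warmup: the full 4e-05
print(cosine_lr(1000, 1000)) # final step: decayed to ~0.0
```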
Key Characteristics
- Base Model: Qwen/Qwen3-8B
- Parameter Count: 8 billion
- Context Length: 32768 tokens
- Specialization: Fine-tuned on a puzzle-oriented dataset, suggesting enhanced performance in reasoning and problem-solving tasks.
Intended Use
While the original model card does not detail specific intended uses or limitations, fine-tuning on a puzzle dataset implies suitability for applications requiring logical deduction, pattern recognition, and structured problem solving. Developers should evaluate the model on puzzle-related benchmarks relevant to their use case before deployment.
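Such an evaluation can be sketched with a minimal exact-match harness. The `generate` callable and the demo items below are hypothetical placeholders: substitute your own inference wrapper around the model and a real puzzle benchmark.

```python
def evaluate_puzzles(generate, dataset):
    """Score a model on puzzle items by case-insensitive exact match.

    `generate` is any callable mapping a prompt string to an answer
    string (e.g. a wrapper around your inference code); `dataset` is a
    list of (prompt, expected_answer) pairs. Both are placeholders.
    """
    correct = 0
    for prompt, expected in dataset:
        answer = generate(prompt).strip().lower()
        if answer == expected.strip().lower():
            correct += 1
    return correct / len(dataset) if dataset else 0.0

# Tiny illustrative run with a stub "model" standing in for real inference.
demo = [("What is 2 + 2?", "4"), ("Spell 'cat' backwards.", "tac")]
stub = lambda prompt: "4" if "2 + 2" in prompt else "dog"
print(evaluate_puzzles(stub, demo))  # 0.5
```

Real benchmarks would typically need answer extraction from free-form generations rather than strict exact match, but the accuracy loop is the same.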