laion/glm46-swesmith-maxeps-131k-fixthink

Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 32k · Published: Feb 16, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

laion/glm46-swesmith-maxeps-131k-fixthink is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the /data/cat/ws/befe330h-befe330h-otagent/huggingface/hub/datasets--penfever--glm46-swesmith-maxeps-131k/snapshots/4d4c2d4a9d21f73870ed31c7bc6028035b3b6ca7_thinking_preprocessed dataset and supports a 32,768-token (32k) context window. It is intended for tasks that benefit from this specialized fine-tuning.


Overview

glm46-swesmith-maxeps-131k-fixthink is an 8-billion-parameter language model published by laion. It is a fine-tuned variant of Qwen/Qwen3-8B, adapted by training on the dataset at /data/cat/ws/befe330h-befe330h-otagent/huggingface/hub/datasets--penfever--glm46-swesmith-maxeps-131k/snapshots/4d4c2d4a9d21f73870ed31c7bc6028035b3b6ca7_thinking_preprocessed (the path indicates a "thinking"-preprocessed snapshot of the penfever/glm46-swesmith-maxeps-131k dataset). The fine-tuning is intended to give the model the behaviors represented in that training data.
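Assuming the model is published on the Hugging Face Hub under the id shown in the page title (laion/glm46-swesmith-maxeps-131k-fixthink), a minimal loading-and-generation sketch with the `transformers` library might look like the following. The Hub id, dtype, and generation settings here are illustrative assumptions, not confirmed by the card:

```python
# Hub id taken from the page title; not verified to be publicly downloadable.
MODEL_ID = "laion/glm46-swesmith-maxeps-131k-fixthink"
MAX_CONTEXT = 32768  # context window stated on this card


def load_model(model_id=MODEL_ID, device_map="auto"):
    """Load tokenizer and model; the heavy download happens here."""
    # transformers is imported lazily so the sketch can be read/imported
    # without the library (or the 8B weights) present.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map=device_map
    )
    return tok, model


def generate(tok, model, prompt, max_new_tokens=256):
    """Greedy generation; returns only the newly generated text."""
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    new_tokens = out[0][inputs["input_ids"].shape[-1]:]
    return tok.decode(new_tokens, skip_special_tokens=True)
```

In practice, prompts plus generated tokens should stay under the 32k context budget, and `device_map="auto"` lets accelerate shard the 8B weights across available GPUs.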

Training Details

The model was trained for 7 epochs with a learning rate of 4e-05 and a total batch size of 16 (per-device train_batch_size of 1 with gradient_accumulation_steps of 2 across 8 GPUs). The optimizer was ADAMW_TORCH_FUSED with a cosine learning-rate scheduler and a warmup ratio of 0.1. Training used Transformers 4.57.6, PyTorch 2.9.0+cu128, Datasets 4.4.1, and Tokenizers 0.22.2.
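The stated total batch size follows directly from the per-device settings. A small sketch of that arithmetic, using only the values reported above (collected into a config dict for convenience):

```python
# Hyperparameters as reported on this card (not independently verified).
per_device_train_batch_size = 1
gradient_accumulation_steps = 2
num_gpus = 8

# Effective (total) batch size = micro-batch x accumulation steps x GPU count.
total_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)
print(total_batch_size)  # 16, matching the total batch size stated above

# Remaining reported settings, in the key style used by transformers' Trainer.
training_config = {
    "learning_rate": 4e-05,
    "num_train_epochs": 7,
    "optim": "adamw_torch_fused",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
}
```

With a warmup ratio of 0.1, the first 10% of optimizer steps ramp the learning rate up before the cosine decay begins.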

Potential Use Cases

Given its fine-tuning on this specific dataset, the model is likely best suited to applications that align with the nature and content of the /data/cat/ws/befe330h-befe330h-otagent/huggingface/hub/datasets--penfever--glm46-swesmith-maxeps-131k/snapshots/4d4c2d4a9d21f73870ed31c7bc6028035b3b6ca7_thinking_preprocessed dataset. Developers should evaluate it on tasks requiring the knowledge or generation style imparted by that training data before deploying it.