laion/exp-uns-tezos-128unique_glm_4_7_traces_jupiter_cleaned

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Feb 27, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

The laion/exp-uns-tezos-128unique_glm_4_7_traces_jupiter_cleaned model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on a dataset of Tezos traces, which suggests a specialization in processing or generating blockchain trace data and similar structured information. With a context length of 32,768 tokens, it is suited to tasks that require extensive context within that domain.


Model Overview

This model, exp-uns-tezos-128unique_glm_4_7_traces_jupiter_cleaned, is a fine-tuned version of the Qwen3-8B architecture developed by laion. It adapts the 8-billion-parameter base model through further training on a single dataset, recorded by its local Hugging Face cache path: /data/cat/ws/befe330h-befe330h-otagent/huggingface/hub/datasets--DCAgent--exp-uns-tezos-128unique_glm_4.7_traces_jupiter_cleaned/snapshots/15d9bb777f344d6d68d8ac555191c073b7c900e7_thinking_preprocessed (the cache naming corresponds to the dataset DCAgent/exp-uns-tezos-128unique_glm_4.7_traces_jupiter_cleaned). This specialized training suggests potential utility for tasks involving Tezos blockchain traces or similar trace-data analysis.
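
Since the model inherits the Qwen3-8B architecture, it should load through the standard transformers causal-LM interface. The following is a minimal sketch, assuming the checkpoint is published under the repository name above; the prompt and generation settings are hypothetical examples, not documented usage.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/exp-uns-tezos-128unique_glm_4_7_traces_jupiter_cleaned"

# Load the tokenizer and model; device_map="auto" places weights on
# whatever accelerators are available.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Hypothetical prompt: the card does not specify a prompt format, so the
# base Qwen3 chat template is assumed here.
messages = [{"role": "user", "content": "Summarize this Tezos trace: ..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```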

Training Details

The fine-tuning process used the following key hyperparameters (a configuration sketch reproducing them follows the list):

  • Learning Rate: 4e-05
  • Batch Size: 1 (train), 8 (eval)
  • Gradient Accumulation Steps: 2, for an effective training batch size of 16 (1 per device × 2 accumulation steps × 8 GPUs).
  • Optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08.
  • LR Scheduler: Cosine type with a warmup ratio of 0.1.
  • Epochs: 7.0
  • Distributed Training: Multi-GPU setup using 8 devices.
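
These values map directly onto Hugging Face TrainingArguments fields. Below is a minimal sketch of such a configuration, assuming the run used the standard Trainer API; the output directory is a hypothetical placeholder, not a value from the original run.

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments object matching the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="./exp-uns-tezos-finetune",  # hypothetical placeholder
    learning_rate=4e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # 1 per device x 2 steps x 8 GPUs = 16
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=7.0,
)
```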

Framework Versions

The model was trained using:

  • Transformers 4.57.6
  • PyTorch 2.9.0+cu128
  • Datasets 4.4.1
  • Tokenizers 0.22.2
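
For reproducibility it may be worth matching these versions exactly. A quick sanity check against a local environment might look like this:

```python
# Print the installed versions to compare against those used for training.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.57.6",
    "torch": "2.9.0+cu128",
    "datasets": "4.4.1",
    "tokenizers": "0.22.2",
}
for name, mod in [("transformers", transformers), ("torch", torch),
                  ("datasets", datasets), ("tokenizers", tokenizers)]:
    print(f"{name}: {mod.__version__} (trained with {expected[name]})")
```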

Intended Use & Limitations

The model card does not document specific intended uses or known limitations. Given its fine-tuning on Tezos-related trace data, it is likely optimized for tasks in that domain; further information would be needed to assess its capabilities and appropriate applications.