chrissoria/catllm-json-formatter

Hugging Face
Text Generation | Concurrency Cost: 1 | Model Size: 0.5B | Quant: BF16 | Ctx Length: 32k | Published: Mar 6, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

chrissoria/catllm-json-formatter is a 0.5 billion parameter fine-tune of Qwen2.5-0.5B-Instruct, created by chrissoria, that converts messy LLM classification outputs into valid cat-llm JSON. It parses raw, potentially malformed text from other LLMs and restructures it into a standardized JSON object. The model reports 100% parse success on a held-out test set, which makes it well suited to post-processing LLM outputs for structured data applications.


CatLLM JSON Formatter Overview

This model, chrissoria/catllm-json-formatter, is a fine-tune of Qwen2.5-0.5B-Instruct (0.5 billion parameters) developed by chrissoria. Its core function is to take raw, potentially malformed classification output from other Large Language Models and convert it into a clean, standardized JSON format compatible with the cat-llm library.

Key Capabilities

  • Robust JSON Formatting: Transforms unstructured or malformed LLM classification text into a consistent {"1": "0", "2": "1", ...} JSON structure.
  • High Accuracy: Achieves 100% parse success and 98% exact match on a held-out test set. Every test output was valid JSON, and 98% matched the reference labels exactly.
  • Seamless Integration: Used automatically within the cat-llm framework when json_formatter=True is set.
  • Efficient Size: Built on a 0.5B parameter base model, offering efficient performance for its specialized task.
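
As a rough illustration of the target structure and the two reported metrics, the sketch below checks whether a string parses into the {"1": "0", "2": "1", ...} shape and computes parse success and exact match over a batch. The function names are hypothetical, and restricting values to "0" and "1" is an assumption based on the example structure above; this is not cat-llm's own validation code.

```python
import json


def parse_catllm(text):
    """Return the parsed dict if `text` is a JSON object mapping category
    numbers to "0"/"1" strings, otherwise None.

    Assumption: values are limited to the strings "0" and "1".
    """
    try:
        obj = json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(obj, dict) or not obj:
        return None
    for key, value in obj.items():
        # JSON object keys are always strings; require numeric category ids.
        if not key.isdigit() or value not in ("0", "1"):
            return None
    return obj


def score(outputs, references):
    """Return (parse_success, exact_match) as fractions of the batch.

    `outputs` are raw model strings; `references` are the expected dicts.
    """
    parsed = [parse_catllm(o) for o in outputs]
    n = len(outputs)
    parse_success = sum(p is not None for p in parsed) / n
    exact_match = sum(p == r for p, r in zip(parsed, references)) / n
    return parse_success, exact_match
```

Under these definitions an output can count toward parse success without counting toward exact match, which is consistent with the gap between the two reported figures.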

Training Details

The model was fine-tuned using LoRA (r=16, alpha=32) on the Qwen/Qwen2.5-0.5B-Instruct base. Training ran for 3 epochs on 4,000 synthetic examples covering more than 26 different messy output formats.
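
To make the LoRA hyperparameters concrete, here is a small arithmetic sketch. It assumes the standard LoRA formulation, in which the low-rank update is scaled by alpha / r, and it uses a hypothetical 1024 x 1024 projection purely for illustration. The model card does not state which modules were adapted or their dimensions.

```python
def lora_scaling(r, alpha):
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return alpha / r


def lora_params(d_in, d_out, r):
    """Trainable parameters for one adapted weight matrix.

    LoRA adds two matrices: A with shape (r, d_in) and B with shape (d_out, r).
    """
    return r * (d_in + d_out)


R, ALPHA = 16, 32  # values reported in the training details
scaling = lora_scaling(R, ALPHA)

# Hypothetical layer size, chosen only to show the order of magnitude.
HIDDEN = 1024
adapter = lora_params(HIDDEN, HIDDEN, R)
full = HIDDEN * HIDDEN

print(f"scaling factor: {scaling}")
print(f"adapter params: {adapter} vs full matrix: {full} ({adapter / full:.2%})")
```

With r=16 on a matrix of this hypothetical size, the adapter trains about 3% as many parameters as the full weight matrix.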

Ideal Use Cases

This model is suited to scenarios where:

  • Classification outputs from various LLMs must be standardized and validated.
  • Parseable JSON is required even when LLM responses are inconsistent.
  • Classification results feed downstream applications that require structured data.
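
One way such a post-processing step might be wired up is sketched below: try to parse the raw LLM output directly, and only call a formatter when that fails. The `formatter` argument is a hypothetical callable standing in for a call to this model, and `stub_formatter` is a placeholder for demonstration. Neither is part of the cat-llm API.

```python
import json


def try_parse(text):
    """Return a dict if `text` is a JSON object, otherwise None."""
    try:
        obj = json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return None
    return obj if isinstance(obj, dict) else None


def to_structured(raw, formatter):
    """Parse `raw` directly when possible; otherwise pass it through
    `formatter` (hypothetical: any callable str -> str) and parse the result.

    Returns None if neither attempt yields a JSON object.
    """
    parsed = try_parse(raw)
    if parsed is not None:
        return parsed
    return try_parse(formatter(raw))


def stub_formatter(raw):
    """Placeholder for the formatter model; always returns a fixed answer."""
    return '{"1": "0", "2": "1"}'


if __name__ == "__main__":
    print(to_structured('{"1": "1"}', stub_formatter))
    print(to_structured("Category 1: no. Category 2: yes.", stub_formatter))
```

Trying the cheap direct parse first means the formatter is invoked only for outputs that actually need repair.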