surogate/Qwen3-1.7B-Libra-MF

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Tool Calling: Supported · Published: May 29, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

surogate/Qwen3-1.7B-Libra-MF is a 1.7 billion parameter Qwen3-based language model fine-tuned by Surogate to extract column-mapping recipes from Romanian fixed-asset registers. This specialized model processes varied register layouts, including grouped and column-based formats, and outputs structured JSON indicating column roles for accounting totals. It is designed to accurately identify and map six critical accounting fields, addressing common ambiguities that challenge general-purpose LLMs.


Model Overview

surogate/Qwen3-1.7B-Libra-MF is a specialized 1.7 billion parameter model, fine-tuned from Qwen/Qwen3-1.7B, designed to process Romanian fixed-asset registers. Its primary function is to read raw, often inconsistent, register text and generate a structured JSON "column-mapping recipe." This recipe identifies the column index and header text for eight specific accounting roles, enabling a deterministic post-processor to calculate six key accounting totals per asset category.
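
To make the split between model and post-processor concrete, here is a minimal sketch of how a deterministic post-processor might turn a recipe into per-category totals. The recipe key names (`cont`, `valoare_intrare`, `valoare_ramasa`), the `index`/`header` field names, the sample rows, and the helper functions are all assumptions made for illustration; the model card does not publish the exact schema. The sketch also assumes a column-based register whose rows have already been split into cells, and Romanian number formatting ("." for thousands, "," for decimals).

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical recipe: role -> {"index": 0-based column, "header": header text}.
# Only two of the value roles are shown to keep the example short.
recipe = {
    "cont": {"index": 0, "header": "Cont"},
    "valoare_intrare": {"index": 2, "header": "Valoare intrare"},
    "valoare_ramasa": {"index": 3, "header": "Valoare rămasă"},
}

# Hypothetical register rows, already split into cells.
rows = [
    ["213", "Strung", "1.000,50", "400,00"],
    ["213", "Freză", "2.000,00", "1.500,25"],
    ["214", "Birou", "300,00", "0,00"],
]


def parse_ro_amount(text: str) -> Decimal:
    """Parse a Romanian-style amount such as '1.234,56' into a Decimal."""
    return Decimal(text.strip().replace(".", "").replace(",", "."))


def totals_per_category(recipe: dict, rows: list) -> dict:
    """Sum every mapped value column, grouped by the account in the cont column."""
    cont_idx = recipe["cont"]["index"]
    value_roles = {role: m["index"] for role, m in recipe.items() if role != "cont"}
    totals = defaultdict(lambda: defaultdict(Decimal))
    for row in rows:
        category = row[cont_idx]
        for role, idx in value_roles.items():
            totals[category][role] += parse_ro_amount(row[idx])
    return {cat: dict(vals) for cat, vals in totals.items()}


totals = totals_per_category(recipe, rows)
```

Because the arithmetic lives outside the model, the same recipe always yields the same totals, and `Decimal` avoids floating-point rounding in the sums.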

Key Capabilities & Differentiators

  • Specialized Data Extraction: Accurately maps columns for Valoare intrare, Valoare modernizări, Valoare de inventar, Valoare amortizată, Amortizare lunară, and Valoare rămasă from diverse Romanian fixed-asset register layouts.
  • Handles Layout Variance: Robustly processes registers with differing headers, grouped vs. column-based structures, "trap columns" (e.g., monthly vs. cumulative depreciation), and challenging formats like headerless or OCR-mangled text.
  • Addresses General LLM Failures: Specifically engineered to overcome common errors observed in general GPT-4-class models when dealing with these complex accounting documents, such as confusing monthly vs. cumulative depreciation or misidentifying value columns.
  • Output Schema: Emits a JSON object detailing column mappings, including header text and 0-based index, and indicates whether the register is 'grouped' (cont = null) or 'column-based' (cont present).
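
As a rough illustration of the output schema described above, the following sketch parses a recipe string, classifies the layout from the `cont` field, and checks that each mapped index is a non-negative integer. The example JSON, the role name `valoare_intrare`, the `header`/`index` field names, and the `describe_recipe` helper are hypothetical; only the ideas of a header text, a 0-based index, and `cont = null` for grouped registers come from the model card.

```python
import json

# Hypothetical model output for a grouped register (cont is null).
raw_output = '{"cont": null, "valoare_intrare": {"header": "Val. intrare", "index": 4}}'


def describe_recipe(raw: str) -> dict:
    """Parse a recipe and report its layout plus the role -> column index map."""
    recipe = json.loads(raw)
    # Per the card: cont = null means 'grouped', cont present means 'column-based'.
    layout = "grouped" if recipe.get("cont") is None else "column-based"
    columns = {}
    for role, mapping in recipe.items():
        if mapping is None:
            continue  # role not mapped in this register
        index = mapping["index"]
        if not isinstance(index, int) or index < 0:
            raise ValueError(f"role {role!r} has invalid 0-based index: {index!r}")
        columns[role] = index
    return {"layout": layout, "columns": columns}


summary = describe_recipe(raw_output)
```

Validating the recipe before handing it to a post-processor is a design choice of this sketch, not something the card prescribes; it simply catches malformed indices early.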

Performance

Reported results on the column-mapping task:

  • Real Client Registers: Produced correct end-to-end 6-field totals on 7 of 7 (100%) real client registers.
  • Held-out Synthetic Data: Scored 95.3% on 360 held-out synthetic examples across 12 formats.
  • Validation Set: Achieved 92.6% accuracy on 600 validation examples, comparing the model's recipe against the ground-truth recipe.

Limitations

  • Romanian Only: Primarily designed for Romanian registers, with limited robustness for fully English inputs.
  • Specific Categories: Focuses on fixed-asset accounts (categories 205 to 215).
  • Input Length: Optimized for inputs up to 2048 tokens; very long registers should be windowed.
  • Post-processor Required: The model outputs a recipe; a separate post-processor is needed to compute final accounting totals.
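
One way to window a long register, sketched under stated assumptions: split the text by lines, repeat the header line(s) at the top of every window, and close a window before it would exceed the token budget. The `window_register` function is hypothetical, and the default `count_tokens` is a whitespace word count used only as a stand-in; in practice you would pass a function backed by the model's tokenizer.

```python
def window_register(lines, max_tokens=2048, header_lines=1,
                    count_tokens=lambda s: len(s.split())):
    """Split register lines into windows that each start with the header.

    count_tokens is a stand-in (whitespace words); swap in a real tokenizer count.
    A single line longer than the budget still gets its own window.
    """
    header = lines[:header_lines]
    body = lines[header_lines:]
    header_cost = sum(count_tokens(line) for line in header)

    windows, current, cost = [], [], header_cost
    for line in body:
        line_cost = count_tokens(line)
        # Flush the current window if adding this line would exceed the budget.
        if current and cost + line_cost > max_tokens:
            windows.append(header + current)
            current, cost = [], header_cost
        current.append(line)
        cost += line_cost
    if current:
        windows.append(header + current)
    return windows


# Tiny budget so the example splits: header (3 words) + two rows (3 words each).
register = ["Cont Denumire Valoare"] + [f"213 asset{i} 100" for i in range(5)]
windows = window_register(register, max_tokens=9)
```

Repeating the header in each window keeps the column names visible to the model in every chunk; whether per-window recipes need to be reconciled afterwards is left open here.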