Name: lunahr/CeluneNorm-0.6B-v2.0-ctx1024 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: lunahr

CeluneNorm-0.6B-v2.0-ctx1024: Text Normalization Model

CeluneNorm-0.6B-v2.0-ctx1024, developed by lunahr, is a 0.6-billion-parameter causal language model fine-tuned from Qwen3-0.6B-Base. Its primary function is lightweight text normalization for English: it converts poorly formatted input into clean, readable text without altering the original meaning or rewriting sentences. The model is deliberately conservative and preserves domain-specific tokens such as URLs, commands, and names.
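One way to verify the preservation claim in a pipeline is to compare URLs before and after normalization. The following is a minimal sketch, not part of the model or its tooling: the helper name `missing_urls` and the URL pattern are assumptions made for illustration, and the pattern is intentionally rough.

```python
import re
from typing import List

# Rough URL pattern for illustration only (an assumption, not from the model card).
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def missing_urls(original: str, normalized: str) -> List[str]:
    """Return URLs from `original` that do not appear verbatim in `normalized`."""
    missing = []
    for url in URL_RE.findall(original):
        # Drop trailing sentence punctuation that the pattern may have captured.
        url = url.rstrip(".,;:!?)")
        if url not in normalized:
            missing.append(url)
    return missing
```

A caller could fall back to the original text whenever the returned list is non-empty. The same idea extends to commands or names, given a suitable pattern for each.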

Key Capabilities & Features

Deterministic output, avoiding sampling for consistent results.
Preserves structure and intent of the original text.
Handles mixed text, including natural language and technical content.
Conservative punctuation and casing, prioritizing meaning preservation.
Long-context normalization supporting inputs up to 1024 tokens, an improvement over previous versions.
Trained on a mixed dataset including formal, conversational, and synthetic edge cases, with specific tuning for casing and long-context coherence.
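Because inputs are limited to 1024 tokens, longer documents need to be split before normalization. The sketch below packs whole lines into chunks under a token budget. It is illustrative only: `pack_lines` and `whitespace_count` are hypothetical helpers, and the whitespace counter is a stand-in for the model's real tokenizer, which will generally report more tokens. Any prompt template overhead is not accounted for, so a real budget would likely need to be lower than 1024.

```python
from typing import Callable, List


def whitespace_count(text: str) -> int:
    """Stand-in token counter (assumption); replace with the model tokenizer's count."""
    return len(text.split())


def pack_lines(
    text: str,
    max_tokens: int = 1024,
    count_tokens: Callable[[str], int] = whitespace_count,
) -> List[str]:
    """Greedily group consecutive lines into chunks of at most `max_tokens` tokens.

    A single line that exceeds the budget on its own is emitted as its own
    chunk unchanged; the caller must decide how to handle it.
    """
    chunks: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        candidate = "\n".join(current + [line])
        if current and count_tokens(candidate) > max_tokens:
            # Adding this line would overflow the budget: close the chunk.
            chunks.append("\n".join(current))
            current = [line]
        else:
            current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Splitting on line boundaries keeps each chunk's structure intact, which fits the model's conservative design. Each chunk can then be normalized independently and the results joined with newlines.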

Use Cases & Limitations

This model is ideal for preprocessing text in Text-to-Speech (TTS) systems and general text pipelines where robust, conservative normalization is required. It is not a full grammar correction system and may miss some nuanced corrections or contractions. The model prioritizes safety and meaning preservation over aggressive correction, making it a reliable choice for maintaining text integrity.
