OmarioVIC/customer-email-classifier

TEXT GENERATIONConcurrency Cost:1Model Size:1BQuant:BF16Ctx Length:32kPublished:Apr 8, 2026License:gemmaArchitecture:Transformer Cold

OmarioVIC/customer-email-classifier is a 1 billion parameter Gemma 3 1B IT model fine-tuned for classifying customer email responses into five specific categories. Developed by OmarioVIC, this model generates structured JSON output and is optimized for low-latency deployment using vLLM. It excels at quickly categorizing short email replies for business outreach, providing a deterministic classification.

Loading preview...

OmarioVIC/customer-email-classifier: Email Response Classification

This model is a fine-tuned Gemma 3 1B IT (google/gemma-3-1b-it) specifically designed for classifying customer email responses. It processes email text and outputs a structured JSON object containing one of five predefined categories, making it ideal for automating email triage and response workflows.

Key Capabilities & Features

  • Generative Classification: Accurately categorizes email replies into automated_reply, interested, not_interested, out_of_office, or unrelated.
  • Structured Output: Always returns a JSON object in the format {"classification": "<label>"}.
  • Optimized for Production: Fine-tuned with QLoRA (4-bit) via Unsloth for efficient training and designed for low-latency inference with vLLM.
  • Deterministic Output: Configured for greedy decoding (do_sample=False, temperature=0) to ensure consistent classification results.

Use Cases & Limitations

This model is best suited for:

  • Automating the classification of short customer email replies, particularly in business outreach scenarios.
  • Integrating into systems requiring fast, programmatic categorization of email intent.

Limitations:

  • Primarily designed for short email replies (max 320 tokens including prompt).
  • Trained on a specific business outreach dataset; performance may vary on different email domains.

Training Details

The model was trained on approximately 3,500 samples using Unsloth's SFTTrainer with completion-only masking, focusing loss computation solely on the assistant's response. This approach ensures the model learns to generate the correct JSON output efficiently.