Name: google/DiarizationLM-13b-Fisher-v1 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: google

DiarizationLM-13b-Fisher-v1: Speaker Diarization Post-Processing

This model, developed by Google, is a 13-billion-parameter DiarizationLM model fine-tuned on the training subset of the Fisher corpus. It is built on the unsloth/llama-2-13b-bnb-4bit foundation model and is designed specifically for speaker diarization post-processing.

Key Capabilities

Enhanced Diarization Accuracy: On the Fisher test set, reduces the Word Diarization Error Rate (WDER) of the baseline diarization system from 5.32% to 3.65% and the concatenated minimum-permutation Word Error Rate (cpWER) from 21.19% to 18.92%.
Contextual Processing: Supports a maximum sequence length of 4096 tokens, which bounds how much transcript context the model can consider in a single post-processing pass.
LoRA Fine-tuning: Uses a LoRA adapter of rank 256, with over 1 billion trainable parameters, trained for 12,000 steps on 48,142 prompt-completion pairs.
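
To put the reported error rates in perspective, the short sketch below computes the relative reduction for each metric. The helper name relative_reduction is illustrative and not part of any DiarizationLM tooling. The only inputs are the four percentages quoted above.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the fractional reduction of `improved` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# Percentages reported for the Fisher test set (baseline -> post-processed).
reported = {
    "WDER": (5.32, 3.65),
    "cpWER": (21.19, 18.92),
}

for metric, (before, after) in reported.items():
    absolute = before - after
    relative = relative_reduction(before, after)
    print(f"{metric}: {absolute:.2f} points absolute, {relative:.1%} relative")
```

By this arithmetic the WDER drop is roughly 31% relative and the cpWER drop roughly 11% relative, so the gain is concentrated in speaker attribution rather than in word recognition.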

When to Use This Model

This model suits applications where precise speaker attribution in audio transcripts is critical. It acts as a post-processing step that refines the output of an initial speaker diarization system, particularly for conversational speech. Note that this version is considered outdated: the README recommends google/DiarizationLM-8b-Fisher-v2 for new projects.
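
As a rough illustration of the post-processing step, the sketch below turns word-level output from an upstream diarization system into a single text prompt. The `<speaker:N>` token layout and the ` --> ` prompt suffix are assumptions here; this page does not state them, so check the full model card for the exact format before relying on it. The function name to_speaker_text is hypothetical.

```python
from typing import Sequence

# Assumed prompt suffix separating the hypothesis from the completion.
PROMPT_SUFFIX = " --> "


def to_speaker_text(words: Sequence[str], speakers: Sequence[int]) -> str:
    """Join words into one string, emitting a speaker token at each speaker change."""
    if len(words) != len(speakers):
        raise ValueError("words and speakers must have the same length")
    parts = []
    previous = None
    for word, speaker in zip(words, speakers):
        if speaker != previous:
            # Assumed token layout: <speaker:N> precedes that speaker's words.
            parts.append(f"<speaker:{speaker}>")
            previous = speaker
        parts.append(word)
    return " ".join(parts)


# Hypothetical upstream output in which "you" was attributed to the wrong speaker.
words = "hello how are you good thanks".split()
speakers = [1, 1, 1, 2, 2, 2]

prompt = to_speaker_text(words, speakers) + PROMPT_SUFFIX
print(prompt)
```

The resulting string would then be passed to the model, whose completion is expected to contain the same words with corrected speaker tokens. Parsing that completion back into per-word labels is left out of this sketch.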
