Name: google/DiarizationLM-8b-Fisher-v2 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: google

DiarizationLM-8b-Fisher-v2 Overview

This model is an 8-billion-parameter DiarizationLM, developed by Google, for speaker diarization post-processing. It was fine-tuned on the training subset of the Fisher corpus using a LoRA adapter (rank 256), with unsloth/llama-3-8b-bnb-4bit as the foundation model. The key difference from its predecessor, google/DiarizationLM-8b-Fisher-v1, is that this version computes the training loss on completion tokens only.
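To make the completion-only loss concrete, here is a minimal sketch of how labels are commonly masked so that prompt tokens do not contribute to the loss. It is not the authors' training code: the helper name `build_labels` and the token IDs are hypothetical, and the value -100 is used because it is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
# Hypothetical illustration of completion-only loss masking.
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def build_labels(prompt_ids, completion_ids, completion_only=True):
    """Return (input_ids, labels) for one prompt-completion pair."""
    input_ids = list(prompt_ids) + list(completion_ids)
    if completion_only:
        # Prompt positions are masked, so only completion tokens are scored.
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    else:
        # Alternative: every token, prompt included, contributes to the loss.
        labels = list(input_ids)
    return input_ids, labels


# Made-up token IDs, for illustration only.
prompt_ids = [11, 12, 13, 14]
completion_ids = [21, 22, 23]
input_ids, labels = build_labels(prompt_ids, completion_ids)
print(labels)  # [-100, -100, -100, -100, 21, 22, 23]
```

With masking in place, the gradient comes only from the corrected transcript the model is asked to produce, not from reproducing the input it was given.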

Key Capabilities & Performance

Speaker Diarization Post-Processing: Designed to refine speaker attribution in transcribed audio.
Improved WDER: Demonstrates a reduction in Word Diarization Error Rate (WDER) compared to a USM + turn-to-diarize baseline.
- Fisher Testing Set: Achieves 3.28% WDER (baseline 5.32%).
- Callhome Testing Set: Achieves 6.66% WDER (baseline 7.72%).
Training Details: Trained for approximately 9 epochs (28,800 steps) on 51,063 prompt-completion pairs from the Fisher corpus, combining the hyp2ora and deg2ref data flavors.
Context Length: Supports a maximum sequence length of 4096 tokens, with prompts of up to 6000 characters.
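The WDER figures above can also be read as relative improvements over the baseline. The short sketch below derives them from the four reported numbers only; the function name `relative_reduction` is illustrative.

```python
def relative_reduction(baseline, system):
    """Relative error reduction in percent: (baseline - system) / baseline."""
    return (baseline - system) / baseline * 100.0


# (baseline WDER %, DiarizationLM WDER %) as reported above.
results = {"Fisher": (5.32, 3.28), "Callhome": (7.72, 6.66)}

for name, (baseline, system) in results.items():
    absolute = baseline - system
    relative = relative_reduction(baseline, system)
    print(f"{name}: {absolute:.2f} points absolute, {relative:.1f}% relative")
# Fisher: 2.04 points absolute, 38.3% relative
# Callhome: 1.06 points absolute, 13.7% relative
```

The gain is larger on Fisher, the corpus the model was fine-tuned on, than on Callhome.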

Good For

Enhancing Speaker Diarization Accuracy: Ideal for applications requiring improved speaker identification and segmentation in conversational audio.
Research in Diarization Post-Processing: Useful for researchers exploring LLM-based methods for refining diarization outputs.
Integration with ASR Systems: Can be used as a post-processing step for Automatic Speech Recognition (ASR) systems to produce more accurate speaker-attributed transcripts.
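As a rough sketch of the ASR post-processing step, the code below turns word-level ASR output with speaker labels into text prompts that respect the 6000-character limit mentioned above. The `<speaker:N>` tag format and both helper functions are assumptions made for illustration, and the model call itself is left out; consult the full model card for the exact prompt format and any required prompt suffix.

```python
MAX_PROMPT_CHARS = 6000  # prompt character limit stated on this page


def to_diarized_text(words, speakers):
    """Render words with a speaker tag at each speaker change (assumed format)."""
    parts, previous = [], None
    for word, speaker in zip(words, speakers):
        if speaker != previous:
            parts.append(f"<speaker:{speaker}>")
            previous = speaker
        parts.append(word)
    return " ".join(parts)


def split_into_prompts(words, speakers, max_chars=MAX_PROMPT_CHARS):
    """Greedily pack words into segments whose rendered text fits max_chars."""
    prompts, start = [], 0
    for end in range(1, len(words) + 1):
        text = to_diarized_text(words[start:end], speakers[start:end])
        if len(text) > max_chars and end - 1 > start:
            # Adding this word overflows: flush the segment without it.
            prompts.append(
                to_diarized_text(words[start:end - 1], speakers[start:end - 1])
            )
            start = end - 1
    if start < len(words):
        prompts.append(to_diarized_text(words[start:], speakers[start:]))
    return prompts


words = "hello how are you good thanks".split()
speakers = [1, 1, 1, 1, 2, 2]
print(to_diarized_text(words, speakers))
# <speaker:1> hello how are you <speaker:2> good thanks
```

Each returned prompt would be sent to the model separately, and the completions mapped back onto the original words to obtain the refined speaker labels.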
