Is pyannote/diarization pipeline very sensitive to language? #1821

ywangwxd · 2024-12-30T06:23:41Z

Tested versions

pyannote/speaker-diarization-3.1

System information

Ubuntu, pyannote/speaker-diarization-3.1

Issue description

How should I improve its performance on Chinese?

Minimal reproduction example (MRE)

I tested the pipeline on a Chinese audio file and found the diarization results to be bad, even in easy cases with long stretches of female speech and male speech. The results on an English audio file are quite good, though. To reproduce, take any audio file containing Chinese speech from multiple speakers. To make the diarization easy, choose one with distinctive speakers, e.g., one male and one female speaker.
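The reproduction steps above could be scripted roughly as follows. This is a minimal sketch, not the exact code used for the report. It assumes pyannote.audio 3.1 is installed and that a Hugging Face access token with access to the gated model is available. The helper names `diarize` and `speech_per_speaker`, and the file name and token in the usage comment, are hypothetical. The `use_auth_token` argument matches the 3.1 release; other releases may name it differently.

```python
from collections import defaultdict


def diarize(audio_path, hf_token, num_speakers=None):
    """Run the pretrained pipeline and return (start, end, speaker) tuples."""
    # Imported lazily so speech_per_speaker() works without pyannote installed.
    from pyannote.audio import Pipeline

    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1",
        use_auth_token=hf_token,  # assumption: argument name as in pyannote.audio 3.1
    )
    # num_speakers is optional; pass it only when the speaker count is known.
    kwargs = {} if num_speakers is None else {"num_speakers": num_speakers}
    annotation = pipeline(audio_path, **kwargs)
    return [
        (turn.start, turn.end, speaker)
        for turn, _, speaker in annotation.itertracks(yield_label=True)
    ]


def speech_per_speaker(turns):
    """Sum the duration in seconds attributed to each speaker label."""
    totals = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += end - start
    return dict(totals)


# Hypothetical usage (file name and token are placeholders):
#   turns = diarize("chinese_two_speakers.wav", "hf_xxx", num_speakers=2)
#   print(speech_per_speaker(turns))
```

Comparing the per-speaker totals for the Chinese file and the English file, with and without `num_speakers=2`, may help show whether the problem lies in the number of detected speakers or in how turns are assigned.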

hbredin added the cannot_reproduce label Jan 6, 2025
