**Note:** starting from version 0.4, installing pyannote.audio is mandatory to run the default system or to use pyannote-based models. In any other case, this step can be ignored.
@@ -105,25 +109,26 @@ See `diart.stream -h` for more options.
### From python

-Run a real-time speaker diarization pipeline over an audio stream with `RealTimeInference`:
+Use `RealTimeInference` to easily run a pipeline on an audio source and write the results to disk:
```python
from diart.sources import MicrophoneAudioSource
from diart.inference import RealTimeInference
-from diart.pipelines import OnlineSpeakerDiarization, PipelineConfig
Diart is the official implementation of the paper *[Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation](/paper.pdf)* by [Juan Manuel Coria](https://juanmc2005.github.io/), [Hervé Bredin](https://herve.niderb.fr), [Sahar Ghannay](https://saharghannay.github.io/) and [Sophie Rosset](https://perso.limsi.fr/rosset/).
@@ -299,32 +309,34 @@ To obtain the best results, make sure to use the following hyper-parameters:
| DIHARD II | 1s | 0.619 | 0.326 | 0.997 |
| DIHARD II | 5s | 0.555 | 0.422 | 1.517 |

-`diart.benchmark` and `diart.inference.Benchmark` can quickly run and evaluate the pipeline, and even measure its real-time latency. For instance, for a DIHARD III configuration:
+`diart.benchmark` and `diart.inference.Benchmark` can run, evaluate and measure the real-time latency of the pipeline. For instance, for a DIHARD III configuration:
-This runs a faster inference by pre-calculating model outputs in batches.
+This pre-calculates model outputs in batches, so it runs a lot faster.
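
As a rough illustration of why pre-calculating model outputs in batches helps, here is a minimal sketch in plain Python. It does not use diart's API: `sliding_chunks`, `toy_model` and `precompute_in_batches` are hypothetical names introduced only for this example, and `toy_model` is a stand-in for a real neural network. The idea is that the model is called once per batch of chunks rather than once per chunk, while the outputs stay identical.

```python
from typing import Callable, List, Sequence

Chunk = List[float]


def sliding_chunks(samples: Sequence[float], size: int, step: int) -> List[Chunk]:
    """Split a signal into fixed-size chunks taken every `step` samples."""
    return [list(samples[i:i + size]) for i in range(0, len(samples) - size + 1, step)]


def toy_model(batch: List[Chunk]) -> List[float]:
    """Hypothetical stand-in for a model: mean absolute amplitude of each chunk."""
    return [sum(abs(x) for x in chunk) / len(chunk) for chunk in batch]


def precompute_in_batches(
    chunks: List[Chunk],
    model: Callable[[List[Chunk]], List[float]],
    batch_size: int,
) -> List[float]:
    """Call the model once per batch and collect the outputs in chunk order."""
    outputs: List[float] = []
    for start in range(0, len(chunks), batch_size):
        outputs.extend(model(chunks[start:start + batch_size]))
    return outputs


signal = [0.0, 0.5, -0.5, 1.0, -1.0, 0.25, 0.75, -0.25]
chunks = sliding_chunks(signal, size=4, step=2)

# Batched pre-computation: 2 model calls for 3 chunks.
batched = precompute_in_batches(chunks, toy_model, batch_size=2)

# Streaming-style processing: 1 model call per chunk.
one_by_one = [toy_model([chunk])[0] for chunk in chunks]

print(batched == one_by_one)
```

With a real model, the saving would come from running each batch through the network in a single forward pass. The cached outputs can then be fed to the rest of the pipeline chunk by chunk, as if they had been computed in real time.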
See `diart.benchmark -h` for more options.

For convenience and to facilitate future comparisons, we also provide the [expected outputs](/expected_outputs) of the paper implementation in RTTM format for every entry of Table 1 and Figure 5. This includes the VBx offline topline as well as our proposed online approach with latencies 500ms, 1s, 2s, 3s, 4s, and 5s.