The Async Audio and Async Video APIs can detect and separate unique speakers in a single audio or video stream without requiring separate speaker events.
To enable this capability with either API, pass the diarization query parameters, including diarizationSpeakerCount, with the request.
diarizationSpeakerCount should equal the number of unique speakers in the conversation. If the value differs from the actual number of speakers, the diarized results may contain false positives.
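As a rough illustration of passing these values as query parameters, the sketch below builds a request URL in Python. Only diarizationSpeakerCount comes from the text above. The base URL and the name of the enabling flag (`enableSpeakerDiarization`) are assumptions made for this example and should be checked against the API reference before use.

```python
from urllib.parse import urlencode

# Assumed endpoint, used only for illustration; substitute the Async Audio
# or Async Video endpoint you actually call.
ASYNC_AUDIO_URL = "https://api.symbl.ai/v1/process/audio"


def build_diarization_url(base_url: str, speaker_count: int) -> str:
    """Return base_url with the diarization query parameters appended."""
    if speaker_count < 1:
        raise ValueError("speaker_count must be at least 1")
    params = {
        # Assumed name of the flag that turns diarization on; it is not
        # spelled out in the text above.
        "enableSpeakerDiarization": "true",
        # Should match the number of unique speakers in the conversation.
        "diarizationSpeakerCount": speaker_count,
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Example: a conversation with two unique speakers.
    print(build_diarization_url(ASYNC_AUDIO_URL, 2))
```

The helper only assembles the URL. Sending the request, including authentication and the media payload, is left out because it depends on which API and submission method you use.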
If you’re looking for a similar capability in the Real-Time APIs, refer to the Active Speaker Events and Speaker Separation in WebSocket API sections.
Speaker Diarization Language Support
Currently, Speaker Diarization is available for English and Spanish only.
||Yes|Whether the diarization should be enabled for this conversation. Pass this as |
|diarizationSpeakerCount|Yes|The number of unique speakers in this conversation.|