Apply Speaker Separation to Async Files

The Symbl.ai Async API lets you process a previously recorded audio or video conversation from URLs or files. You can also process a text file of a conversation.

This page describes how to apply speaker separation to a conversation stored in audio or video files.

Speaker Separation, often called Diarization, is the ability to detect and separate unique speakers in a single stream of audio or video without the need for separate speaker events.

Speaker Diarization is available for the English language only.

Authentication

Before using this API, you must generate your authentication token (AUTH_TOKEN) as described in Authentication.

Enable Diarization

To enable Speaker Separation in the Async Audio or Video API, use these query parameters in the request:

Parameter | Type | Description
enableSpeakerDiarization | Boolean | Enable speaker separation for the audio or video data under consideration.
diarizationSpeakerCount | Integer | Sets the number of unique speakers in the audio or video data under consideration.

The example in this document uses the Async Video URL API, but Speaker Separation can be achieved with other Async Audio/Video APIs in the same way with the enableSpeakerDiarization and diarizationSpeakerCount parameters.

For the most accurate results, set diarizationSpeakerCount (NUMBER_OF_UNIQUE_SPEAKERS in the sample request) to the actual number of unique speakers in the audio or video data.
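As a minimal sketch of how the two query parameters attach to an Async endpoint, the helper below builds the request URL. The function name build_diarization_url and its validation rule are illustrative assumptions, not part of the Symbl.ai API; only the endpoint and parameter names come from this page.

```python
from urllib.parse import urlencode

# Endpoint taken from the sample request in this document.
ASYNC_VIDEO_URL_ENDPOINT = "https://api.symbl.ai/v1/process/video/url"


def build_diarization_url(speaker_count: int,
                          endpoint: str = ASYNC_VIDEO_URL_ENDPOINT) -> str:
    """Return the endpoint with both diarization query parameters attached.

    Hypothetical helper: rejecting counts below 1 is an assumption made for
    this sketch, not a documented API rule.
    """
    if speaker_count < 1:
        raise ValueError("speaker_count must be a positive integer")
    query = urlencode({
        "enableSpeakerDiarization": "true",  # lowercase, as in the sample request
        "diarizationSpeakerCount": speaker_count,
    })
    return f"{endpoint}?{query}"


print(build_diarization_url(2))
```

The same helper should work for the other Async Audio/Video endpoints by passing a different endpoint value.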

Wait for the job to finish processing before you request Conversation Intelligence. If you make a GET request to the Conversation API immediately, you may receive incomplete insights.

The following sample code processes a conversation with the Async Video URL API, using a publicly available URL of a video file:

Sample request

curl --location --request POST "https://api.symbl.ai/v1/process/video/url?enableSpeakerDiarization=true&diarizationSpeakerCount=$NUMBER_OF_UNIQUE_SPEAKERS" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $AUTH_TOKEN" \
--data-raw '{
    "url": "https://storage.googleapis.com/demo-conversations/interview-prep.mp4"
}'

const authToken = AUTH_TOKEN; // replace AUTH_TOKEN with your access token string
const numberOfUniqueSpeakers = NUMBER_OF_UNIQUE_SPEAKERS; // replace with the number of unique speakers, for example 2

const payload = {
  "url": "https://storage.googleapis.com/demo-conversations/interview-prep.mp4"
}

const responses = {
  400: 'Bad Request! Please refer docs for correct input fields.',
  401: 'Unauthorized. Please generate a new access token.',
  404: 'The conversation and/or its metadata could not be found. Please check the input provided.',
  429: 'Maximum number of concurrent jobs reached. Please wait for some requests to complete.',
  500: 'Something went wrong! Please contact [email protected]'
}

const fetchData = {
  method: "POST",
  headers: {
    'Authorization': `Bearer ${authToken}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify(payload),
}

fetch(`https://api.symbl.ai/v1/process/video/url?enableSpeakerDiarization=true&diarizationSpeakerCount=${numberOfUniqueSpeakers}`, fetchData).then(response => {
  if (response.ok) {
    return response.json();
  } else {
    throw new Error(responses[response.status]);
  }
}).then(response => {
  console.log('response', response);
}).catch(error => {
  console.error(error);
});

import json
import requests

url = "https://api.symbl.ai/v1/process/video/url"

# NUMBER_OF_UNIQUE_SPEAKERS: replace with the number of unique speakers in your recording
number_of_unique_speakers = 2

# Query parameters that enable speaker separation
params = {
    "enableSpeakerDiarization": "true",
    "diarizationSpeakerCount": number_of_unique_speakers
}

payload = {
    "url": "https://storage.googleapis.com/demo-conversations/interview-prep.mp4"
}

# set your access token here. See https://docs.symbl.ai/docs/developer-tools/authentication
access_token = 'your_access_token'

headers = {
    'Authorization': 'Bearer ' + access_token,
    'Content-Type': 'application/json'
}

# Optional: webhook URL that receives job updates (it must accept POST requests),
# e.g. https://yourdomain.com/jobs/callback
# params["webhookUrl"] = "your_webhook_url"

responses = {
    400: 'Bad Request! Please refer docs for correct input fields.',
    401: 'Unauthorized. Please generate a new access token.',
    404: 'The conversation and/or its metadata could not be found. Please check the input provided.',
    429: 'Maximum number of concurrent jobs reached. Please wait for some requests to complete.',
    500: 'Something went wrong! Please contact [email protected]'
}

# requests encodes the params dictionary into the query string
response = requests.request("POST", url, headers=headers, data=json.dumps(payload), params=params)

if response.status_code == 201:
    # Successful API execution
    print("conversationId => " + response.json()['conversationId'])  # ID to be used with Conversation API.
    print("jobId => " + response.json()['jobId'])  # ID to be used with Job API.
elif response.status_code in responses:
    print(responses[response.status_code])  # Expected error occurred
else:
    print("Unexpected error occurred. Please contact [email protected]" + ", Debug Message => " + str(response.text))

Response

{
    "conversationId": "4601416062599168",
    "jobId": "e33d764c-c663-488f-8581-d7182ad0d7a0"
}
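The jobId in this response can be used to wait for processing to finish before requesting results. The sketch below polls until the job reaches a terminal state. Note the assumptions: the Job API path (GET /v1/job/{jobId}), the "status" field, and the status values "completed" and "failed" are not stated on this page, so check them against the Job API reference. The function names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_job_status(job_id: str, auth_token: str) -> str:
    """Fetch the job status. The endpoint path and "status" field are assumptions."""
    request = urllib.request.Request(
        f"https://api.symbl.ai/v1/job/{job_id}",
        headers={"Authorization": f"Bearer {auth_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["status"]


def wait_for_job(fetch_status: Callable[[], str],
                 poll_interval: float = 5.0,
                 max_attempts: int = 60) -> str:
    """Call fetch_status until it reports a terminal state, then return that state."""
    for _ in range(max_attempts):
        status = fetch_status()
        if status in ("completed", "failed"):  # assumed terminal status values
            return status
        time.sleep(poll_interval)
    raise TimeoutError("job did not finish within the allotted attempts")


# Simulated run with canned statuses instead of real HTTP calls.
canned = iter(["in_progress", "in_progress", "completed"])
print(wait_for_job(lambda: next(canned), poll_interval=0))
```

With a real token, the call would look like wait_for_job(lambda: fetch_job_status(job_id, auth_token)). Passing the status source as a callable keeps the polling loop testable without network access.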

Get Speaker Separated results

Now that you have a conversationId from the previous response, you can use the Retrieve messages from Conversations request to get speaker-separated results.

Messages request

curl --request GET \
     --url https://api.symbl.ai/v1/conversations/<CONVERSATION_ID>/messages \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer <AUTH_TOKEN>'

const options = {
  method: 'GET',
  headers: {
    Accept: 'application/json',
    Authorization: 'Bearer <AUTH_TOKEN>'
  }
};

fetch('https://api.symbl.ai/v1/conversations/<CONVERSATION_ID>/messages', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));

import requests

url = "https://api.symbl.ai/v1/conversations/<CONVERSATION_ID>/messages"

headers = {
    "Accept": "application/json",
    "Authorization": "Bearer <AUTH_TOKEN>"
}

response = requests.get(url, headers=headers)

print(response.text)

Messages response

{
    "messages": [
        {
            "id": "4591723946704896",
            "text": "You're hired two words, everybody loves to hear.",
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "name": "Speaker 2"
            },
            "startTime": "2020-08-04T07:18:17.573Z",
            "endTime": "2020-08-04T07:18:21.573Z",
            "conversationId": "5105430690791424"
        },
        {
            "id": "6328236401229824",
            "text": "But before we hear these words comes the interview today's video is part one in a series.",
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "name": "Speaker 2"
            },
            "startTime": "2020-08-04T07:18:21.973Z",
            "endTime": "2020-08-04T07:18:30.473Z",
            "conversationId": "5105430690791424"
        }
    ]
}

In the previous sample, the from object of each message identifies the speaker with a unique ID and a name. These speakers are the uniquely identified members of this conversation.

The speaker number in this sample is arbitrary and the number doesn’t necessarily reflect the order in which someone spoke.
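To illustrate how the from object can be used, the following sketch groups message text by speaker ID. It assumes only the response shape shown in the sample above. The function name group_text_by_speaker is hypothetical.

```python
from collections import defaultdict


def group_text_by_speaker(messages_response: dict) -> dict:
    """Map each speaker id to that speaker's name and message texts, in order."""
    grouped = defaultdict(lambda: {"name": None, "texts": []})
    for message in messages_response.get("messages", []):
        speaker = message["from"]
        entry = grouped[speaker["id"]]
        entry["name"] = speaker.get("name")
        entry["texts"].append(message["text"])
    return dict(grouped)


# Trimmed copy of the sample messages response (timestamps omitted).
sample = {
    "messages": [
        {
            "id": "4591723946704896",
            "text": "You're hired two words, everybody loves to hear.",
            "from": {"id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325", "name": "Speaker 2"},
        },
        {
            "id": "6328236401229824",
            "text": "But before we hear these words comes the interview today's video is part one in a series.",
            "from": {"id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325", "name": "Speaker 2"},
        },
    ]
}

for speaker_id, entry in group_text_by_speaker(sample).items():
    print(entry["name"], len(entry["texts"]))  # prints "Speaker 2 2"
```

Grouping by id rather than by name keeps the result stable if a member's name is updated later.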

Identify Unique Speakers

You can then use the Retrieve members from Conversations request to get the uniquely identified speakers for the conversation when Speaker Diarization is enabled.

Members request

curl --request GET \
     --url https://api.symbl.ai/v1/conversations/<CONVERSATION_ID>/members \
     --header 'Accept: application/json' \
     --header 'Authorization: Bearer <AUTH_TOKEN>'

const options = {
  method: 'GET',
  headers: {
    Accept: 'application/json',
    Authorization: 'Bearer <AUTH_TOKEN>'
  }
};

fetch('https://api.symbl.ai/v1/conversations/<CONVERSATION_ID>/members', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));

import requests

url = "https://api.symbl.ai/v1/conversations/<CONVERSATION_ID>/members"

headers = {
    "Accept": "application/json",
    "Authorization": "Bearer <AUTH_TOKEN>"
}

response = requests.get(url, headers=headers)

print(response.text)

Members response

{
    "members": [
        {
            "id": "9d6d34d9-5019-4694-9c9a-8ba7bfc8cfab",
            "name": "Speaker 1"
        },
        {
            "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
            "name": "Speaker 2"
        }
    ]
}

The name assigned to each uniquely identified speaker/member follows the format Speaker <number>, where <number> is arbitrary and does not necessarily reflect the order in which the speakers spoke.

The id identifies a speaker/member within that specific conversation. You can use it to update the details of that member, as described in the Updating the Detected Members section.
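As a small sketch of looking up the id for a given speaker, the helper below builds a name-to-id map from a members response with the shape shown above. The function name member_ids_by_name is hypothetical.

```python
def member_ids_by_name(members_response: dict) -> dict:
    """Return a {name: id} map built from a GET members response."""
    return {
        member["name"]: member["id"]
        for member in members_response.get("members", [])
    }


# Copy of the sample members response.
members_sample = {
    "members": [
        {"id": "9d6d34d9-5019-4694-9c9a-8ba7bfc8cfab", "name": "Speaker 1"},
        {"id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325", "name": "Speaker 2"},
    ]
}

print(member_ids_by_name(members_sample)["Speaker 2"])
```

Because names such as Speaker 2 can be changed later, treat this map as a snapshot and store the id itself for any follow-up requests.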

Updating the Detected Members

The detected members (unique speakers) have names like Speaker 1 because automatic speaker recognition has no context about who each speaker is (the name or other details of the speaker). Therefore, it is important to update the details of the detected speakers after the job is marked as complete.

GET members

The members call in the Conversation API returns the uniquely identified speakers, as shown in the Identify Unique Speakers section above, when Speaker Separation is enabled.

Let’s consider the same set of members, retrieved with the GET members call in the Conversation API.

👉 Retrieve Members from Conversations

JSON Response Example

{
    "members": [
        {
            "id": "9d6d34d9-5019-4694-9c9a-8ba7bfc8cfab",
            "name": "Speaker 1"
        },
        {
            "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
            "name": "Speaker 2"
        }
    ]
}

PUT members

We can now use the PUT members call to update the details of a specific member. The following call updates Speaker 2 from the previous section with the values in the request body:

👉 PUT Members Information

curl --location --request PUT "https://api.symbl.ai/v1/conversations/$CONVERSATION_ID/members/2f69f1c8-bf0a-48ef-b47f-95ae5a4de325" \
     --header 'Content-Type: application/json' \
     --header "Authorization: Bearer $AUTH_TOKEN" \
     --data-raw '{
            "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
            "email": "[email protected]",
            "name": "John Doe"
        }'

const authToken = AUTH_TOKEN; // replace AUTH_TOKEN with your access token string
const conversationId = 'your_conversation_id'  // conversationId returned by the Async API request
const memberId = 'your_member_id'  // id of a member returned by the GET members call
const url = `https://api.symbl.ai/v1/conversations/${conversationId}/members/${memberId}`;

const payload = {
    'id': "UUID_to_be_updated",  // Should be a valid UUID e.g. f170371e-d9db-4d55-9d49-a111a89cf078
    'email': "email_id_to_be_updated",  // Should be a valid emailId e.g. [email protected]
    'name': "name_to_be_updated"  // Should be a valid string e.g. John
}

const responses = {
  400: 'Bad Request! Please refer docs for correct input fields.',
  401: 'Unauthorized. Please generate a new access token.',
  404: 'The conversation and/or its metadata could not be found. Please check the input provided.',
  429: 'Maximum number of concurrent jobs reached. Please wait for some requests to complete.',
  500: 'Something went wrong! Please contact [email protected]'
}

const fetchData = {
  method: "PUT",
  headers: {
    'Authorization': `Bearer ${authToken}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify(payload),
}

fetch(url, fetchData).then(response => {
  if (response.ok) {
    return response.json();
  } else {
    throw new Error(responses[response.status]);
  }
}).then(response => {
  console.log('response', response);
}).catch(error => {
  console.error(error);
});

import json
import requests

baseUrl = "https://api.symbl.ai/v1/conversations/{conversationId}/members/{memberId}"
conversationId = 'your_conversation_id'  # conversationId returned by the Async API request
memberId = 'your_member_id'  # id of a member returned by the GET members call

url = baseUrl.format(conversationId=conversationId, memberId=memberId)

# set your access token here. See https://docs.symbl.ai/docs/developer-tools/authentication
access_token = 'your_access_token'

headers = {
    'Authorization': 'Bearer ' + access_token,
    'Content-Type': 'application/json'
}

payload = {
    'id': "UUID_to_be_updated",  # Should be a valid UUID e.g. f170371e-d9db-4d55-9d49-a111a89cf078
    'email': "email_id_to_be_updated",  # Should be a valid emailId e.g. [email protected]
    'name': "name_to_be_updated"  # Should be a valid string e.g. John
}

responses = {
    401: 'Unauthorized. Please generate a new access token.',
    404: 'The conversation and/or its metadata could not be found. Please check the input provided.',
    500: 'Something went wrong! Please contact [email protected]'
}

response = requests.request("PUT", url, headers=headers, data=json.dumps(payload))

if response.status_code == 200:
    # Successful API execution
    print(response.json()['message'])  # message containing status of response
elif response.status_code in responses.keys():
    print(responses[response.status_code])  # Expected error occurred
else:
    print("Unexpected error occurred. Please contact [email protected]" + ", Debug Message => " + str(response.text))

  • Replace CONVERSATION_ID with the actual Conversation ID (conversationId).

  • Replace AUTH_TOKEN with the Bearer token generated during the authentication process.

The URL contains the id of the member to update, and the request body contains the updated name of that member.

You can also include the email of the member. The email is used as an identifier for tracking that specific member uniquely in the conversation. (Refer to the Updating the Detected Members section for more details.)
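The sketch below assembles the URL and request body for the PUT members call, with the email included only when provided. The helper name build_member_update and the address john@example.com are placeholders for illustration. The URL pattern and body fields follow the samples above.

```python
from typing import Optional, Tuple


def build_member_update(conversation_id: str,
                        member_id: str,
                        name: str,
                        email: Optional[str] = None) -> Tuple[str, dict]:
    """Return the (url, payload) pair for a PUT members request."""
    url = (
        "https://api.symbl.ai/v1/conversations/"
        f"{conversation_id}/members/{member_id}"
    )
    # The member id appears both in the URL and in the body, as in the sample.
    payload = {"id": member_id, "name": name}
    if email is not None:
        payload["email"] = email  # optional identifier for the member
    return url, payload


url, payload = build_member_update(
    "your_conversation_id",
    "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
    "John Doe",
    email="john@example.com",  # placeholder address
)
print(url)
print(payload)
```

The returned pair can be passed to an HTTP client, for example requests.put(url, headers=headers, json=payload), with the same headers as in the samples above.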

When the call succeeds, it returns the following response:

{
    "message": "Member with id: 2f69f1c8-bf0a-48ef-b47f-95ae5a4de325 for conversationId: <CONVERSATION_ID> updated successfully! The update should be reflected in all messages and insights along with this conversation"
}

The message confirms that all references to the member with the id 2f69f1c8-bf0a-48ef-b47f-95ae5a4de325 in the conversation should now reflect the updated values. That includes insights, messages, and the conversation’s members.

So if we call the GET /members API now, we would see the following result:

{
    "members": [
        {
            "id": "9d6d34d9-5019-4694-9c9a-8ba7bfc8cfab",
            "name": "Speaker 1"
        },
        {
            "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
            "email": "[email protected]",
            "name": "John Doe"
        }
    ]
}

And similarly, with the GET /messages API call, we would see the updates reflected below as well:

{
    "messages": [
        {
            "id": "4591723946704896",
            "text": "You're hired two words, everybody loves to hear.",
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "email": "[email protected]",
                "name": "John Doe"
            },
            "startTime": "2020-08-04T07:18:17.573Z",
            "endTime": "2020-08-04T07:18:21.573Z",
            "conversationId": "5105430690791424"
        },
        {
            "id": "6328236401229824",
            "text": "But before we hear these words comes the interview today's video is part one in a series.",
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "email": "[email protected]",
                "name": "John Doe"
            },
            "startTime": "2020-08-04T07:18:21.973Z",
            "endTime": "2020-08-04T07:18:30.473Z",
            "conversationId": "5105430690791424"
        }
    ]
}
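One way to confirm that an update has propagated is to scan a messages response for entries from the updated member that still carry a different name. This is a minimal sketch over the response shape shown above. The function name stale_senders is hypothetical.

```python
def stale_senders(messages_response: dict, member_id: str, expected_name: str) -> list:
    """Return ids of messages from member_id whose from.name differs from expected_name."""
    return [
        message["id"]
        for message in messages_response.get("messages", [])
        if message["from"]["id"] == member_id
        and message["from"].get("name") != expected_name
    ]


# Trimmed copy of the updated messages response above.
updated = {
    "messages": [
        {
            "id": "4591723946704896",
            "from": {"id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325", "name": "John Doe"},
        },
        {
            "id": "6328236401229824",
            "from": {"id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325", "name": "John Doe"},
        },
    ]
}

print(stale_senders(updated, "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325", "John Doe"))  # prints []
```

An empty list means every message from that member already shows the new name.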

Curious about the GET /insights API? It would reflect these updates as well!

{
    "insights": [
        {
            "id": "5501181057040384",
            "text": "We need to go over three more common interview questions.",
            "type": "action_item",
            "score": 1,
            "messageIds": [
                "5710067261243392"
            ],
            "entities": [],
            "phrases": [
                {
                    "type": "action_phrase",
                    "text": "go over three more common interview questions"
                }
            ],
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "email": "[email protected]",
                "name": "John Doe"
            },
            "definitive": true,
            "assignee": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "name": "Speaker 2"
            }
        },
        {
            "id": "5519156904460288",
            "text": "How did you hear about this position?",
            "type": "question",
            "score": 0.999988666660899,
            "messageIds": [
                "4616389407014912"
            ],
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "email": "[email protected]",
                "name": "John Doe"
            }
        }
    ]
}

