
Introduction

Base URL

https://api.symbl.ai/v1

Symbl is a comprehensive suite of APIs for analyzing natural human conversations - both for your team’s internal conversations and of course the conversations you are having with your customers. Built on our Contextual Conversation Intelligence (C2I) technology, the APIs enable you to rapidly incorporate human-level understanding that goes beyond simple natural language processing of voice and text conversations.

Get real-time analysis of free-flowing discussions like meetings, sales calls, support conversations, emails, chat, social conversations, transcripts, etc to automatically surface highly relevant summary topics, contextual insights, suggestive action items, follow-ups, decisions and questions - without the use of upfront training data, wake words or custom classifiers.

Getting Started

Postman

Here's a quick Postman link for people who love to get their hands dirty.

Run in Postman

Ready to test? When you've built out your integration, review our testing tips to ensure that your integration works end to end.

Explore Sample Apps

Use any of the sample integrations as a starting point to extend your own custom applications.

If you're interested in everything you can do with Symbl, check out our sample applications on GitHub. They demonstrate how Symbl can connect to Twilio Media Streams, a Salesforce Dashboard, Outlook Calendar and many more.


Symbl for Twilio Flex

Use Twilio Flex as an inbound adapter for streaming audio through Symbl's WebSocket API.

Symbl for Zoom

Our app lets you invite Symbl to your Zoom meeting by simply pasting in the meeting invite.

Explore more

Browse our demo library and look at sample code on how to integrate voice intelligence into existing applications.

Authentication

If you don't already have your app id or app secret, log in to the platform to get your credentials.

To invoke any API call, you must have a valid Access Token generated using the valid application credentials.

To generate the token using your appId and appSecret, make an HTTP POST request with these details.

POST https://api.symbl.ai/oauth2/token:generate
{
  "type": "application",
  "appId": "your_appId",
  "appSecret": "your_appSecret"
}


curl -k -X POST "https://api.symbl.ai/oauth2/token:generate" \
     -H "accept: application/json" \
     -H "Content-Type: application/json" \
     -d "{ \"type\": \"application\", \"appId\": \"<appId>\", \"appSecret\": \"<appSecret>\"}"
 const request = require('request');

 const authOptions = {
   method: 'post',
   url: "https://api.symbl.ai/oauth2/token:generate",
   body: {
       type: "application",
       appId: "<appId>",
       appSecret: "<appSecret>"
   },
   json: true
 };

 request(authOptions, (err, res, body) => {
   if (err) {
     console.error('error posting json: ', err);
     throw err
   }

   console.log(JSON.stringify(body, null, 2));
 });

JavaScript code to generate the Access Token. The code should work with Node.js 7+ and browsers. You will need to install the request package for this sample code.

$ npm i request

For a valid appId and appSecret combination, a success response like the following will be returned.

 {
   "accessToken": "your_accessToken",
   "expiresIn": 3600
 }


For an invalid appId and appSecret combination, an HTTP 401 Unauthorized response code will be returned.
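The accessToken is then passed in the x-api-key header of subsequent API calls, as shown in the examples later in this document. Below is a minimal sketch of caching the token and refreshing it shortly before it expires, based on the expiresIn value from the response; the 60-second margin and variable names are illustrative.

const request = require('request');

let accessToken = null;

function refreshAccessToken() {
  request.post({
    url: 'https://api.symbl.ai/oauth2/token:generate',
    body: { type: 'application', appId: '<appId>', appSecret: '<appSecret>' },
    json: true
  }, (err, res, body) => {
    if (err) {
      return console.error('Error generating the Access Token.', err);
    }
    accessToken = body.accessToken;
    // Schedule a refresh one minute before expiry (expiresIn is in seconds).
    setTimeout(refreshAccessToken, (body.expiresIn - 60) * 1000);
  });
}

refreshAccessToken();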

How Tos

We know the biggest deterrent to adopting new technology is friction. We've created a series of step-by-step 'How tos' to get you to value quickly and easily.

Get Live Transcription - Phone Call (Node.js) Telephony

Get the live transcription in your Node.js application by making a call to a valid phone number.

Making a phone call is also the quickest way to test Symbl’s Telephony API. It can make an outbound call to a phone number using a traditional public switched telephony network (PSTN), any SIP trunks, or SIP endpoints that can be accessed over the internet using a SIP URI.

Set up Symbl

To get started, you’ll need your account credentials and Node.js installed (> v8.x) on your machine.

We’ll use the Symbl module for Node.js in this guide. Make sure you have a Node project set up. If you don’t have one, you can set one up using npm init.

From the root directory of your project, run the following command to add symbl-node to your project dependencies.

$ npm i --save symbl-node


Retrieve your credentials

Your credentials include your appId and appSecret. You can find them on the home page of the platform.

Initialize SDK

const {sdk, SpeakerEvent} = require("symbl-node");

sdk.init({
    // Your appId and appSecret https://platform.symbl.ai
    appId: 'your_appId',
    appSecret: 'your_appSecret'
}).then(async () => {
    console.log('SDK initialized.');
    try {
      // Your code goes here.
    } catch (e) {
        console.error(e);
    }
}).catch(err => console.error('Error in SDK initialization.', err));

In a new file called index.js in the project, add the following lines to import and initialize the SDK. Replace the appId and appSecret in the code with yours.

Make a phone call

The quickest way to test the Telephony API is to make a phone call to any valid phone number. The Telephony API only works with phone numbers in the U.S. and Canada.


const connection = await sdk.startEndpoint({
    endpoint: {
      type: 'pstn', // when making a regular phone call
      // Replace this with a real phone number
      phoneNumber: '1XXXXXXXXXX' // include country code, example - 19998887777
    }
  });
const {connectionId} = connection;
console.log('Successfully connected. Connection Id: ', connectionId);

To make a phone call, call startEndpoint with type set to 'pstn' and a valid U.S. or Canadian phone number in phoneNumber.

Subscribe to the Live Results

To get the live transcription, subscribe to the connection to receive the results. Call the subscribeToConnection method in the SDK, passing the connectionId and a callback that will be invoked for every new event, including the (live) transcription.

// Subscribe to connection using connectionId.
sdk.subscribeToConnection(connectionId, (data) => {
  const {type} = data;
  if (type === 'transcript_response') {
      const {payload} = data;

      // You get live transcription here!!
      process.stdout.write('Live: ' + (payload && payload.content) + '\r');

  } else if (type === 'message_response') {
      const {messages} = data;

      // You get processed messages in the transcript here!!! Real-time but not live! :)
      messages.forEach(message => {
          process.stdout.write('Message: ' + message.payload.content + '\n');
      });
  } else if (type === 'insight_response') {
      const {insights} = data;
      // You get any insights here!!!
      insights.forEach(insight => {
          process.stdout.write(`Insight: ${insight.type} - ${insight.text} \n\n`);
      });
  }
});

The callback you pass to subscribeToConnection receives a data object whose type field indicates which kind of response it is, so you can handle transcripts, messages and insights separately, as shown above.

End the Call

To end the call, you should make a stopEndpoint call. The following code stops the call after 60 seconds. Your business logic should determine when the call should end.

// Stop the call automatically after 60 seconds.
setTimeout(async () => {
  const connection = await sdk.stopEndpoint({ connectionId });
  console.log('Stopped the connection');
  console.log('Conversation ID:', connection.conversationId);
}, 60000); // Increase the 60000 if you want the call to continue for longer.

stopEndpoint returns an updated connection object containing the conversationId. You can use the conversationId to fetch the results even after the call, using the Conversation API.
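For example, once you have the conversationId, a minimal sketch of fetching the conversation's messages with the Conversation API could look like the following, using the request package and the same x-api-key authentication shown elsewhere in this document (the token variable is illustrative).

const request = require('request');

const conversationId = '<conversationId>'; // returned by stopEndpoint
const accessToken = '<your_auth_token>';

request.get({
  url: `https://api.symbl.ai/v1/conversations/${conversationId}/messages`,
  headers: { 'x-api-key': accessToken },
  json: true
}, (err, response, body) => {
  if (err) {
    return console.error('Error fetching messages.', err);
  }
  // Each message contains the spoken text along with speaker and timing details.
  body.messages.forEach(message => console.log(message.text));
});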

Code

You can find the complete code used in this guide here.

Test

To verify and check if the code is working:

  1. Run your code:
$ node index.js


  2. You should receive a phone call on the number you used in the startEndpoint call. Accept the call.

  3. Start speaking in English (the default language) and you should see the live transcription printed to the console in real time.

  4. The call should automatically end after 60 seconds. If you end it sooner without invoking stopEndpoint, you will not receive the conversationId. To access the results generated in the call, invoke stopEndpoint even if the call has already ended.

Note

The example above invokes stopEndpoint after a fixed timeout of 60 seconds. This demonstrates the stop functionality and is not the recommended implementation for your application. In a real implementation, you should invoke startEndpoint and stopEndpoint as needed by your application's business logic, i.e. when you would like Symbl to start processing and stop processing.

See also

Congratulations! You finished your integration with Symbl’s Telephony API using PSTN.

Get Speaker Separated Transcripts - Diarization with Async API

Enable the Speaker Diarization (Speaker Separation) for the Async Audio or Async Video APIs to get speaker separated transcripts and insights.

Enabling the Diarization

Enabling Speaker Separation in the Async Audio/Video API is as simple as adding the enableSpeakerDiarization=true and diarizationSpeakerCount=<NUMBER_OF_UNIQUE_SPEAKERS> query-parameters below:

$ curl --location --request POST 'https://api.symbl.ai/v1/process/video/url?enableSpeakerDiarization=true&diarizationSpeakerCount=2&webhookUrl=<WEBHOOK_URL>' --header 'Content-Type: application/json' --header 'x-api-key: <X-API-KEY>' --data-raw '{
    "url": "https://storage.googleapis.com/demo-conversations/interview-prep.mp4"
}'


The above snippet shows a cURL command for consuming the Async Video URL-based API, which takes in the url of a publicly available video file.

The above URL has two query-parameters:

  1. enableSpeakerDiarization=true which will enable the speaker separation for the Audio/Video data under consideration.

  2. diarizationSpeakerCount=2 which sets the number of unique speakers in the Audio/Video data under consideration.

Getting the uniquely identified speakers (Members)

Invoking the members call in the Conversation API will return the uniquely identified speakers for this conversation when Speaker Diarization is enabled. View a sample output below:

{
    "members": [
        {
            "id": "9d6d34d9-5019-4694-9c9a-8ba7bfc8cfab",
            "name": "Speaker 1"
        },
        {
            "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
            "name": "Speaker 2"
        }
    ]
}


The name assigned to a uniquely identified speaker/member from a Diarized Audio/Video will follow the format Speaker <number>, where <number> is arbitrary and does not necessarily reflect the order in which the speakers spoke.

The id can be used to identify a speaker/member for that specific conversation and can be used to update the details of that member, as demonstrated in the Updating the Detected Members section below.

Getting the Speaker Separated Results

Invoking the messages call in the Conversation API returns the speaker-separated results. A snippet of the response for the above request is shown below:

{
    "messages": [
        {
            "id": "4591723946704896",
            "text": "You're hired two words, everybody loves to hear.",
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "name": "Speaker 2"
            },
            "startTime": "2020-08-04T07:18:17.573Z",
            "endTime": "2020-08-04T07:18:21.573Z",
            "conversationId": "5105430690791424"
        },
        {
            "id": "6328236401229824",
            "text": "But before we hear these words comes the interview today's video is part one in a series.",
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "name": "Speaker 2"
            },
            "startTime": "2020-08-04T07:18:21.973Z",
            "endTime": "2020-08-04T07:18:30.473Z",
            "conversationId": "5105430690791424"
        },
        ...
    ]
}


The above snippet shows the speaker in the from object with a unique-id. These are the uniquely identified members of this conversation.

Similarly, invoking the insights call in the Conversation API would also reflect the identified speakers in the detected insights. The response below demonstrates this:

{
    "insights": [
        {
            "id": "5501181057040384",
            "text": "We need to go over three more common interview questions.",
            "type": "action_item",
            "score": 1,
            "messageIds": [
                "5710067261243392"
            ],
            "entities": [],
            "phrases": [
                {
                    "type": "action_phrase",
                    "text": "go over three more common interview questions"
                }
            ],
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "name": "Speaker 2"
            },
            "definitive": true,
            "assignee": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "name": "Speaker 2"
            }
        },
        {
            "id": "5519156904460288",
            "text": "How did you hear about this position?",
            "type": "question",
            "score": 0.999988666660899,
            "messageIds": [
                "4616389407014912"
            ],
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "name": "Speaker 2"
            }
        },
        ...
    ]
}


Updating the Detected Members

The detected members (unique speakers) have names like Speaker 1 because the ASR has no context about who the speaker actually is (their name or other details). Therefore, it is important to update the details of the detected speakers after the job is complete.

The members call in the Conversation API returns the uniquely identified speakers as shown in the Getting the uniquely identified speakers (Members) section above when the Diarization is enabled.

Let’s consider the same set of members that can be retrieved by calling the GET members call in the Conversation API.
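As a rough sketch, that GET members call could look like the following, using the request package and the same x-api-key authentication as the cURL examples in this section:

const request = require('request');

request.get({
  url: 'https://api.symbl.ai/v1/conversations/<CONVERSATION_ID>/members',
  headers: { 'x-api-key': '<X-API-KEY>' },
  json: true
}, (err, response, body) => {
  if (err) {
    return console.error('Error fetching members.', err);
  }
  // Prints the members array, as shown in the response below.
  console.log(JSON.stringify(body, null, 2));
});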

{
    "members": [
        {
            "id": "9d6d34d9-5019-4694-9c9a-8ba7bfc8cfab",
            "name": "Speaker 1"
        },
        {
            "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
            "name": "Speaker 2"
        }
    ]
}


We can now use the PUT members call to update the details of a specific member as shown below. This call updates the Speaker 2 member shown in the above section with the values in the cURL request body.

$ curl --location --request PUT 'https://api.symbl.ai/v1/conversations/<CONVERSATION_ID>/members/2f69f1c8-bf0a-48ef-b47f-95ae5a4de325' --header 'Content-Type: application/json' --header 'x-api-key: <X-API-KEY>' --data-raw '{
    "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
    "email": "john@example.com",
    "name": "John Doe"
}'


The URL above has the id of the member we want to update appended to /members, and the request body contains the updated name for this member.

There is also the option to include the email of the member. The email will be used as an identifier for uniquely tracking that specific member in the conversation. (Refer to the Appending to an existing conversation with Diarization section below for more details.)

After the above call is successful, we will receive the following response:

{
    "message": "Member with id: 2f69f1c8-bf0a-48ef-b47f-95ae5a4de325 for conversationId: <CONVERSATION_ID> updated successfully! The update should be reflected in all messages and insights along with this conversation"
}


The message is self-explanatory and tells us that all the references to the member with the id of 2f69f1c8-bf0a-48ef-b47f-95ae5a4de325 in the conversation should now reflect the new values we updated this member with. That includes insights, messages and the conversation’s members as well.

So if we call the members API now, we would see the following result:

{
    "members": [
        {
            "id": "9d6d34d9-5019-4694-9c9a-8ba7bfc8cfab",
            "name": "Speaker 1"
        },
        {
            "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
            "email": "john@example.com",
            "name": "John Doe"
        }
    ]
}


And similarly, with the messages API call, we would see the updates reflected below as well:

{
    "messages": [
        {
            "id": "4591723946704896",
            "text": "You're hired two words, everybody loves to hear.",
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "email": "john@example.com",
                "name": "John Doe"
            },
            "startTime": "2020-08-04T07:18:17.573Z",
            "endTime": "2020-08-04T07:18:21.573Z",
            "conversationId": "5105430690791424"
        },
        {
            "id": "6328236401229824",
            "text": "But before we hear these words comes the interview today's video is part one in a series.",
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "email": "john@example.com",
                "name": "John Doe"
            },
            "startTime": "2020-08-04T07:18:21.973Z",
            "endTime": "2020-08-04T07:18:30.473Z",
            "conversationId": "5105430690791424"
        },
        ...
    ]
}


Curious about the insights API? It would reflect these updates as well!

{
    "insights": [
        {
            "id": "5501181057040384",
            "text": "We need to go over three more common interview questions.",
            "type": "action_item",
            "score": 1,
            "messageIds": [
                "5710067261243392"
            ],
            "entities": [],
            "phrases": [
                {
                    "type": "action_phrase",
                    "text": "go over three more common interview questions"
                }
            ],
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "email": "john@example.com",
                "name": "John Doe"
            },
            "definitive": true,
            "assignee": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "name": "Speaker 2"
            }
        },
        {
            "id": "5519156904460288",
            "text": "How did you hear about this position?",
            "type": "question",
            "score": 0.999988666660899,
            "messageIds": [
                "4616389407014912"
            ],
            "from": {
                "id": "2f69f1c8-bf0a-48ef-b47f-95ae5a4de325",
                "email": "john@example.com",
                "name": "John Doe"
            }
        },
        ...
    ]
}


Appending to an existing conversation with Diarization

Because conversations don't always end neatly and may resume later, our Async API allows you to update/append to an existing conversation. You can read more about this capability here.

To enable Diarization with the append capability, the request structure is the same as shown above for creating a new Conversation. You would need to pass in enableSpeakerDiarization=true and diarizationSpeakerCount=<NUMBER_OF_UNIQUE_SPEAKERS> query-parameters.
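As a hedged sketch only, an append request with Diarization enabled might look like the snippet below. The exact append endpoint is documented in the Async API reference linked above; here we assume it is the same Async Video URL path with the existing conversationId appended, which you should verify against that reference.

const request = require('request');

// Assumption: the append endpoint is the Async Video URL path with the existing
// conversationId appended; confirm the exact path in the Async API reference.
request.put({
  url: 'https://api.symbl.ai/v1/process/video/url/<CONVERSATION_ID>'
     + '?enableSpeakerDiarization=true&diarizationSpeakerCount=2&webhookUrl=<WEBHOOK_URL>',
  headers: { 'x-api-key': '<X-API-KEY>' },
  body: {
    url: 'https://storage.googleapis.com/demo-conversations/interview-prep.mp4'
  },
  json: true
}, (err, response, body) => {
  if (err) {
    return console.error('Error appending to the conversation.', err);
  }
  console.log(JSON.stringify(body, null, 2));
});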

However, there is one caveat in how the ASR works with Diarization. Consider the scenario below:

An example scenario

We send a recorded conversation to the Async API with 2 speakers, John and Alice, with enableSpeakerDiarization=true. The diarization identifies them as Speaker 1 and Speaker 2 respectively. We then update the above speakers with their emails, john@example.com and alice@example.com.

Now we use the append call for appending another conversation with 2 speakers John and May with enableSpeakerDiarization=true. Let’s assume that the diarization would now identify these as Speaker 1 and Speaker 2 respectively. As discussed before, these numbers are arbitrary and have nothing to do with the order in which the speakers spoke in the conversation.

After this job is complete we will have 4 members in this conversation:

  1. John

  2. Alice

  3. Speaker 1 (which is John again)

  4. Speaker 2 (which is May)

Since John and Speaker 1 refer to the same speaker but are labeled as different speakers, their member references would be different for all messages and insights that they are a part of.

The email identifier

This is where the email identifier comes in. The PUT members call can uniquely identify and merge members that share the same email parameter, replacing any duplicate references with a single reference across the entire conversation and updating all the references, including the members, messages and insights.

If we were to execute a PUT members call with the below body, where 74001a1d-4e9e-456a-84ed-81bbd363333a refers to the id of Speaker 1 from the above scenario, it would eliminate that member and update all of their references to the member represented by 2f69f1c8-bf0a-48ef-b47f-95ae5a4de325, which we know is John Doe.

$ curl --location --request PUT 'https://api.symbl.ai/v1/conversations/<CONVERSATION_ID>/members/74001a1d-4e9e-456a-84ed-81bbd363333a' --header 'Content-Type: application/json' --header 'x-api-key: <X-API-KEY>' --data-raw '{
    "id": "74001a1d-4e9e-456a-84ed-81bbd363333a",
    "email": "john@example.com",
    "name": "John Doe"
}'


This is possible because the email uniquely identifies that user.

Best Practices

That’s it!

You have the steps to start separating the speakers in any Audio/Video-based conversation.

You can visit docs.symbl.ai to learn more about the following APIs:

  1. Async API

  2. Conversation API

Symbl SDK (Node.js)

The Programmable Voice SDK allows you to add Conversational Intelligence directly into your web applications and meeting platforms. With the Voice SDK, you can generate intelligent insights such as action items, topics and questions.

This section demonstrates how to use the SDK to add voice integration to your existing application.


Initialize the SDK

After installing, initialize the SDK.

 sdk.init({
    appId: 'yourAppId',
    appSecret: 'yourAppSecret',
    basePath: 'https://api.symbl.ai'
  })
  .then(() => console.log('SDK Initialized.'))
  .catch(err => console.error('Error in initialization.', err));

1. After getting your appId and appSecret, use the command below to install the SDK and add it to your npm project's package.json.

$ npm install --save symbl-node


2. Reference the SDK in either the ES5 or ES6 way.

The ES5 way

var sdk = require('symbl-node').sdk;

The ES6 way

import { sdk } from 'symbl-node';

Connect to Endpoints

The code snippet below dials in using PSTN and hangs up after 60 seconds.

const {
  sdk
} = require('symbl-node');

sdk.init({
  appId: 'yourAppId',
  appSecret: 'yourAppSecret',
  basePath: 'https://api.symbl.ai'
}).then(() => {
  sdk.startEndpoint({
    endpoint: {
      type: 'pstn', // This can be pstn or sip
      phoneNumber: '<number_to_call>', // include country code
      dtmf: '<meeting_id>' // if password protected, use "dtmf": "<meeting_id>#,#<password>#"
    }
  }).then(connection => {
    console.log('Successfully connected.', connection.connectionId);

    // Scheduling stop endpoint call after 60 seconds for demonstration purposes
    // In real adoption, sdk.stopEndpoint() should be called when the meeting or call actually ends

    setTimeout(() => {
      sdk.stopEndpoint({
        connectionId: connection.connectionId
      }).then(() => {
        console.log('Stopped the connection');
        console.log('Summary Info:', connection.summaryInfo);
        console.log('Conversation ID:', connection.conversationId);
      }).catch(err => console.error('Error while stopping the connection', err));
    }, 60000);
  }).catch(err => console.error('Error while starting the connection', err));

}).catch(err => console.error('Error in SDK initialization.', err));

We recommend using SIP instead of PSTN whenever possible, as it provides higher audio quality options. The SIP endpoint also accepts an optional audio configuration. Contact us for your specific requirements.

This SDK supports dialing in through a simple phone number (PSTN) or a Voice over IP system (SIP endpoint). If you don't have your own Voice over IP system, use a phone number to make the connection.

PSTN (Public Switched Telephone Networks)

The Public Switched Telephone Network (PSTN) is the network that carries your calls when you dial in from a landline or cell phone. It refers to the worldwide network of voice-carrying telephone infrastructure, including privately-owned and government-owned infrastructure.

endpoint: {
  type: 'pstn',
  phoneNumber: '14083380682', // Phone number to dial in
  dtmf: '6155774313#' // Joining code
}

SIP (Session Initiation Protocol)

Session Initiation Protocol (SIP) is a standardized communications protocol that has been widely adopted for managing multimedia communication sessions for voice and video calls. SIP may be used to establish connectivity between your communications infrastructures and Symbl's communications platform.

endpoint: {
  type: 'sip',
  uri: 'sip:555@your_sip_domain', // SIP URI to dial in
  audioConfig: { // Optionally any audio configuration
    sampleRate: 16000,
    encoding: 'PCMU',
    sampleSize: '16'
  }
}

Active Speaker Events

The example on the right shows how to connect to a PSTN endpoint, create a speakerEvent instance and push events on the connection.

const {
  sdk,
  SpeakerEvent
} = require('symbl-node');

sdk.init({
  appId: 'yourAppId',
  appSecret: 'yourAppSecret',
  basePath: 'https://api.symbl.ai'
}).then(() => {
  sdk.startEndpoint({
    endpoint: {
      type: 'pstn',
      phoneNumber: '<number_to_call>', // include country code
      dtmf: '<meeting_id>' // if password protected, use "dtmf": "<meeting_id>#,#<password>#"
    }
  }).then(connection => {
    const connectionId = connection.connectionId;
    console.log('Successfully connected.', connectionId);

    const speakerEvent = new SpeakerEvent();
    speakerEvent.type = SpeakerEvent.types.startedSpeaking;
    speakerEvent.user = {
      userId: 'john@example.com',
      name: 'John'
    };
    speakerEvent.timestamp = new Date().toISOString();

    sdk.pushEventOnConnection(
      connectionId,
      speakerEvent.toJSON(),
      (err) => {
        if (err) {
          console.error('Error during push event.', err);
        } else {
          console.log('Event pushed!');
        }
      }
    );

    // Scheduling stop endpoint call after 60 seconds for demonstration purposes
    // In real adoption, sdk.stopEndpoint() should be called when the meeting or call actually ends

    setTimeout(() => {
      sdk.stopEndpoint({
        connectionId: connection.connectionId,
      }).then(() => {
        console.log('Stopped the connection');
        console.log('Summary Info:', connection.summaryInfo);
        console.log('Conversation ID:', connection.conversationId);
      }).catch(err => console.error('Error while stopping the connection.', err));
    }, 60000);
  }).catch(err => console.error('Error while starting the connection', err));

}).catch(err => console.error('Error in SDK initialization.', err));

NOTE: Setting the timestamp for speakerEvent is optional, but providing accurate timestamps for when events occurred is recommended for better precision.

Events can be pushed to an on-going connection to have them processed. The code snippet to the right shows a simple example.

Every event must have a type to define the purpose of the event at a more granular level, usually to indicate different activities associated with the event resource. For example - A "speaker" event can have type as started_speaking. An event may have additional fields specific to the event.

Currently, Symbl only supports the speaker event which is described below.

Speaker Event

The speaker event is associated with different individual attendees in the meeting or session. An example of a speaker event is shown below.

In the code example, the user object needs a userId field to uniquely identify the user.

Speaker Event has the following types:

started_speaking

This event contains the details of the user who started speaking, along with an ISO 8601 timestamp of when they started speaking.

const speakerEvent = new SpeakerEvent({
  type: SpeakerEvent.types.startedSpeaking,
  timestamp: new Date().toISOString(),
  user: {
    userId: 'john@example.com',
    name: 'John'
  }
});

stopped_speaking

This event contains the details of the user who stopped speaking, along with an ISO 8601 timestamp of when they stopped speaking.

const speakerEvent = new SpeakerEvent({
  type: SpeakerEvent.types.stoppedSpeaking,
  timestamp: new Date().toISOString(),
  user: {
    userId: 'john@example.com',
    name: 'John'
  }
});


As shown in the above examples, it's OK to reuse the same speakerEvent instance per user and just change the event's type, which reduces the number of SpeakerEvent instances you create.

A startedSpeaking event is pushed on the ongoing connection. You can use the pushEventOnConnection() method from the SDK to push the events.

Subscribe to Real Time Events

The Voice SDK also lets you subscribe to real-time events when you connect to one of the endpoints specified in the above sections. These include:

  1. Real-Time Transcription
  2. Real-Time Insights
  3. Real-Time Messages
  4. Real-Time Intents

The below example shows how to achieve this.

Initialize SDK
const {sdk, SpeakerEvent} = require("symbl-node");

sdk.init({
    // Your appId and appSecret https://platform.symbl.ai
    appId: 'your_appId',
    appSecret: 'your_appSecret'
}).then(async () => {
    console.log('SDK initialized.');
    try {
      // Your code goes here.
    } catch (e) {
        console.error(e);
    }
}).catch(err => console.error('Error in SDK initialization.', err));


Add the above lines to import and initialize the SDK. Replace the appId and appSecret in the code. You can find them by signing up on the Symbl Developer Platform.

Make a phone call
const connection = await sdk.startEndpoint({
    endpoint: {
        type: 'pstn', // when making a regular phone call
        // Replace this with a real phone number
        phoneNumber: '1XXXXXXXXXX' // include country code
    }
});
const {connectionId} = connection;
console.log('Successfully connected. Connection Id: ', connectionId);


The above snippet makes a phone call by calling startEndpoint with type set to pstn and a valid US/Canada phone number in phoneNumber. You can also call in via type sip, and the steps below remain the same.

Subscribe to the Live Events
// Subscribe to connection using connectionId.
sdk.subscribeToConnection(connectionId, (data) => {
  const {type} = data;
  if (type === 'transcript_response') {
      const {payload} = data;

      // You get live transcription here!!
      process.stdout.write('Live: ' + (payload && payload.content) + '\r');

  } else if (type === 'message_response') {
      const {messages} = data;

      // You get processed messages in the transcript here!!! Real-time but not live! :)
      messages.forEach(message => {
          process.stdout.write('Message: ' + message.payload.content + '\n');
      });
  } else if (type === 'insight_response') {
      const {insights} = data;
      // See <link here> for more details on Insights
      // You get any insights here!!!
      insights.forEach(insight => {
          process.stdout.write(`Insight: ${insight.type} - ${insight.text} \n\n`);
      });
  }
});


The above snippet calls subscribeToConnection, which requires the connectionId of the call and a callback function as the second argument; the callback is invoked whenever any of the above events is available to be consumed. The data received contains the type of the event, which can be one of transcript_response, message_response or insight_response. Let's go over them one by one.

  1. transcript_response : This contains the real-time transcription data, which is available as soon as it's detected.

  2. message_response : This contains the array of transcripts of all the speakers, logically separated by punctuation, or by the speakers if Active Speaker Events are pushed.

  3. insight_response : This contains the array of all the insights detected in real time. These can be action items or questions.

There is also a fourth type of event, intent_response, which is covered in a separate example.

End the call
// Stop the call automatically after 60 seconds.
setTimeout(async () => {
  const connection = await sdk.stopEndpoint({ connectionId });
  console.log('Stopped the connection');
  console.log('Conversation ID:', connection.conversationId);
}, 60000); // Increase the 60000 if you want the call to continue for longer.


To end the call gracefully, we call stopEndpoint. The code snippet above simply stops the call after 60 seconds.

And we're done! That's how you can consume real-time events using Voice SDK!

The complete code for the example above can be found here

Language and TimeZone

You can specify languages other than English, from the supported set, for calls made via PSTN or SIP. Since the call can take place in a different time zone, you can also pass in the timeZone, which will be used to render the Summary UI specific to that language and time zone. The sub-sections below explain how to use these capabilities with the Voice SDK, with a complete example.

Utilising other languages

Voice SDK allows you to work with audio from multiple different languages. The currently supported languages are given below with the regions they are spoken in:

  1. English (United States)
  2. English (United Kingdom)
  3. English (Australia)
  4. French (Canada)
  5. German (Germany)
  6. Italian (Italy)
  7. Dutch (Netherlands)
  8. Japanese (Japan)
  9. Spanish (Latin America)
  10. French (France)

You can pass in the languages array in the startEndpoint as shown in the example linked below. Please note that currently only one language can be specified per call and support for detecting multiple languages in the same call will be added soon.

Specifying Time Zone

With calls taking place in different regions around the world, it can be important to capture and utilise that information. The Voice SDK allows you to pass in the timeZone in the startEndpoint call, which will render the Summary UI in that specific time zone. A list of all the time zones is available here

Passing Language and TimeZone

A complete example showcasing this capability can be found here
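As a rough sketch, passing a language and a time zone in startEndpoint might look like the snippet below. The languages and timezone parameter names are assumptions here; refer to the complete example linked above for the exact options.

const connection = await sdk.startEndpoint({
  endpoint: {
    type: 'pstn',
    phoneNumber: '<number_to_call>', // include country code
    dtmf: '<meeting_id>'
  },
  languages: ['ja-JP'],   // assumed parameter name; one supported language per call
  timezone: 'Asia/Tokyo'  // assumed parameter name; used to render the Summary UI in this time zone
});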

Complete Example

const {
  sdk,
  SpeakerEvent
} = require('symbl-node');

sdk.init({
  appId: 'yourAppId',
  appSecret: 'yourAppSecret',
  basePath: 'https://api.symbl.ai'
}).then(() => {

  console.log('SDK Initialized');
  sdk.startEndpoint({
    endpoint: {
      type: 'pstn',
      phoneNumber: '14087407256',
      dtmf: '6327668#'
    }
  }).then(connection => {

    const connectionId = connection.connectionId;
    console.log('Successfully connected.', connectionId);
    const speakerEvent = new SpeakerEvent({
      type: SpeakerEvent.types.startedSpeaking,
      user: {
        userId: 'john@example.com',
        name: 'John'
      }
    });

    setTimeout(() => {
      speakerEvent.timestamp = new Date().toISOString();
      sdk.pushEventOnConnection(
        connectionId,
        speakerEvent.toJSON(),
        (err) => {
          if (err) {
            console.error('Error during push event.', err);
          } else {
            console.log('Event pushed!');
          }
        }
      );
    }, 2000);

    setTimeout(() => {
      speakerEvent.type = SpeakerEvent.types.stoppedSpeaking;
      speakerEvent.timestamp = new Date().toISOString();

      sdk.pushEventOnConnection(
        connectionId,
        speakerEvent.toJSON(),
        (err) => {
          if (err) {
            console.error('Error during push event.', err);
          } else {
            console.log('Event pushed!');
          }
        }
      );
    }, 12000);

    // Scheduling stop endpoint call after 60 seconds
    setTimeout(() => {
      sdk.stopEndpoint({
        connectionId: connection.connectionId
      }).then(() => {
        console.log('Stopped the connection');
        console.log('Summary Info:', connection.summaryInfo);
        console.log('Conversation ID:', connection.conversationId);
      }).catch(err => console.error('Error while stopping the connection.', err));
    }, 90000);

  }).catch(err => console.error('Error while starting the connection', err));

}).catch(err => console.error('Error in SDK initialization.', err));

Below is a quick simulated speaker event example that:

  1. Initializes the SDK
  2. Initiates a connection with an endpoint
  3. Sends a speaker event of type startedSpeaking for user John
  4. Sends a speaker event of type stoppedSpeaking for user John
  5. Ends the connection with the endpoint

Strictly for illustration and understanding purposes, the code to the right pushes events periodically by simply using the setTimeout() method, but in real usage they should be pushed as they occur.

const {
  sdk,
  SpeakerEvent
} = require('symbl-node');

sdk.init({
  appId: 'yourAppId',
  appSecret: 'yourAppSecret',
  basePath: 'https://api.symbl.ai'
}).then(() => {
  console.log('SDK Initialized');
  sdk.startEndpoint({
    endpoint: {
      type: 'pstn',
      phoneNumber: '14087407256',
      dtmf: '6327668#'
    },
    actions: [{
      "invokeOn": "stop",
      "name": "sendSummaryEmail",
      "parameters": {
        "emails": [
          "john@exmaple.com",
          "mary@example.com",
          "jennifer@example.com"
        ]
      }
    }],
    data: {
      session: {
        name: 'My Meeting Name' // Title of the Meeting
      },
      users: [{
          user: {
            name: "John",
            userId: "john@example.com",
            role: "organizer"
          }
        },
        {
          user: {
            name: "Mary",
            userId: "mary@example.com"
          }
        },
        {
          user: {
            name: "John",
            userId: "jennifer@example.com"
          }
        }
      ]
    }
  }).then((connection) => {
    console.log('Successfully connected.');

    // Events pushed in between
    setTimeout(() => {
      // After a successful stopEndpoint call, an email with the summary will be sent to the addresses listed in the actions above
      sdk.stopEndpoint({
        connectionId: connection.connectionId
      }).then(() => {
        console.log('Stopped the connection');
        console.log('Summary Info:', connection.summaryInfo);
        console.log('Conversation ID:', connection.conversationId);
      }).catch(err => console.error('Error while stopping the connection.', err));
    }, 30000);

  }).catch(err => console.error('Error while starting the connection', err));

}).catch(err => console.error('Error in SDK initialization.', err));

Send Summary Email

This is an example of the summary page you can expect to receive at the end of your call

Summary Page

Take a look at the Sample Summary UI which is generated after a meeting is concluded.

Tuning your Summary Page

You can choose to tune your Summary Page with the help of query parameters to play with different configurations and see how the results look.

Query Parameters

You can configure the summary page by passing in the configuration through query parameters in the summary page URL that gets generated at the end of your meeting. See the end of the URL in this example:

https://meetinginsights.symbl.ai/meeting/#/eyJ1...I0Nz?insights.minScore=0.95&topics.orderBy=position

Query Parameter | Default Value | Supported Values | Description
insights.minScore | 0.8 | 0.5 to 1.0 | Minimum score that the summary page should use to render the insights
insights.enableAssignee | false | [true, false] | Enable or disable rendering of the assignee and due date of the insight
insights.enableAddToCalendarSuggestion | true | [true, false] | Enable or disable the add-to-calendar suggestion when applicable on insights
insights.enableInsightTitle | true | [true, false] | Enable or disable the title of an insight. The title indicates the originating person of the insight and, if applicable, the assignee of the insight.
topics.enabled | true | [true, false] | Enable or disable the summary topics in the summary page
topics.orderBy | 'score' | ['score', 'position'] | Ordering of the topics.

score - order topics by the topic importance score.

position - order the topics by the position in the transcript where they surfaced for the first time
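For example, to lower the insight score threshold and turn off the summary topics, you could append those parameters to the same URL:

https://meetinginsights.symbl.ai/meeting/#/eyJ1...I0Nz?insights.minScore=0.6&topics.enabled=false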

Streaming Audio in Real Time

This section covers streaming audio in real time using the Voice SDK. You can use this API to pass in audio via a single stream, and it also supports sending multiple isolated streams of audio, each of which can contain one or more speakers' audio data.

You can also consume the processed results in real-time, which include

  1. Real Time Transcription
  2. Real Time Insights (Action Items and Questions)
  3. When using multiple audio streams (Each stream for 1 speaker) you also get access to speaker-separated data (including transcription and messages)

Example with Single Stream

The example below utilises the mic package to stream audio in real time. This will be a single stream of audio obtained through mic, which may contain one or more speakers' audio. The complete example can be found here

Import required packages
const {sdk} = require('symbl-node');
const uuid = require('uuid').v4;
// For demo purposes, we're using mic to simply get audio from microphone and pass it on to websocket connection
const mic = require('mic');


In the above snippet we import the sdk, uuid and mic npm packages. The uuid package is used for generating a unique ID to represent this stream and it's strongly recommended to use it. The mic package is used to obtain the audio stream in real-time to pass to the SDK.

Initialise an instance of mic
const sampleRateHertz = 16000;
const micInstance = mic({
    rate: sampleRateHertz,
    channels: '1',
    debug: false,
    exitOnSilence: 6
});


We now declare the sampleRateHertz variable to specify the sample rate of the audio obtained from the mic. It is imperative to use the same sample rate for initialising the mic package and for passing to the startRealtimeRequest of the Voice SDK, as we will see below. Otherwise the transcription will be completely inaccurate.

We also initialise mic with channels: '1' (mono channel) audio as currently only mono channel audio data is supported.

Initialise the Voice SDK
// Initialize the SDK
await sdk.init({
    appId: '__appId__',
    appSecret: '__appSecret__',
    basePath: 'https://api.symbl.ai'
});
// Need unique Id
const id = uuid();


Next, we initialise a helper function to execute our code in the async/await style; the following code snippets (including the one just above) are part of the same function. We initialise the Voice SDK with the init call, passing in the appId and appSecret, which you can obtain by signing up on the Symbl Developer Platform. We also initialise the id variable with the uuid function to generate the unique ID required for this stream, as mentioned in the imports section above.
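For reference, the helper can be a plain async IIFE; a minimal sketch of the wrapper looks like this:

(async () => {
    // Initialize the SDK
    await sdk.init({
        appId: '__appId__',
        appSecret: '__appSecret__',
        basePath: 'https://api.symbl.ai'
    });
    // Need unique Id
    const id = uuid();

    // ...the startRealtimeRequest call and the snippets below go here...
})().catch(err => console.error('Error in the real-time request.', err));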

Call the startRealtimeRequest
// Start Real-time Request (Uses Realtime WebSocket API behind the scenes)
const connection = await sdk.startRealtimeRequest({
    id,
    insightTypes: ["action_item", "question"],
    config: {
        meetingTitle: 'My Test Meeting',
        confidenceThreshold: 0.7,
        timezoneOffset: 480, // Offset in minutes from UTC
        languageCode: "en-US",
        sampleRateHertz
    },
    speaker: { // Optional, if not specified, will simply not send an email in the end.
        userId: 'john.doe@example.com', // Update with valid email
        name: 'John'
    },
    handlers: {
        'onSpeechDetected': (data) => {
            console.log(JSON.stringify(data));
            // For live transcription
            if (data) {
                const {punctuated} = data;
                console.log('Live: ', punctuated && punctuated.transcript);
            }
        },
        'onMessageResponse': (data) => {
            // When a processed message is available
            console.log('onMessageResponse', JSON.stringify(data));
        },
        'onInsightResponse': (data) => {
            // When an insight is detected
            console.log('onInsightResponse', JSON.stringify(data));
        }
    }
});


The next call is to startRealtimeRequest of the Voice SDK and includes various parameters. Let's break down the configuration and take a look at the parameters one by one.

  1. id: The unique ID that represents this stream. (This needs to be unique, which is why we are using uuid)

  2. insightTypes: This array represents the type of insights that are to be detected. Today the supported ones are action_item and question.

  3. config: This configuration object encapsulates the properties which directly relate to the conversation generated by the audio being passed.

    a. meetingTitle : This optional parameter specifies the name of the conversation generated. You can get more info on conversations here

    b. confidenceThreshold : This optional parameter specifies the confidence threshold for detecting the insights. Only the insights that have confidenceScore more than this value will be returned.

    c. timezoneOffset : This specifies the actual timezoneOffset used for detecting the time/date related entities.

    d. languageCode : It specifies the language to be used for transcribing the audio in BCP-47 format. (Needs to be same as the language in which audio is spoken)

    e. sampleRateHertz : It specifies the sampleRate for this audio stream.

  4. speaker: Optionally specify the details of the speaker whose data is being passed in the stream. This enables an e-mail with the Summary UI URL to be sent after the end of the stream.

  5. handlers: This object has the callback functions for different events.

    a. onSpeechDetected: To retrieve the real-time transcription results as soon as they are detected. We can use this callback to render live transcription which is specific to the speaker of this audio stream.

    b. onMessageResponse: This callback function contains the "finalized" transcription data for this speaker; if used with multiple streams of other speakers, it also provides their messages. "Finalized" means that the ASR has finalised the state of this part of the transcription and declared it "final".

    c. onInsightResponse: This callback provides any of the detected insights in real time as they are detected. As with onMessageResponse above, this would also return every speaker's insights in the case of multiple streams.

Retrieve audio data from mic
console.log('Successfully connected.');

const micInputStream = micInstance.getAudioStream();
micInputStream.on('data', (data) => {
    // Push audio from Microphone to websocket connection
    connection.sendAudio(data);
});

console.log('Started listening to Microphone.');


When startRealtimeRequest returns successfully, the connection has been established with the passed configuration. In the above snippet we obtain the audio data from micInputStream and, as it's received, relay it to the active connection instance we now have with the Voice SDK.

Stop the stream
setTimeout(async () => {
    // Stop listening to microphone
    micInstance.stop();
    console.log('Stopped listening to Microphone.');
    try {
        // Stop connection
        const conversationData = await connection.stop();
        console.log('Conversation ID: ' + conversationData.conversationId);
        console.log('Connection Stopped.');
    } catch (e) {
        console.error('Error while stopping the connection.', e);
    }
}, 60 * 1000); // Stop connection after 1 minute i.e. 60 secs


For the purpose of demoing a continuous audio stream, we simulate a stop on the above stream after 60 seconds. connection.stop() closes the active connection and triggers the optional email if the speaker config was included. The conversationData variable includes the conversationId you can use with the Conversation API to retrieve this conversation's data.

And that's it! This marks the completion of streaming audio in real-time (Single Audio Stream) with Voice SDK. The complete code for the example explained above can be found here

With Multiple Streams

The same example explained above can be deployed on multiple machines, each with one speaker, to simulate the multiple-streams use case. The only thing that needs to be common is the unique ID created in the above example, which is used to initialize the startRealtimeRequest request.

Having this unique ID in common across all the different streams ensures that the audio streams of all the speakers are bound to the context of a single conversation. This conversation can be retrieved by its conversationId via the Conversation API, and it will include the data of all the speakers who connected using the same common ID.
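Conceptually, every machine (one per speaker) runs the same code as the single-stream example, only with a pre-agreed shared ID instead of a freshly generated uuid, and with its own speaker details. A minimal sketch follows; the shared ID value and speaker details are illustrative.

// Use the SAME pre-agreed ID on every machine instead of generating a new uuid per process.
const id = '<SHARED_UNIQUE_ID>';

const connection = await sdk.startRealtimeRequest({
    id, // common across all speakers' streams, so they are bound to one conversation
    insightTypes: ['action_item', 'question'],
    config: {
        meetingTitle: 'My Test Meeting',
        confidenceThreshold: 0.7,
        languageCode: 'en-US',
        sampleRateHertz: 16000
    },
    speaker: {
        userId: 'alice@example.com', // each stream carries its own speaker's details
        name: 'Alice'
    },
    handlers: {
        onMessageResponse: (data) => console.log('onMessageResponse', JSON.stringify(data))
    }
});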

Real Time Telephony API

The Voice API provides the REST interface for adding Symbl to your call and generating actionable insights from your conversations. The Telephony API only allows phone numbers in the USA and Canada.

POST Telephony

Example API Call

curl -k -X POST "https://api.symbl.ai/v1/endpoint:connect" \
     -H "accept: application/json" \
     -H "Content-Type: application/json" \
     -H "x-api-key: <your_auth_token>" \
     -d @location_of_fileName_with_request_payload
const request = require('request');

const payload = {
  "operation": "start",
  "endpoint": {
    "type" : "pstn",
    "phoneNumber": "<number_to_call>", // include country code
    "dtmf": "<meeting_id>" // if password protected, use "dtmf": "<meeting_id>#,#<password>#"
  },
  "actions": [{
    "invokeOn": "stop",
    "name": "sendSummaryEmail",
    "parameters": {
      "emails": [
        "joe.symbl@example.com"
      ]
    }
  }],
  "data" : {
      "session": {
          "name" : "My Meeting"
      }
  }
}

const your_auth_token = "<your_auth_token>";

request.post({
    url: 'https://api.symbl.ai/v1/endpoint:connect',
    headers: {'x-api-key': your_auth_token},
    body: payload,
    json: true
}, (err, response, body) => {
  console.log(body);
});

The above command returns an object structured like this:

{
    "eventUrl": "https://api.symbl.ai/v1/event/771a8757-eff8-4b6c-97cd-64132a7bfc6e",
    "resultWebSocketUrl": "wss://api.symbl.ai/events/771a8757-eff8-4b6c-97cd-64132a7bfc6e",
    "connectionId": "771a8757-eff8-4b6c-97cd-64132a7bfc6e",
    "conversationId": "51356232423"
}

The Telephony Voice API allows you to easily use Symbl's Language Insights capabilities.

It exposes the functionality of Symbl to dial-in to the conference. Supported endpoints are given below. Additionally, events can be passed for further processing. The supported types of events are discussed in detail in the section below.

HTTP REQUEST

POST https://api.symbl.ai/v1/endpoint:connect

Request Parameters

Parameter | Type | Description
operation | string | enum([start, stop]) - Start or Stop connection
endpoint | object | Object containing the type of the session - either pstn or sip, phoneNumber which is the meeting number Symbl should call (with country code prepended), and dtmf which is the conference passcode.
actions | list | Actions that should be performed while this connection is active. Currently only one action is supported - sendSummaryEmail
data | object | Object containing a session object which has a field name corresponding to the name of the meeting

Response Object

Field | Description
eventUrl | REST API to push speaker events as the conversation is in progress, to add additional speaker context in the conversation. Example - in an ongoing meeting, you can push speaker events.
resultWebSocketUrl | Same as eventUrl but over WebSocket. The latency of events is lower with a dedicated WebSocket connection.
connectionId | Ephemeral connection identifier of the request, to uniquely identify the telephony connection. Once the connection is stopped using the "stop" operation, or is closed for some other reason, the connectionId is no longer valid.
conversationId | Represents the conversation - this is the ID that needs to be used in the Conversation API to access the conversation.
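As a hedged sketch only, pushing a speaker event to the returned eventUrl could look like the snippet below. It assumes the eventUrl accepts the same speaker event JSON shape used with SpeakerEvent in the SDK section above; verify the exact payload against the events API reference.

const request = require('request');

request.post({
  url: '<eventUrl>', // the eventUrl returned by the endpoint:connect call
  headers: { 'x-api-key': '<your_auth_token>' },
  body: {
    // Assumed payload: the speaker event shape shown in the SDK section.
    type: 'started_speaking',
    user: { userId: 'john@example.com', name: 'John' },
    timestamp: new Date().toISOString()
  },
  json: true
}, (err, response, body) => {
  if (err) {
    return console.error('Error pushing the speaker event.', err);
  }
  console.log(body);
});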


To play around with a few examples, we recommend a REST client called Postman. Simply tap the button below to import a pre-made collection of examples.

Run in Postman

Try it out

When you have started the connection through the API, try speaking the following sentences and view the summary email that gets generated:

Real Time WebSocket API

In the example below, we've used the websocket npm package for the WebSocket client, and mic for getting the raw audio from the microphone.

$ npm i websocket mic

For this example, we are using your mic to stream audio data. You will most likely want to use other inbound sources for this.

const WebSocketClient = require('websocket').client;

const mic = require('mic');

const micInstance = mic({
  rate: '44100',
  channels: '1',
  debug: false,
  exitOnSilence: 6
});

// Get input stream from the microphone
const micInputStream = micInstance.getAudioStream();
let connection = undefined;

Create a websocket client instance

const ws = new WebSocketClient();

ws.on('connectFailed', (err) => {
  console.error('Connection Failed.', err);
});

ws.on('connect', (connection) => {

  // Start the microphone
  micInstance.start();

  connection.on('close', () => {
    console.log('WebSocket closed.')
  });

  connection.on('error', (err) => {
    console.log('WebSocket error.', err)
  });

  connection.on('message', (data) => {
    if (data.type === 'utf8') {
      const {
        utf8Data
      } = data;
    console.log(utf8Data);  // Print the data for illustration purposes
    }
  });

  console.log('Connection established.');

  connection.send(JSON.stringify({
    "type": "start_request",
    "insightTypes": ["question", "action_item"],
    "config": {
      "confidenceThreshold": 0.9,
      // "timezoneOffset": 480, // Your timezone offset from UTC in minutes
      "languageCode": "en-US",
      "speechRecognition": {
        "encoding": "LINEAR16",
        "sampleRateHertz": 44100 // Make sure the correct sample rate is provided for best results
      },
      "meetingTitle": "Client Meeting"
    },
    "speaker": {
      "userId": "jane.doe@example.com",
      "name": "Jane"
    }
  }));

  micInputStream.on('data', (data) => {
    connection.send(data);
  });
});

For this example, we time out our call after 2 minutes, but you would most likely want to make the stop_request call when your WebSocket connection ends.

  // Schedule the stop of the client after 2 minutes (120 sec)

  setTimeout(() => {
    micInstance.stop();
    // Send stop request
    connection.sendUTF(JSON.stringify({
      "type": "stop_request"
    }));
    connection.close();
  }, 120000);

Generate the token and replace it in the placeholder <accessToken>. Once the code is running, start speaking and you should see the message_response and insight_response messages getting printed on the console.

ws.connect(
  'wss://api.symbl.ai/v1/realtime/insights/1',
  null,
  null,
  { 'X-API-KEY': '<accessToken>' }
);

In the example below, we've used a Websocket that is compatible with most browsers and doesn't need any additional npm packages. Generate the token and replace it in the placeholder <accessToken>. Then, create the Websocket.

let url = `wss://api.symbl.ai/v1/realtime/insights/1?access_token=${'<accessToken>'}`
let ws = new WebSocket(url);

ws.onerror = (err) => {
  console.error('Connection Failed.', err);
};
ws.onopen = () => {
  console.log('Websocket open.')

  ws.onmessage = (event) => {
    if (event.type === 'message') {
      console.log(event.data);  // Print the data for illustration purposes
    }
  };

  ws.onclose = () => {
    console.log('WebSocket closed.')
  };

  ws.onerror = (err) => {
    console.log('WebSocket error.', err)
  };

  console.log('Connection established.');

  ws.send(JSON.stringify({
    "type": "start_request",
    "insightTypes": ["question", "action_item"],
    "config": {
      "confidenceThreshold": 0.9,
      // "timezoneOffset": 480, // Your timezone offset from UTC in minutes
      "languageCode": "en-US",
      "speechRecognition": {
        "encoding": "LINEAR16",
        "sampleRateHertz": 44100 // Make sure the correct sample rate is provided for best results
      },
      "meetingTitle": "Client Meeting"
    },
    "speaker": {
      "userId": "jane.doe@example.com",
      "name": "Jane"
    }
  }));
}

To get direct access to the mic, we're going to use an API in the WebRTC specification called getUserMedia().

Once the code is running, start speaking and you should see the message_response and insight_response messages getting printed on the console.

  const handleSuccess = function(stream) {
    const context = new AudioContext();
    const source = context.createMediaStreamSource(stream);
    const processor = context.createScriptProcessor(1024, 1, 1);
    source.connect(processor);
    processor.connect(context.destination);
    processor.onaudioprocess = function(e) {
      // convert to 16-bit payload
      const inputData = e.inputBuffer.getChannelData(0);
      const targetBuffer = new Int16Array(inputData.length);
      for (let index = 0; index < inputData.length; index++)
          targetBuffer[index] = 32767 * Math.min(1, inputData[index]);
      // Send to websocket
      if(ws.readyState === WebSocket.OPEN){
          ws.send(targetBuffer.buffer);
      }
    };
  };

  navigator.mediaDevices.getUserMedia({ audio: true, video: false })
    .then(handleSuccess);

  // Schedule the stop of the client after 2 minutes (120 sec)
  setTimeout(() => {
    // Send stop request
    ws.send(JSON.stringify({
      "type": "stop_request"
    }));
    ws.close();
  }, 120000);
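
The browser example above only logs the raw event data. As a minimal sketch (assuming the message_response and insight_response payload shapes documented later in this section), the handler can parse each text frame and branch on its type field:

// Minimal sketch: parse incoming text frames and branch on the "type" field.
// Assumes the message_response / insight_response shapes shown later in this document.
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);

  if (data.type === 'message_response') {
    // Each message carries the speaker and the transcribed text
    data.messages.forEach((msg) => {
      console.log(`${msg.from.name}: ${msg.payload.content}`);
    });
  } else if (data.type === 'insight_response') {
    // Each insight carries its type (question / action_item), text and confidence
    data.insights.forEach((insight) => {
      console.log(`[${insight.type}] ${insight.text} (confidence: ${insight.confidence})`);
    });
  }
};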

Introduction

The WebSocket-based real-time API by Symbl is the most direct, fastest and most accurate of all the interfaces for pushing an audio stream in real time and getting the results back as soon as they're available.

Connection Establishment

This is a WebSocket endpoint, and hence it starts as an HTTP request that contains HTTP headers indicating the client's desire to upgrade the connection to a WebSocket instead of using HTTP semantics. The server indicates its willingness to participate in the WebSocket connection by returning an HTTP 101 Switching Protocols response. After this handshake, both client and service keep the socket open and begin using a message-based protocol to send and receive information. Please refer to the WebSocket Specification RFC 6455 for a more in-depth understanding of the handshake process.

Message Formats

Both the client and the server can send messages after the connection is established. According to RFC 6455, WebSocket messages can have either a text or a binary encoding. The two encodings use different on-the-wire formats. Each format is optimized for efficient encoding, transmission, and decoding of the message payload.

Text Message

Text messages over WebSocket must use UTF-8 encoding. A text message is a serialized JSON message. Every text message has a type field to specify the type or the purpose of the message.

Binary Message

Binary WebSocket messages carry a binary payload. For the Real-time API, audio is transmitted to the service by using binary messages. All other messages are text messages.

Client Messages

This section describes the messages that originate from the client and are sent to service. The types of messages sent by the client are start_request, stop_request and binary messages containing audio.

Configuration

Main Message Body

Field Required Supported Values Description
type true start_request, stop_request Type of message
insightTypes false action_item, question Types of insights to return. If not provided, no insights will be returned.
config false Configuration for this request. See the config section below for more details.
speaker false Speaker identity to use for audio in this WebSocket connection. If omitted, no speaker identification will be used for processing. See below.

config

Field Required Supported Values Default Value Description
confidenceThreshold false 0.0 - 1.0 0.5 Minimum confidence score that must be met for the API to consider an insight valid. If not provided, defaults to 0.5, i.e. 50% or more.
languageCode false en-US The language code as per the BCP 47 specification
speechRecognition false Speech recognition configuration for the audio in this WebSocket connection. See the speechRecognition section below for more details.

speechRecognition

Field Required Supported Values Default Value Description
encoding false LINEAR16, FLAC, MULAW LINEAR16 Audio Encoding in which the audio will be sent over the WebSocket.
sampleRateHertz false 16000 The sample rate of the incoming audio stream, in Hertz.

speaker

Field Required Description
userId false Any user identifier for the user.
name false Display name of the user.

Messages

Start Request

{
  "type": "start_request",
  "insightTypes": ["question", "action_item"],
  "config": {
    "confidenceThreshold": 0.9,
    "languageCode": "en-US",
    "speechRecognition": {
      "encoding": "LINEAR16",
      "sampleRateHertz": 16000
    }
  },
  "speaker": {
    "userId": "jane.doe@example.com",
    "name": "Jane"
  }
}


This is a request to start the processing after the connection is established. Right after this message has been sent, the audio should be streamed; any binary audio streamed before the receipt of this message will be ignored.

Stop Request

{
  "type": "stop_request"
}


This is a request to stop the processing. After the receipt of this message, the service will stop any processing and close the WebSocket connection.

Example of the message_response object

{
  "type": "message_response",
  "messages": [
    {
      "from": {
        "name": "Jane",
        "userId": "jane.doe@example.com"
      },
      "payload": {
        "content": "I was very impressed by your profile, and I am excited to know more about you.",
        "contentType": "text/plain"
      }
    },
    {
      "from": {
        "name": "Jane",
        "userId": "jane.doe@example.com"
      },
      "payload": {
        "content": "So tell me, what is the most important quality that you acquired over all of your professional career?",
        "contentType": "text/plain"
      }
    }
  ]
}

Sending Binary Messages with Audio

The client needs to send the audio to the service by converting the audio stream into a series of audio chunks. Each chunk carries a segment of audio that needs to be processed. The maximum size of a single audio chunk is 8,192 bytes.
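
As a minimal sketch (assuming you already have a Buffer of raw audio and the open connection from the Node.js example above), the chunking could look like this:

// Minimal sketch: split a raw audio Buffer into chunks of at most 8,192 bytes
// (the maximum size of a single audio chunk) and send each one as a binary message.
// Assumes `connection` is the open connection from the WebSocket example above.
const MAX_CHUNK_SIZE = 8192;

function sendAudioBuffer(connection, audioBuffer) {
  for (let offset = 0; offset < audioBuffer.length; offset += MAX_CHUNK_SIZE) {
    const chunk = audioBuffer.slice(offset, offset + MAX_CHUNK_SIZE);
    connection.send(chunk);  // Buffers are sent as binary WebSocket messages
  }
}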

Service Messages

This section describes the messages that originate in Service and are sent to the client.

The service mainly sends two types of messages (message_response, insight_response) to the client as soon as they're available.

Message Response

The message_response contains the processed messages as soon as they're ready and available while the continuous audio stream is being processed. This message does not contain any insights.

Insight Response

Example of the insight_response object

{
  "type": "insight_response",
  "insights": [
    {
      "type": "question",
      "text": "So tell me, what is the most important quality that you acquired over all of your professional career?",
      "confidence": 0.9997962117195129,
      "hints": [],
      "tags": []
    },
    {
      "type": "action_item",
      "text": "Jane will look into the requirements on the hiring for coming financial year.",
      "confidence": 0.9972074778643447,
      "hints": [],
      "tags": [
        {
          "type": "person",
          "text": "Jane",
          "beginOffset": 0,
          "value": {
            "value": {
              "name": "Jane",
              "alias": "Jane",
              "userId": "jane.doe@symbl.ai"
            }
          }
        }
      ]
    }
  ]
}

The insight_response contains the insights from the ongoing conversation as soon as they are available. This message does not contain any messages.

Async API

The Async API provides a REST interface to allow you to run a job asynchronously in order to process insights out of audio and text files.

Text API

POST Async Text API

The Async Text API allows you to process any text payload to get the transcription and conversational insights. It can be useful in any use case where you have access to the textual content of a type of conversation, and you want to extract the insightful items supported by the Conversation API. If you want to add more content to the same conversation, use PUT Async Text API.

Use the POST API to upload your content and generate a Conversation ID. If you want to append additional content to the same Conversation ID, use the PUT API.

Example API call

curl --location --request POST 'https://api.symbl.ai/v1/process/text' \
--header 'x-api-key: <generated_valid_token>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "messages": [
    {
      "payload": {
        "content": "Hi Mike, Natalia here. Hope you don’t mind me reaching out. Who would be the best possible person to discuss internships and student recruitment at ABC Corp? Would you mind pointing me toward the right person and the best way to reach them? Thanks in advance for your help, I really appreciate it!"
      },
      "from": {
        "userId": "natalia@example.com",
        "name": "Natalia"
      }
    },
    {
      "payload": {
        "content": "Hey Natalia, thanks for reaching out. I am connecting you with Steve who handles recruitements for us."
      },
      "from": {
        "userId": "mike@abccorp.com",
        "name": "Mike"
      }
    }
  ]
}'
const request = require('request');

const options = {
  'method': 'POST',
  'url': 'https://api.symbl.ai/v1/process/text',
  'headers': {
    'Content-Type': 'application/json',
    'x-api-key': '<your_auth_token>'
  },
  body: JSON.stringify({
    "messages": [
      {
        "payload": { "content": "Okay" },
        "from": {
          "name": "John",
          "userId": "john@example.com"
        }
      },
      {
        "payload": {
          "content": "Hello, this is Peter from Vodafone, How can I help you today?. My name is Sam, and I've been gone for more than two years. I'm really interested in upgrading to the latest iPhone. Can you tell me about some options? For quality assurance and training purposes. This call may be monitored and recorded. May I have your current phone number and the complete name and address of the current account My number is 1 2 3 5 5 5 7 8 9 0 and my address is 122 Raymer Avenue Seattle 98010. Thank you for the confirmation. Being a loyal customer there are three types of plan options that I can offer you today. Do you already know what you're looking for or would you prefer a recommendation?"
        },
        "from": {
          "name": "John",
          "userId": "john@example.com"
        }
      },
      // ....
    ],
    "confidenceThreshold": 0.5
  })
};

request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});

The above request returns a response structured like this:

{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}

HTTP REQUEST

POST https://api.symbl.ai/v1/process/text

Request Headers

Header Name Required Value
x-api-key Yes your_auth_token
Content-Type Yes application/json

Request Body

Field Required Type Supported Values Default Description
messages Yes list Input Messages to look for insights. See the messages section below for more details.
confidenceThreshold No double 0.0 to 1.0 0.5 Minimum required confidence for the insight to be recognized.

messages

Field Required Type Description
payload Yes object The payload of the message. See the payload section below for more details.
from No object Information about the user who produced the content of this message.
duration No object Duration object containing startTime and endTime for the transcript.

payload

Field Required Type Default Description
contentType No string (MIME type) text/plain To indicate the type and/or format of the content. Please see RFC 6838 for more details. Currently only text/plain is supported.
content Yes string The content of the message in the format specified in the contentType field.

from (user)

Field Required Type Description
name No string Name of the user.
userId No string A unique identifier of the user. E-mail ID is usually a preferred identifier for the user.

duration

Field Required Type Description
startTime No DateTime The start time for the particular text content.
endTime No DateTime The end time for the particular text content.
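
As an illustration only (the names, email and timestamps below are made up), a single message combining payload, from and the optional duration object could look like this:

// Illustrative sketch only - the names, email and timestamps are made up.
// Shows a single message combining payload, from and the optional duration object.
const body = {
  messages: [
    {
      payload: {
        contentType: "text/plain",
        content: "Let's schedule the follow-up call for next Tuesday."
      },
      from: {
        name: "Natalia",
        userId: "natalia@example.com"
      },
      duration: {
        startTime: "2020-07-10T11:16:04.824Z",
        endTime: "2020-07-10T11:16:09.124Z"
      }
    }
  ],
  confidenceThreshold: 0.6
};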

Query Params

Parameter Required Value
webhookUrl No Webhook URL to which job updates are sent. (This should be a POST endpoint.)

The webhookUrl will be used to send the status of the job created for the submitted content. Every time the status of the job changes, a notification will be sent to the webhookUrl.

Response Object

Field Description
conversationId ID to be used with Conversation API
jobId ID to be used with Job API

Response on reaching limit

Field Description
Payload { "message" : "This API has a limit of maximum of 5 number of concurrent jobs per account. If you are looking to scale, and need more concurrent jobs than this limit, please contact us at support@symbl.ai" }
Header { "statusCode" : 429 }

Webhook Payload

Field Description
jobId ID to be used with Job API
status Current status of the job. (Valid statuses - [ scheduled, in_progress, completed ])
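
A minimal sketch of a webhook receiver that logs the jobId and status fields above (it uses Node's built-in http module; the port is an assumption, and webhookUrl should point to wherever you host it):

// Minimal sketch of a webhook receiver using Node's built-in http module.
// The port is an assumption - point webhookUrl at wherever this is deployed.
const http = require('http');

http.createServer((req, res) => {
  if (req.method === 'POST') {
    let raw = '';
    req.on('data', (chunk) => { raw += chunk; });
    req.on('end', () => {
      const { jobId, status } = JSON.parse(raw);
      console.log(`Job ${jobId} is now ${status}`);  // scheduled, in_progress or completed
      res.writeHead(200);
      res.end();
    });
  } else {
    res.writeHead(405);
    res.end();
  }
}).listen(3000);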

PUT Async Text API

The Async Text API allows you to process an additional text payload, append it to the transcription of a previous conversation, and get updated conversational insights. It can be useful in any use case where you have access to the textual content of a type of conversation, and you want to extract the insightful items supported by the Conversation API.

Use the POST API to upload your content and generate a Conversation ID. If you want to append additional content to the same Conversation ID, use the PUT API.

Example API call

const request = require('request');

const options = {
  'method': 'PUT',
  'url': 'https://api.symbl.ai/v1/process/text/' + your_conversationId,
  'headers': {
    'Content-Type': 'application/json',
    'x-api-key': '<your_auth_token>'
  },
  'body': JSON.stringify({
    "messages": [
      {
        "payload": { "content": "Okay" },
        "from": {
          "name": "John",
          "userId": "john@example.com"
        }
      },
      {
        "payload": {
          "content": "Hello, this is Peter from Vodafone, How can I help you today?. My name is Sam, and I've been gone for more than two years. I'm really interested in upgrading to the latest iPhone. Can you tell me about some options? For quality assurance and training purposes. This call may be monitored and recorded. May I have your current phone number and the complete name and address of the current account My number is 1 2 3 5 5 5 7 8 9 0 and my address is 122 Raymer Avenue Seattle 98010. Thank you for the confirmation. Being a loyal customer there are three types of plan options that I can offer you today. Do you already know what you're looking for or would you prefer a recommendation?"
        },
        "from": {
          "name": "John",
          "userId": "john@example.com"
        }
      },
      // ....
    ],
    "confidenceThreshold": 0.5
  })
};

request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});

The above request returns a response structured like this:

{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}

HTTP REQUEST

PUT https://api.symbl.ai/v1/process/text/:conversationId

Request Headers

Header Name Required Value
x-api-key Yes your_auth_token
Content-Type Yes application/json

Path Params

Parameter Value
conversationId The conversationId returned by the POST Async Text API for the text content.

Request Body

Field Required Type Supported Values Default Description
messages Yes list Input Messages to look for insights. See the messages section below for more details.
confidenceThreshold No double 0.0 to 1.0 0.5 Minimum required confidence for the insight to be recognized.

messages

Field Required Type Description
payload Yes object The payload of the message. See the payload section below for more details.
from No object Information about the user who produced the content of this message.
duration No object Duration object containing startTime and endTime for the transcript.

payload

Field Required Type Default Description
contentType No string (MIME type) text/plain To indicate the type and/or format of the content. Please see RFC 6838 for more details. Currently only text/plain is supported.
content Yes string The content of the message in the format specified in the contentType field.

from (user)

Field Required Type Description
name No string Name of the user.
userId No string A unique identifier of the user. E-mail ID is usually a preferred identifier for the user.

duration

Field Required Type Description
startTime No DateTime The start time for the particular text content.
endTime No DateTime The end time for the particular text content.

Query Params

Parameter Required Value
webhookUrl No Webhook URL to which job updates are sent. (This should be a POST endpoint.)

The webhookUrl will be used to send the status of the job created for the submitted content. Every time the status of the job changes, a notification will be sent to the webhookUrl.

Response Object

Field Description
conversationId ID to be used with Conversation API
jobId ID to be used with Job API

Webhook Payload

Field Description
jobId ID to be used with Job API
status Current status of the job. (Valid statuses - [ scheduled, in_progress, completed, failed ])

Audio API

Table of contents: Async Audio API
  1. POST Async Audio API
  2. PUT Async Audio API
  3. POST Async Audio URL API
  4. PUT Async Audio URL API

System requirement: NodeJS 7+

POST Async Audio API

The Async Audio API allows you to process an audio file and return the full text transcript along with conversational insights. It can be utilized for any use case where you have access to recorded audio and want to extract insights and other conversational attributes supported by Symbl's Conversation API.

Use the POST API to upload your file and generate a Conversation ID. If you want to append additional audio information to the same Conversation ID, use the PUT API.

Example API call - The sample request accepts just the raw audio file in the request body, with the MIME type set in the Content-Type header. The audio file should have only a mono channel.

# Wave file
curl --location --request POST 'https://api.symbl.ai/v1/process/audio?webhookUrl=<your_webhook_url>' \
--header 'Content-Type: audio/wav' \
--header 'x-api-key: <generated_valid_token>' \
--data-binary '@/file/location/audio.wav'

# MP3 File
curl --location --request POST 'https://api.symbl.ai/v1/process/audio?webhookUrl=<your_webhook_url>' \
--header 'Content-Type: audio/mpeg' \
--header 'x-api-key: <generated_valid_token>' \
--data-binary '@/file/location/audio.mp3'
const request = require('request');
const fs = require('fs');
const accessToken = "<your_auth_token>";
const audioFileStream = fs.createReadStream('/file/location/audio.wav');

const audioOption = {
  url: 'https://api.symbl.ai/v1/process/audio',
  headers: {
    'x-api-key': accessToken,
    'Content-Type': 'audio/wav'
   },
  qs: {
    webhookUrl: `https://your_webhook_url`,
  },
  json: true,
};

audioFileStream.pipe(request.post(audioOption, (err, response, body) => {
  console.log(err, body);
}));

The above request returns a response structured like this:

{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}

HTTP REQUEST

POST https://api.symbl.ai/v1/process/audio

Request Headers

Header Name Required Value
x-api-key Yes your_auth_token
Content-Type Yes Describes the format and codec of the provided audio data. Accepted values are audio/wav and audio/mpeg

Query Params

Parameter Required Value
webhookUrl No Webhook URL to which job updates are sent. (This should be a POST endpoint.)
customVocabulary No Contains a list of words and phrases that provide hints to the speech recognition task.

The webhookUrl will be used to send the status of the job created for the uploaded audio. Every time the status of the job changes, a notification will be sent to the webhookUrl.

Response Object on Success

Field Description
conversationId ID to be used with Conversation API
jobId ID to be used with Job API
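
If you don't want to rely on the webhook, you can poll the job status with the returned jobId before calling the Conversation API. The sketch below assumes the Job API is exposed as GET https://api.symbl.ai/v1/job/{jobId} and returns a status field; treat the endpoint and response shape as assumptions.

// Hedged sketch: poll the job status until it completes, then use the conversationId.
// The endpoint GET https://api.symbl.ai/v1/job/{jobId} and its { status } response are assumptions.
const request = require('request');

const accessToken = '<your_auth_token>';

function waitForJob(jobId, onDone) {
  request.get({
    url: `https://api.symbl.ai/v1/job/${jobId}`,
    headers: { 'x-api-key': accessToken },
    json: true
  }, (err, response, body) => {
    if (err) throw err;
    if (body.status === 'completed') {
      onDone();
    } else {
      setTimeout(() => waitForJob(jobId, onDone), 5000);  // poll again in 5 seconds
    }
  });
}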

Response on reaching limit

Field Description
Payload { "message" : "This API has a limit of maximum of 5 number of concurrent jobs per account. If you are looking to scale, and need more concurrent jobs than this limit, please contact us at support@symbl.ai" }
Header { "statusCode" : 429 }

Webhook Payload

Field Description
jobId ID to be used with Job API
status Current status of the job. (Valid statuses - [ scheduled, in_progress, completed, failed ])

PUT Async Audio API

The Async Audio API allows you to process an additional audio file for the previous conversation, append its transcription and get conversational insights for the updated conversation. It can be useful in any use case where you have access to multiple audio files of any type of conversation, and you want to extract the insightful items supported by the Conversation API.

Use the POST API to upload your file and generate a Conversation ID. If you want to append additional audio information to the same Conversation ID, use the PUT API.

Example API call - The sample request accepts just the raw audio file in the request body, with the MIME type set in the Content-Type header. The audio file should have only a mono channel.

# Wave file
curl --location --request PUT 'https://api.symbl.ai/v1/process/audio/:conversationId?webhookUrl=<your_webhook_url>' \
--header 'Content-Type: audio/wav' \
--header 'x-api-key: <generated_valid_token>' \
--data-binary '@/file/location/audio.wav'

# MP3 File
curl --location --request PUT 'https://api.symbl.ai/v1/process/audio/:conversationId?webhookUrl=<your_webhook_url>' \
--header 'Content-Type: audio/mpeg' \
--header 'x-api-key: <generated_valid_token>' \
--data-binary '@/file/location/audio.mp3'
const request = require('request');
const fs = require('fs');

const accessToken = "<your_auth_token>";
const audioFileStream = fs.createReadStream('/file/location/audio.wav');

const audioOption = {
  url: 'https://api.symbl.ai/v1/process/audio/' + your_conversationId,
  headers: {
    'x-api-key': accessToken,
    'Content-Type': 'audio/wav'
   },
  qs: {
    webhookUrl: `https://your_webhook_url`,
  },
  json: true,
};

audioFileStream.pipe(request.put(audioOption, (err, response, body) => {
  console.log(err, body);
}));

The above request returns a response structured like this:

{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}

HTTP REQUEST

PUT https://api.symbl.ai/v1/process/audio/:conversationId

Request Headers

Header Name Required Value
x-api-key Yes your_auth_token
Content-Type Yes Describes the format and codec of the provided audio data. Accepted values are audio/wav and audio/mpeg

Path Params

Parameter value
conversationId conversationId which is provided by the first request submitted using POST async audio API

Query Params

Parameter Required Value
webhookUrl No Webhook URL to which job updates are sent. (This should be a POST endpoint.)
customVocabulary No Contains a list of words and phrases that provide hints to the speech recognition task.

The webhookUrl will be used to send the status of the job created for the uploaded audio. Every time the status of the job changes, a notification will be sent to the webhookUrl.

Response Object

Field Description
conversationId ID to be used with Conversation API
jobId ID to be used with Job API

Webhook Payload

Field Description
jobId ID to be used with Job API
status Current status of the job. (Valid statuses - [ scheduled, in_progress, completed, failed ])

POST Async Audio URL API

The Async Audio URL API takes in a URL to your audio and returns the full text transcript along with conversational insights. It can be utilized for any use case where you have access to recorded audio stored at a publicly accessible URL and want to extract insights and other conversational attributes supported by Symbl's Conversation API.

Use the POST API to submit your URL and generate a Conversation ID. If you want to append additional audio information to the same Conversation ID, use the PUT API.

Example API call

curl --location --request POST 'https://api.symbl.ai/v1/process/audio/url?webhookUrl=<your_webhook_url>' \
--header 'Content-Type: application/json' \
--header 'x-api-key: <generated_valid_token>' \
--data-raw '{
  "url": "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_audio_file.wav",
  "confidenceThreshold": 0.6,
  "timezoneOffset": 0
}'
const request = require('request');
const fs = require('fs');

const accessToken = "<your_auth_token>";

const audioOption = {
  url: 'https://api.symbl.ai/v1/process/audio/url',
  headers: {
    'x-api-key': accessToken,
    'Content-Type': 'application/json'
   },
  qs: {
    webhookUrl: `https://your_webhook_url`,
  },
  body: JSON.stringify({
    "url": "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_audio_file.wav",
    "confidenceThreshold": 0.6,
    "timezoneOffset": 0
  })
};

request.post(audioOption, (err, response, body) => {
  console.log(err, body);
});

The above request returns a response structured like this:

{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}

HTTP REQUEST

POST https://api.symbl.ai/v1/process/audio/url

Request Headers

Header Name Required Value
x-api-key Yes your_auth_token
Content-Type Yes Accepted value application/json

Request Body

Field Required Type Supported Values Default Description
url Yes String [] A valid url string. The URL must be a publicly accessible url.
customVocabulary No list [] Contains a list of words and phrases that provide hints to the speech recognition task.
confidenceThreshold No double 0.0 to 1.0 0.5 Minimum required confidence for the insight to be recognized.
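
For instance, a hedged sketch of a request body that also passes customVocabulary (the vocabulary terms below are illustrative only), to be sent as the JSON body of the POST shown above:

// Hedged sketch: Async Audio URL request body that also passes customVocabulary.
// The vocabulary terms below are illustrative only.
const body = {
  url: "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_audio_file.wav",
  confidenceThreshold: 0.6,
  customVocabulary: ["Symbl", "diarization", "conversation intelligence"]
};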

Query Params

Parameter Required Value
webhookUrl No Webhook URL to which job updates are sent. (This should be a POST endpoint.)

The webhookUrl will be used to send the status of the job created for the submitted audio URL. Every time the status of the job changes, a notification will be sent to the webhookUrl.

Response Object on Success

Field Description
conversationId ID to be used with Conversation API
jobId ID to be used with Job API

Response on reaching limit

Field Description
Payload { "message" : "This API has a limit of maximum of 5 number of concurrent jobs per account. If you are looking to scale, and need more concurrent jobs than this limit, please contact us at support@symbl.ai" }
Header { "statusCode" : 429 }

Webhook Payload

Field Description
jobId ID to be used with Job API
status Current status of the job. (Valid statuses - [ scheduled, in_progress, completed, failed ])

PUT Async Audio URL API

The Async Audio URL API allows you to append an additional audio URL to the previous conversation, append its transcription and get conversational insights for the updated conversation. It can be useful in any use case where you have multiple recordings of a conversation stored at publicly accessible URLs and want to extract the insightful items supported by the Conversation API.

Use the POST API to add your URL and generate a Conversation ID. If you want to append additional audio information to the same Conversation ID, use the PUT API.

Example API call

curl --location --request PUT 'https://api.symbl.ai/v1/process/audio/url/:conversationId?webhookUrl=<your_webhook_url>' \
--header 'Content-Type: application/json' \
--header 'x-api-key: <generated_valid_token>' \
--data-raw '{
  "url": "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_audio_file.wav",
  "confidenceThreshold": 0.6,
  "timezoneOffset": 0
}'
const request = require('request');
const fs = require('fs');

const accessToken = "<your_auth_token>";

const audioOption = {
  url: 'https://api.symbl.ai/v1/process/audio/url/' + your_conversationId,
  headers: {
    'x-api-key': accessToken,
    'Content-Type': 'application/json'
   },
  qs: {
    webhookUrl: `https://your_webhook_url`,
  },
  body: JSON.stringify({
    "url": "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_audio_file.wav",
    "confidenceThreshold": 0.6,
    "timezoneOffset": 0
  })
};

request.put(audioOption, (err, response, body) => {
  console.log(err, body);
});

The above request returns a response structured like this:

{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}

HTTP REQUEST

PUT https://api.symbl.ai/v1/process/audio/url/:conversationId

Request Headers

Header Name Required Value
x-api-key Yes your_auth_token
Content-Type Yes Accepted value is application/json

Request Body

Field Required Type Supported Values Default Description
url Yes String [] A valid url string. The URL must be a publicly accessible url.
customVocabulary No list [] Contains a list of words and phrases that provide hints to the speech recognition task.
confidenceThreshold No double 0.0 to 1.0 0.5 Minimum required confidence for the insight to be recognized.

Path Params

Parameter value
conversationId conversationId which is provided by the first request submitted using POST async audio API

Query Params

Parameter Required Value
webhookUrl No Webhook URL to which job updates are sent. (This should be a POST endpoint.)

The webhookUrl will be used to send the status of the job created for the submitted audio URL. Every time the status of the job changes, a notification will be sent to the webhookUrl.

Response Object

Field Description
conversationId ID to be used with Conversation API
jobId ID to be used with Job API

Webhook Payload

Field Description
jobId ID to be used with Job API
status Current status of the job. (Valid statuses - [ scheduled, in_progress, completed, failed ])

Video API

Table of contents: Async Video API
  1. POST Async Video API
  2. PUT Async Video API
  3. POST Async Video URL API
  4. PUT Async Video URL API

System requirement: NodeJS 7+

POST Async Video API

The Async Video API takes in a video file and returns the full text transcript along with conversational insights. It can be useful in any use case where you have access to the video file of any type of conversation, and you want to extract the insightful items supported by the Conversation API.

Example API call - The sample request accepts a raw video file in the request body, with the MIME type set in the Content-Type header.

# MP4 file
curl --location --request POST 'https://api.symbl.ai/v1/process/video?webhookUrl=<your_webhook_url>' \
--header 'Content-Type: video/mp4' \
--header 'x-api-key: <generated_valid_token>' \
--data-binary '@/file/location/your_video.mp4'
const request = require('request');
const fs = require('fs');

const accessToken = "<your_auth_token>";
const videoFileStream = fs.createReadStream('/file/location/video.mp4');

const videoOption = {
  url: 'https://api.symbl.ai/v1/process/video',
  headers: {
    'x-api-key': accessToken,
    'Content-Type': 'video/mp4'
   },
  qs: {
    webhookUrl: `https://your_webhook_url`,
  },
  json: true,
};

videoFileStream.pipe(request.post(videoOption, (err, response, body) => {
  console.log(err, body);
}));

The above request returns a response structured like this:

{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}

HTTP REQUEST

POST https://api.symbl.ai/v1/process/video

Request Headers

Header Name Required Value
x-api-key Yes your_auth_token
Content-Type Yes Describes the format and codec of the provided video. Accepted value video/mp4

Query Params

Parameter Required Value
webhookUrl No Webhook URL to which job updates are sent. (This should be a POST endpoint.)
customVocabulary No Contains a list of words and phrases that provide hints to the speech recognition task.

The webhookUrl will be used to send the status of the job created for the uploaded video. Every time the status of the job changes, a notification will be sent to the webhookUrl.

Response Object on Success

Field Description
conversationId ID to be used with Conversation API
jobId ID to be used with Job API

Response on reaching limit

Field Description
Payload { "message" : "Too Many Requests" }
Header { "statusCode" : 429 }

Webhook Payload

Field Description
jobId ID to be used with Job API
status Current status of the job. (Valid statuses - [ scheduled, in_progress, completed, failed ])

PUT Async Video API

The Async Video API allows you to process an additional video file for the previous conversation, append its transcription and get conversational insights for the updated conversation. It can be useful in any use case where you have access to multiple video files of any type of conversation, and you want to extract the insightful items supported by the Conversation API.

Example API call - The sample request accepts just the raw video file in the request body, with the MIME type set in the Content-Type header.

# MP4 File
curl --location --request PUT 'https://api.symbl.ai/v1/process/video/:conversationId?webhookUrl=<your_webhook_url>' \
--header 'Content-Type: video/mp4' \
--header 'x-api-key: <generated_valid_token>' \
--data-binary '@/file/location/your_video.mp4'
const request = require('request');
const fs = require('fs');

const accessToken = "<your_auth_token>";
const videoFileStream = fs.createReadStream('/file/location/video.mp4');

const videoOption = {
  url: 'https://api.symbl.ai/v1/process/video/' + your_conversationId,
  headers: {
    'x-api-key': accessToken,
    'Content-Type': 'video/mp4'
   },
  qs: {
    webhookUrl: `https://your_webhook_url`,
  },
  json: true,
};

videoFileStream.pipe(request.put(videoOption, (err, response, body) => {
  console.log(err, body);
}));

The above request returns a response structured like this:

{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}

HTTP REQUEST

PUT https://api.symbl.ai/v1/process/video/:conversationId

Request Headers

Header Name Required Value
x-api-key Yes your_auth_token
Content-Type Yes Describes the format and codec of the provided video data. Accepted value video/mp4

Path Params

Parameter value
conversationId conversationId which is provided by the first request submitted using POST async video API

Query Params

Parameter Required Value
webhookUrl No Webhook URL to which job updates are sent. (This should be a POST endpoint.)
customVocabulary No Contains a list of words and phrases that provide hints to the speech recognition task.

The webhookUrl will be used to send the status of the job created for the uploaded video. Every time the status of the job changes, a notification will be sent to the webhookUrl.

Response Object

Field Description
conversationId ID to be used with Conversation API
jobId ID to be used with Job API

Webhook Payload

Field Description
jobId ID to be used with Job API
status Current status of the job. (Valid statuses - [ scheduled, in_progress, completed, failed ])

POST Async Video URL API

The Async Video URL API allows you to process an mp4 video and return the full text transcript along with conversational insights. It can be utilized for any use case where you have access to recorded video stored at a publicly accessible URL and want to extract insights and other conversational attributes supported by Symbl's Conversation API.



Use the POST API to submit your URL and generate a Conversation ID. If you want to append additional video information to the same Conversation ID, use the PUT API.

Example API call

curl --location --request POST 'https://api.symbl.ai/v1/process/video/url?webhookUrl=<your_webhook_url>' \
--header 'Content-Type: application/json' \
--header 'x-api-key: <generated_valid_token>' \
--data-raw '{
  "url": "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_video_file.mp4",
  "confidenceThreshold": 0.6,
  "timezoneOffset": 0
}'
const request = require('request');
const fs = require('fs');

const accessToken = "<your_auth_token>";

const videoOption = {
  url: 'https://api.symbl.ai/v1/process/video/url',
  headers: {
    'x-api-key': accessToken,
    'Content-Type': 'application/json'
   },
  qs: {
    webhookUrl: `https://your_webhook_url`,
  },
  body: JSON.stringify({
    "url": "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_video_file.mp4",
    "confidenceThreshold": 0.6,
    "timezoneOffset": 0
  })
};

request.post(videoOption, (err, response, body) => {
  console.log(err, body);
});

The above request returns a response structured like this:

{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}

HTTP REQUEST

POST https://api.symbl.ai/v1/process/video/url

Request Headers

Header Name Required Value
x-api-key Yes your_auth_token
Content-Type Yes Accepted value application/json

Request Body

Field Required Type Supported Values Default Description
url Yes String [] A valid url string. The URL must be a publicly accessible url.
customVocabulary No list [] Contains a list of words and phrases that provide hints to the speech recognition task.
confidenceThreshold No double 0.0 to 1.0 0.5 Minimum required confidence for the insight to be recognized.

Query Params

Parameter Required Value
webhookUrl No Webhook URL to which job updates are sent. (This should be a POST endpoint.)

The webhookUrl will be used to send the status of the job created for the submitted video URL. Every time the status of the job changes, a notification will be sent to the webhookUrl.

Response Object on Success

Field Description
conversationId ID to be used with Conversation API
jobId ID to be used with Job API

Response on reaching limit

Field Description
Payload { "message" : "This API has a limit of maximum of 5 number of concurrent jobs per account. If you are looking to scale, and need more concurrent jobs than this limit, please contact us at support@symbl.ai" }
Header { "statusCode" : 429 }

Webhook Payload

Field Description
jobId ID to be used with Job API
status Current status of the job. (Valid statuses - [ scheduled, in_progress, completed, failed ])

PUT Async Video URL API

The Async Video URL API allows you to append an additional video URL to the previous conversation, append its transcription and get conversational insights for the updated conversation.

It can be useful in any use case where you have multiple recordings of a conversation stored at publicly accessible URLs and want to extract the insightful items supported by the Conversation API.

Use the POST API to add your URL and generate a Conversation ID. If you want to append additional video information to the same Conversation ID, use the PUT API.

Example API call

curl --location --request PUT 'https://api.symbl.ai/v1/process/video/url/:conversationId?webhookUrl=<your_webhook_url>' \
--header 'Content-Type: application/json' \
--header 'x-api-key: <generated_valid_token>' \
--data-raw '{
  "url": "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_video_file.mp4",
  "confidenceThreshold": 0.6,
  "timezoneOffset": 0
}'
const request = require('request');
const fs = require('fs');

const accessToken = "<your_auth_token>";

const videoOption = {
  url: 'https://api.symbl.ai/v1/process/video/url/' + your_conversationId,
  headers: {
    'x-api-key': accessToken,
    'Content-Type': 'application/json'
   },
  qs: {
    webhookUrl: `https://your_webhook_url`,
  },
  body: JSON.stringify({
    "url": "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_video_file.mp4",
    "confidenceThreshold": 0.6,
    "timezoneOffset": 0
  })
};

request.put(videoOption, (err, response, body) => {
  console.log(err, body);
});

The above request returns a response structured like this:

{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}

HTTP REQUEST

PUT https://api.symbl.ai/v1/process/video/url/:conversationId

Request Headers

Header Name Required Value
x-api-key Yes your_auth_token
Content-Type Yes Accepted value is application/json

Request Body

Field Required Type Supported Values Default Description
url Yes String [] A valid url string. The URL must be a publicly accessible url.
customVocabulary No list [] Contains a list of words and phrases that provide hints to the speech recognition task.
confidenceThreshold No double 0.0 to 1.0 0.5 Minimum required confidence for the insight to be recognized.

Path Params

Parameter value
conversationId conversationId which is provided by the first request submitted using POST async video API

Query Params

Parameter Required Value
webhookUrl No Webhook URL to which job updates are sent. (This should be a POST endpoint.)

The webhookUrl will be used to send the status of the job created for the submitted video URL. Every time the status of the job changes, a notification will be sent to the webhookUrl.

Response Object

Field Description
conversationId ID to be used with Conversation API
jobId ID to be used with Job API

Webhook Payload

Field Description
jobId ID to be used with Job API
status Current status of the job. (Valid statuses - [ scheduled, in_progress, completed, failed ])

Other Languages

The Async Audio and Async Video APIs can work with languages other than English. The following languages (with their BCP-47 language codes) are currently supported:

  1. English (United States) – en-US
  2. English (United Kingdom) – en-GB
  3. English (Australia) – en-AU
  4. French (Canada) - fr-CA
  5. German (Germany) - de-DE
  6. Italian (Italy) – it-IT
  7. Dutch (Netherlands) – nl-NL
  8. Japanese (Japan) – ja-JP
  9. Spanish (United States) – es-US
  10. French (France) – fr-FR

To use one of the supported languages, use the query parameter languageCode with one of the language codes specified above.

curl --location --request POST 'https://api.symbl.ai/v1/process/video/url?languageCode=en-US&webhookUrl=<your_webhook_url>' \
--header 'Content-Type: application/json' \
--header 'x-api-key: <generated_valid_token>' \
--data-raw '{
  "url": "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_video_file.mp4",
  "confidenceThreshold": 0.6,
  "timezoneOffset": 0
}'
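
A JavaScript sketch of the same request (using the request package as in the other examples in this document; languageCode is passed as a query parameter):

// Sketch: the same Async Video URL request, passing languageCode as a query parameter.
const request = require('request');

const accessToken = '<your_auth_token>';

const videoOption = {
  url: 'https://api.symbl.ai/v1/process/video/url',
  headers: {
    'x-api-key': accessToken,
    'Content-Type': 'application/json'
  },
  qs: {
    languageCode: 'fr-CA',           // any of the supported BCP-47 codes listed above
    webhookUrl: 'https://your_webhook_url'
  },
  body: JSON.stringify({
    "url": "https://symbltestdata.s3.us-east-2.amazonaws.com/sample_video_file.mp4",
    "confidenceThreshold": 0.6,
    "timezoneOffset": 0
  })
};

request.post(videoOption, (err, response, body) => {
  console.log(err, body);
});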

If the query-parameter languageCode is not specified then en-US is used as the default language.

Currently, only the messages endpoint of the Conversation API will return the transcribed data; the insights endpoints will return an empty array.

Query Params

Parameter Required Value
languageCode No The BCP-47 language code of the language, for example fr-CA. Defaults to en-US if not specified.

Speaker Diarization

The Async Audio and Async Video APIs can detect and separate unique speakers in a single stream of audio/video without the need for separate speaker events.

To enable this capability with either of these APIs, the enableSpeakerDiarization and diarizationSpeakerCount query parameters need to be passed with the request.

The diarizationSpeakerCount should be equal to the number of unique speakers in the conversation. If the number varies then this might introduce false positives in the diarized results.

Refer to the How-To Get Speaker Separated Transcripts - Diarization with Async API to get started with this capability!

If you’re looking for similar capability in Real-Time APIs, please refer to Speaker Events and Speaker Separation in WebSocket API sections.

Query Params

Parameter Required Value
enableSpeakerDiarization Yes Whether diarization should be enabled for this conversation. Pass this as true to enable this capability.
diarizationSpeakerCount Yes The number of unique speakers in this conversation.
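
As a sketch, the two query parameters can be added to the POST Async Audio request shown earlier (the speaker count and file path are assumptions; the parameter names follow the table above):

// Sketch: enable speaker diarization on the POST Async Audio API.
// The parameter names follow the table above; the speaker count and file path are assumptions.
const request = require('request');
const fs = require('fs');

const accessToken = '<your_auth_token>';
const audioFileStream = fs.createReadStream('/file/location/audio.wav');

const audioOption = {
  url: 'https://api.symbl.ai/v1/process/audio',
  headers: {
    'x-api-key': accessToken,
    'Content-Type': 'audio/wav'
  },
  qs: {
    enableSpeakerDiarization: true,
    diarizationSpeakerCount: 2       // set this to the number of unique speakers
  },
  json: true
};

audioFileStream.pipe(request.post(audioOption, (err, response, body) => {
  console.log(err, body);
}));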

Conversation API

The Conversation API provides the REST API interface for the management and processing of your conversations

GET conversation

Returns the conversation metadata

API Endpoint

https://api.symbl.ai/v1/conversations/{conversationId}

Example API call

curl "https://api.symbl.ai/v1/conversations/{conversationId}" \
    -H "x-api-key: <api_token>"
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    console.log(body);
});

The above request returns a response structured like this:

{
    "id": "5179649407582208",
    "type": "meeting",
    "name": "Project Meeting #2",
    "startTime": "2020-02-12T11:32:08.000Z",
    "endTime": "2020-02-12T11:37:31.134Z",
    "members": [
        {
            "name": "John",
            "email": "John@example.com"
        },
        {
            "name": "Mary",
            "email": "Mary@example.com"
        },
        {
            "name": "Roger",
            "email": "Roger@example.com"
        }
    ]
}

HTTP REQUEST

GET https://api.symbl.ai/v1/conversations/{conversationId}

Response Object

Field Description
id unique conversation identifier
type conversation type. default is meeting
name name of the conversation
startTime DateTime value
endTime DateTime value
members list of member objects containing name and email if detected

GET transcription/messages in a conversation

Returns a list of all the messages in a conversation

API Endpoint

https://api.symbl.ai/v1/conversations/{conversationId}/messages

Example API call

curl "https://api.symbl.ai/v1/conversations/{conversationId}/messages" \
    -H "x-api-key: <api_token>"
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/messages',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    console.log(body);
});

The above request returns a response structured like this:

{
    "messages": [
          {
             "id": "5920275652673536",
             "text": "Yeah, but I mean all I want to know is what's the best packge do you have for me?",
             "from": {
               "name": "John",
               "email": "John@example.com"
             },
             "startTime": "2020-07-10T11:16:04.824Z",
             "endTime": "2020-07-10T11:16:09.124Z",
             "conversationId": "6749556955938816"
         },
         {
             "id": "6412283618000896",
             "text": "Best package for you is $69.99 per month at the moment.",
             "from": {
                 "name": "Roger",
                 "email": "Roger@example.com"
             },
             "startTime": "2020-07-10T11:16:21.024Z",
             "endTime": "2020-07-10T11:16:26.724Z",
             "conversationId": "6749556955938816"
         },
         {
             "id": "5661493169225728",
             "text": "Okay, Where is the file?",
             "from": {
                 "name": "John",
                 "email": "John@example.com"
             },
             "startTime": "2020-08-18T11:11:14.536Z",
             "endTime": "2020-08-18T11:11:18.536Z",
             "conversationId": "5139780136337408",
             "words": [
                 {
                     "word": "Okay,",
                     "startTime": "2020-08-18T11:11:14.536Z",
                     "endTime": "2020-08-18T11:11:14.936Z"
                 },
                 {
                     "word": "Where",
                     "startTime": "2020-08-18T11:11:14.936Z",
                     "endTime": "2020-08-18T11:11:15.436Z"
                 },
                 {
                     "word": "is",
                     "startTime": "2020-08-18T11:11:16.236Z",
                     "endTime": "2020-08-18T11:11:16.536Z"
                 },
                 {
                     "word": "the",
                     "startTime": "2020-08-18T11:11:16.536Z",
                     "endTime": "2020-08-18T11:11:16.936Z"
                 },
                 {
                     "word": "file?",
                     "startTime": "2020-08-18T11:11:16.936Z",
                     "endTime": "2020-08-18T11:11:17.236Z"
                 }
              ]
         }
    ]
}

HTTP REQUEST

GET https://api.symbl.ai/v1/conversations/{conversationId}/messages

Response Object

Field Description
id unique message identifier
text message text
from user object with name and email
startTime DateTime value
endTime DateTime value
conversationId unique conversation identifier
words words object with word, startTime, endTime and speakerTag.

Query Params

Parameter Required Value Description
verbose No true Gives you word level timestamps of each sentence.
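
For example, a small sketch passing verbose=true with the same request style used above:

// Sketch: request word-level timestamps by passing verbose=true as a query parameter.
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/messages',
    headers: { 'x-api-key': your_auth_token },
    qs: { verbose: true },
    json: true
}, (err, response, body) => {
    console.log(body);
});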

GET all members in a conversation

Returns a list of all the members in a conversation. A member is a participant in the conversation who is uniquely identified as a speaker. Identifying different participants in meetings can be done by implementing speaker separation.

For more details on identifying members by Speaker Events or Active Talker events in Real-time using Voice SDK - here.

For more details on identifying members by independent audio stream integration using Websocket - here.

API Endpoint

https://api.symbl.ai/v1/conversations/{conversationId}/members

Example API call

curl "https://api.symbl.ai/v1/conversations/{conversationId}/members" \
    -H "x-api-key: <api_token>"
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/members',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    console.log(body);
});

The above request returns a response structured like this:

{
    "members": [
        {
            "id": "fc9b35cd-361f-41c6-9029-0944d21c7150",
            "name": "John",
            "email": "John@example.com"
        },
        {
            "id": "382362a2-eeec-46a3-8891-d50508293851",
            "name": "Mary",
            "email": "Mary@example.com"
        },
        {
            "id": "b7de3a33-a16c-4926-9d4d-a904c88271c2",
            "name": "Roger",
            "email": "Roger@example.com"
        }
    ]
}

HTTP REQUEST

GET https://api.symbl.ai/v1/conversations/{conversationId}/members

Response Object

Field Description
id member's unique identifier
name member's name
email member's email

PUT update a member in the conversation

Update an existing member in a conversation. This API can be used for updating the unique speakers detected as members from diarization as well.

To diarize/separate speakers in a single audio/video stream refer to the How-To Get Speaker Separated Transcripts - Diarization with Async API

For more details on identifying members by Speaker Events or Active Talker events in Real-time using Voice SDK - here.

For more details on identifying members by independent audio stream integration using Websocket - here.

API Endpoint

https://api.symbl.ai/v1/conversations/{conversationId}/members/{id}

Example API call

curl --location --request PUT 'https://api.symbl.ai/v1/conversations/{conversationId}/members/fc9b35cd-361f-41c6-9029-0944d21c7150' --header 'Content-Type: application/json' --header 'x-api-key: <valid-generated-token>' --data-raw '{
    "id": "fc9b35cd-361f-41c6-9029-0944d21c7150",
    "email": "john@example.com",
    "name": "John"
}'
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.put({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/members/fc9b35cd-361f-41c6-9029-0944d21c7150',
    headers: { 'x-api-key': your_auth_token },
    body: {
        id: 'fc9b35cd-361f-41c6-9029-0944d21c7150',
        name: 'John',
        email: 'john@example.com'
    },
    json: true
}, (err, response, body) => {
    console.log(body);
});

The above request returns a response structured like this:

{
    "message": "Member with id: fc9b35cd-361f-41c6-9029-0944d21c7150 for conversationId: <conversationId> updated successfully! The update should be reflected in all messages and insights along with this conversation"
}

HTTP REQUEST

PUT https://api.symbl.ai/v1/conversations/{conversationId}/members/{id}

Request Body

Field Required Type Supported Values Default Description
id Yes string The unique identifier of the member for this conversation. This can be retrieved from the members endpoint.
name Yes string The name of the member.
email No string The email-id of the member. If specified this can be used to correctly identify and merge the existing user in case the conversation is appended with a new diarized conversation which has one or more same speakers as the conversation it's being appended to.

Response Object

Field Description
message A description of the update. This message indicates that the member details have now been updated across the conversation for all the messages and insights. You can also get the updated member from the members endpoint

GET insights from a conversation

Returns all the insights in a conversation including Topics, Questions and Action Items

API Endpoint

https://api.symbl.ai/v1/conversations/{conversationId}/insights

Example API call

curl "https://api.symbl.ai/v1/conversations/{conversationId}/insights" \
    -H "x-api-key: <api_token>"
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/insights',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    console.log(body);
});

The above request returns a response structured like this:

{
    "insights": [
         {
             "id": "5802630861291520",
             "text": "We need to meet tomorrow for renewing your Insurance policy.",
             "type": "action_item",
             "score": 0.9999992500008438,
             "messageIds": [
                 "5603838367105024"
             ],
             "entities": [
                 {
                     "type": "date",
                     "text": "tomorrow",
                     "offset": 16,
                     "value": "2020-07-11"
                 }
             ],
             "phrases": [
                 {
                     "text": "We need to meet",
                     "type": "action_phrase"
                 },
                 {
                     "text": "renewing your Insurance policy",
                     "type": "action_phrase"
                 }
             ],
             "from": {
                 "name": "Salesman",
                 "userId": "sales@email.com"
             },
             "assignee": {
                 "name": "Salesman",
                 "email": "sales@email.com"
             },
             "dueBy": "2020-07-11T07:00:00.000Z"
        },
        {
           "id": "6687421370466304",
           "text": "Mark will handle the process of filling up the forms, call Customer after it's completed.",
           "type": "action_item",
           "score": 1,
           "messageIds": [
               "6342943141003264"
           ],
           "entities": [
               {
                   "type": "person",
                   "text": "Mark",
                   "offset": 0,
                   "value": {
                       "assignee": true,
                       "name": "Mark"
                   }
               }
           ],
           "phrases": [
               {
                   "text": "handle the process of filling up the forms",
                   "type": "action_phrase"
               }
           ],
           "from": {
               "name": "Customer",
               "userId": "customer@email.com"
           },
           "assignee": {
               "name": "Mark"
           }
       },
       {
            "id": "5642466493464576",
            "text": "I think what is the Bahamas?",
            "type": "question",
            "score": 0.9119608386876195,
            "messageIds": [
                "5114878444437504"
            ],
            "entities": [],
            "phrases": []
      },
      {
            "id": "4504448541917184",
            "text": "We need to have a call with David after this.",
            "type": "follow_up",
            "score": 0.9999100121510935,
            "messageIds": [
                "4696021397405696"
            ],
            "entities": [
                {
                    "type": "person",
                    "text": "David",
                    "offset": 34,
                    "value": {
                        "name": "David"
                    }
                }
            ],
            "phrases": [],
            "from": {
                "name": "Customer",
                "userId": "customer@email.com"
            },
            "assignee": {
                "name": "Customer",
                "email": "customer@email.com"
            }
        }
    ]
}

HTTP REQUEST

GET https://api.symbl.ai/v1/conversations/{conversationId}/insights

Response Object

Field Description
id unique identifier of the insight
text text of the insight
type type of insight. values could be [question, action_item, follow_up]
score confidence score of the generated insight. value from 0 - 1
messageIds unique message identifiers of the corresponding messages
entities list of detected entities in the insight
assignee if an action item is generated, this field contains the name and email of the person assigned to it
phrases list of detected phrases with type - phrase type and text - corresponding text. The action_phrase type represents the actionable part of an insight
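Because the insights array mixes the different insight types, a common client-side step is to group them by type before rendering. A minimal sketch, reusing the request-based call shown above:

const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/insights',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    if (err) throw err;

    // Group insights by type: action_item, question, follow_up
    const byType = {};
    for (const insight of body.insights) {
        (byType[insight.type] = byType[insight.type] || []).push(insight.text);
    }
    console.log(byType);
});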

GET topics from a conversation

Returns all the topics generated from a conversation

API Endpoint

https://api.symbl.ai/v1/conversations/{conversationId}/topics

Example API call

curl "https://api.symbl.ai/v1/{conversationId}/topics" \
    -H "x-api-key: <api_token>"
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/topics',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    console.log(body);
});

The above request returns a response structured like this:

{
    "topics": [
        {
            "id": "5179649407582208",
            "text": "speakers",
            "type": "topics",
            "score": 0.9730208796076476,
            "messageIds": [
                "e16d5c97-93ff-4ebf-aff7-8c6bba54747c"
            ],
            "entities": [
                {
                    "type": "rootWord",
                    "text": "speakers"
                }
            ]
        }
    ]
}

HTTP REQUEST

GET https://api.symbl.ai/v1/conversations/{conversationId}/topics

Response Object

Field Description
id unique identifier of the topic
text text of the topic
type response type. default is topics
score confidence score of the generated topic. value from 0 - 1
messageIds unique message identifiers of the corresponding messages
entities list of detected entity objects in the insight with type - entity type and text - corresponding text
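Since each topic carries a confidence score between 0 and 1, you can trim the response to high-confidence topics on the client side. A minimal sketch; the 0.8 threshold is an arbitrary example, not a recommended value:

const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/topics',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    if (err) throw err;

    // Keep only topics above an example confidence threshold
    const confidentTopics = body.topics
        .filter(topic => topic.score >= 0.8)
        .map(topic => topic.text);
    console.log(confidentTopics);
});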

GET questions from a conversation

Returns all the questions generated from the conversation

API Endpoint

https://api.symbl.ai/v1/conversations/{conversationId}/questions

Example API call

curl "https://api.symbl.ai/v1/conversations/{conversationId}/questions" \
    -H "x-api-key: <api_token>"
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/questions',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    console.log(body);
});

The above request returns a response structured like this:

{
    "questions": [
        {
            "id": "5179649407582208",
            "text": "Push them for the two weeks delivery, right?",
            "type": "question",
            "score": 0.9730208796076476,
            "messageIds": [
                "5019269922291712"
            ],
            "entities": []
        },
        {
            "id": "5642466493464576",
            "text": "I think what is the Bahamas?",
            "type": "question",
            "score": 0.9119608386876195,
            "messageIds": [
                "5019269922291712"
            ],
            "entities": []
        },
        {
            "id": "5756718797553664",
            "text": "Okay need be detained, or we can go there in person and support them?",
            "type": "question",
            "score": 0.893303149769215,
            "messageIds": [
                "5019269922291712"
            ],
            "entities": []
        },
        {
            "id": "6235991715086336",
            "text": "Why is that holiday in US from 17?",
            "type": "question",
            "score": 0.9998053310511206,
            "messageIds": [
                "5019269922291712"
            ],
            "entities": []
        }
    ]
}

HTTP REQUEST

GET https://api.symbl.ai/v1/conversations/{conversationId}/questions

Response Object

Field Description
id unique identifier of the question
text text of the question
type response type. default is question
score confidence score of the generated question. value from 0 - 1
messageIds unique message identifiers of the corresponding messages
entities list of detected entity objects in the insight with type - entity type and text - corresponding text

GET action items from a conversation

Returns a list of all the action items generated from the conversation

API Endpoint

https://api.symbl.ai/v1/conversations/{conversationId}/action-items

Example API call

curl "https://api.symbl.ai/v1/conversations/{conversationId}/action-items" \
    -H "x-api-key: <api_token>"
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/action-items',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    console.log(body);
});

The above request returns a response structured like this:

{
    "actionItems": [
        {
            "id": "5633940379402240",
            "text": "Mary thinks we need to go ahead with the TV in Bangalore.",
            "type": "action_item",
            "score": 0.8659442937321238,
            "messageIds": [
                "4972726972317696"
            ],
            "phrases": [
                {
                    "type": "action_phrase",
                    "text": "create small file of like mp44 testing purpose"
                }
            ],
            "definitive": false,
            "entities": [],
            "assignee": {
                "name": "Mary",
                "email": "Mary@example.com"
            }
        },
        {
            "id": "5668855401676800",
            "text": "Call and Stephanie also brought up something to check against what Ison is given as so there's one more test that we want to do.",
            "type": "action_item",
            "score": 0.8660254037845785,
            "messageIds": [
                "6531517035577342"
            ],
            "phrases": [],
            "definitive": true,
            "entities": [],
            "assignee": {
                "name": "John",
                "email": "John@example.com"
            }
        },
        {
            "id": "5690029162627072",
            "text": "Checking the nodes with Eisner to make sure we covered everything so that will be x.",
            "type": "action_item",
            "score": 0.8657734634985154,
            "messageIds": [
                "6531517035577244"
            ],
            "phrases": [
                {
                    "type": "action_phrase",
                    "text": "Checking the nodes with Eisner to make sure we covered everything"
                }
            ],
            "definitive": true,
            "entities": [],
            "assignee": {
                "name": "John",
                "email": "John@example.com"
            }
        },
        {
            "id": "5707174000984064",
            "text": "Roger is going to work with the TV lab and make sure that test is also included, so we are checking to make sure not only with our complaints.",
            "type": "action_item",
            "score": 0.9999962500210938,
            "messageIds": [
                "6531517035527344"
            ],
            "phrases": [
                {
                    "type":"action_phrase",
                    "text":"Roger is going to work with the TV lab"
                }
            ],
            "definitive": true,
            "entities": [],
            "assignee": {
                "name": "Roger",
                "email": "Roger@example.com"
            }
        },
        {
            "id": "5757280188366848",
            "text": "Mary thinks it really needs to kick start this week which means the call with UV team and our us team needs to happen the next couple of days.",
            "type": "action_item",
            "score": 0.9999992500008438,
            "messageIds": [
                "6521517035577344"
            ],
            "phrases": [],
            "definitive": false,
            "entities": [],
            "assignee": {
                "name": "Mary",
                "email": "Mary@example.com"
            },
            "dueBy": "2020-02-10T07:00:00.000Z"
        }
    ]
}

HTTP REQUEST

GET https://api.symbl.ai/v1/conversations/{conversationId}/action-items

Response Object

Field Description
id unique identifier of the action item
text text of the action item
type response type. default is action_item
score confidence score of the generated action item. value from 0 - 1
messageIds unique message identifiers of the corresponding messages
entities list of detected entity objects in the insight with type - entity type and text - corresponding text
definitive Boolean indicating if the action-item is definitive or not. More about definitive here.
phrases list of detected phrases with type - phrase type and text - corresponding text. The action_phrase type represents the actionable part of an insight
assignee this field contains the name and email of the person assigned to the action item
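For example, the definitive flag and the optional dueBy field can be combined to pick out committed action items that already have a deadline, ready to push into a task tracker. A minimal sketch, assuming the response shape shown above:

const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/action-items',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    if (err) throw err;

    // Definitive action items that already have a due date
    const scheduled = body.actionItems
        .filter(item => item.definitive && item.dueBy)
        .map(item => ({
            text: item.text,
            assignee: item.assignee && item.assignee.name,
            dueBy: item.dueBy
        }));
    console.log(scheduled);
});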

GET follow ups from a conversation

Returns a list of all the follow ups generated from the conversation

API Endpoint

https://api.symbl.ai/v1/conversations/{conversationId}/follow-ups

Example API call

curl "https://api.symbl.ai/v1/conversations/{conversationId}/follow-ups" \
    -H "x-api-key: <api_token>"
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/follow-ups',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    console.log(body);
});

The above request returns a response structured like this:

{
    "followUps": [
        {
            "id": "4526427164639111",
            "text": "We need to have the meeting today, and we're going to talk about how to run a product strategy Workshop is by Richard Holmes.",
            "type": "follow_up",
            "score": 0.8660254037851491,
            "messageIds": [
                "4675554024357888"
            ],
            "entities": [
                {
                    "type": "date",
                    "text": "today",
                    "offset": 28,
                    "value": "2020-06-22"
                },
                {
                    "type": "person",
                    "text": "Richard Holmes",
                    "offset": 110,
                    "value": {
                        "name": "Richard Holmes"
                    }
                }
            ],
            "phrases": [
                {
                    "text": "need to have the meeting today",
                    "type": "action_phrase"
                },
                {
                    "text": "talk about how to run a product strategy Workshop is by Richard Holmes",
                    "type": "action_phrase"
                }
            ],
            "from": {},
            "assignee": {},
            "dueBy": "2020-06-22T07:00:00.000Z"
        }
    ]
}

HTTP REQUEST

GET https://api.symbl.ai/v1/conversations/{conversationId}/follow-ups

Response Object

Field Description
id unique identifier of the follow up
text text of the follow up
type response type. default is follow_up
score confidence score of the generated follow up. value from 0 - 1
messageIds unique message identifiers of the corresponding messages
entities list of detected entity objects in the insight with type - entity type and text - corresponding text
from user object with name and email
assignee this field contains the name and email of the person assigned to the follow up
phrases list of detected phrases with type - phrase type and text - corresponding text. The action_phrase type represents the actionable part of an insight
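Follow ups typically carry person and date entities along with a dueBy timestamp, which makes them convenient to turn into reminders. A minimal sketch that extracts the people mentioned and the due date for each follow up, using the field names from the response above:

const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/follow-ups',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    if (err) throw err;

    const reminders = body.followUps.map(followUp => ({
        text: followUp.text,
        dueBy: followUp.dueBy,
        // Person entities detected inside the follow up text
        people: followUp.entities
            .filter(entity => entity.type === 'person')
            .map(entity => entity.text)
    }));
    console.log(reminders);
});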

Job API

The Job Status API is used to retrieve the status of an ongoing Async API request. Use the jobId received in the successful response of the Async API.

GET Job Status

Returns the status of the ongoing Async job request

API Endpoint

https://api.symbl.ai/v1/job/{jobId}

Example API call

curl --location --request GET 'https://api.symbl.ai/v1/job/{jobId}' \
--header 'Content-Type: application/json' \
--header 'x-api-key: <generated_valid_token>'
const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/job/{jobId}',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
  console.log(body);
});

The above request returns a response structured like this:

{
  "id": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d",
  "status": "in_progress"
}

HTTP REQUEST

GET https://api.symbl.ai/v1/job/{jobId}

Response Parameters

Parameter Description
id The ID of the Job
status Current status of the job. One of scheduled, in_progress, completed or failed
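A typical pattern is to poll the Job Status API until the status becomes completed (or failed) before requesting conversation data. A minimal polling sketch; the 10-second interval is an arbitrary example, not a recommended value:

const request = require('request');
const your_auth_token = '<your_auth_token>';
const jobId = '<jobId_from_async_api_response>';

const pollJobStatus = () => {
    request.get({
        url: `https://api.symbl.ai/v1/job/${jobId}`,
        headers: { 'x-api-key': your_auth_token },
        json: true
    }, (err, response, body) => {
        if (err) throw err;

        if (body.status === 'completed') {
            console.log('Job finished, conversation data is ready.');
        } else if (body.status === 'failed') {
            console.error('Job failed.');
        } else {
            // scheduled or in_progress: check again after an arbitrary 10 seconds
            setTimeout(pollJobStatus, 10000);
        }
    });
};

pollJobStatus();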

Testing

Now that you've built out an integration using either the Voice SDK or Voice API, let's test to make sure your integration is working as expected.

1. If you are dialed in with your phone number, try speaking the following sentences to see the generated output

2. If you are dialed into a meeting, try running any of the following videos with your meeting platform open and view the summary email that gets generated:

3. Try tuning your summary page with query parameters to customize your output.

Errors

// example auth token is incorrect
{
    "message": "Token validation failed for provided token."
}

Symbl uses the following HTTP codes:

Error Code Meaning
200 OK -- Success.
201 Accepted -- Your request is successfully accepted.
400 Bad Request -- Your request is invalid.
401 Unauthorized -- Your API key is invalid.
403 Forbidden -- You do not have permission to access the requested resource.
404 Not Found -- The specified resource does not exist.
405 Method Not Allowed -- You tried to access an api with an invalid method.
413 Request Entity Too Large -- The request payload is too large.
429 Too Many Requests -- Too many requests hit the API too quickly.
500 Internal Server Error -- We had a problem with our server. Try again later.
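Error responses include the HTTP status code along with a message field (as in the example above), so client code can branch on the status code. A minimal sketch in the request-based style used elsewhere in this document:

const request = require('request');
const your_auth_token = '<your_auth_token>';

request.get({
    url: 'https://api.symbl.ai/v1/conversations/{conversationId}/insights',
    headers: { 'x-api-key': your_auth_token },
    json: true
}, (err, response, body) => {
    if (err) throw err;

    if (response.statusCode === 401) {
        // e.g. "Token validation failed for provided token."
        console.error('Authentication failed:', body.message);
    } else if (response.statusCode === 429) {
        console.error('Rate limited, retry later:', body.message);
    } else if (response.statusCode >= 400) {
        console.error('Request failed:', response.statusCode, body.message);
    } else {
        console.log(body);
    }
});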

Resources

Async Audio Conversion

The snippet below shows how you can convert a file from .mp4 to .mp3 using the fluent-ffmpeg node module.

const ffmpeg = require('fluent-ffmpeg');

ffmpeg('/my/path/to/original/file.mp4')
    .format('mp3')
    .on('end', () => {
        console.log('end');
    })
    .on('error', (err) => {
        // Log the conversion error so failures are visible
        console.error('Conversion failed: ', err);
    })
    .save('/path/to/output/file.mp3');

The Async Audio API supports .wav and .mp3 files, and the audio must be mono-channel only. Any other file format can be converted using the FFmpeg-based code snippet above.

$ npm install --save fluent-ffmpeg
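Since the Async Audio API also expects mono-channel audio, the same fluent-ffmpeg pipeline can downmix to a single channel during conversion. A minimal sketch; audioChannels(1) is a standard fluent-ffmpeg option, but verify the output against your source file:

const ffmpeg = require('fluent-ffmpeg');

ffmpeg('/my/path/to/original/file.mp4')
    .format('mp3')
    .audioChannels(1) // downmix to mono, as required by the Async Audio API
    .on('end', () => {
        console.log('Conversion finished');
    })
    .on('error', (err) => {
        console.error('Conversion failed: ', err);
    })
    .save('/path/to/output/file.mp3');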

Automatic Speech Recognition (ASR) Evaluation

This is a simple utility to perform a quick evaluation on the results generated by any Speech to text (STT) or Automatic Speech Recognition (ASR) System.

This utility can calculate the following metrics:

Installation

$ npm install -g speech-recognition-evaluation

Usage

The simplest way to run your first evaluation is to pass the original and generated options to the asr-eval command, where original is a plain-text file containing the reference transcript (usually produced by a human) and generated is a plain-text file containing the transcript produced by the STT/ASR system.

$ asr-eval --original ./original-file.txt --generated ./generated-file.txt


For more information, please see the speech-recognition-evaluation package.

Media Converter

You can quickly transcode an audio file using transcodeMediaFile method.

const {transcodeMediaFile} = require('symbl-media');
(async () => {
    try {
        const result = await transcodeMediaFile('./my-input-file.wav', 'my-output-file.mp3', 'mp3');
        console.log('Successfully transcoded to: ', result.outPath);
    } catch (e) {
        console.error(e);
    }
})();

Currently this utility only supports one feature:

This utility can be used as a library in your NodeJS code. You can simply install it in your local project.

$ npm install symbl-media --save


Use the transcode command to transcode the file.

$ media transcode -i ./my-input-file.wav -o ./my-output-file.mp3 -f mp3


For more information, please see the symbl-media package.

Overview

Symbl is an API platform for developers and businesses to rapidly deploy conversational intelligence at scale – on any channel of communication. Our comprehensive suite of APIs unlock proprietary machine learning algorithms that can ingest any form of conversation data to identify actionable insights across domains, timelines, and channels (voice, email, chat, social) contextually – without the need for any upfront training data, wake words, or custom classifiers.

What is conversational intelligence?

In its pure form, conversational intelligence refers to the ability to communicate in ways that create a shared concept of reality. It begins with trust and transparency to remove biases in decisions, enable participants, such as knowledge workers, to be more effective at their core function, eradicate mundane and repetitive parts of their work, and empower participants both at work and beyond.

Here at Symbl.ai, we are using hybrid approaches of machine learning and deep learning to augment human capability by analyzing conversations and surfacing the knowledge and actions that matter.

How is Symbl.ai different from chatbot platforms?

In short: chatbots are intent-based, rule-based, often launched by ‘wake words’, and enable short conversations between humans and machines.

Symbl.ai is a developer platform and service capable of understanding context and meaning in natural conversations between humans. It can surface the things that matter in real-time, e.g. questions, action items, insights, contextual topics, signals, etc.

Additionally:

Chatbots or virtual assistants are commonly command-driven and often referred to as conversation AI systems. They add value to direct human-machine interaction via auditory or textual methods, and attempt to convincingly simulate how a human would behave in a conversation.

You can build chatbots by using existing intent-based systems like RASA, DialogFlow, Watson, Lex, etc. These systems identify intent based on the training data you provide and enable you to create rule-based conversation workflows between humans and machines.

We are building a platform that can contextually analyze natural conversations between two or more humans based on the meaning as opposed to keywords or wake words. We are also building it using models that require no training, so you can analyze conversations on both audio or text channels to get recommendations of outcomes without needing to train a custom engine for every new intent.

Next: explore supported use cases

Use Cases

Using Symbl, you can build use cases for support, sales, and collaboration apps, as well as workflow automation across single or multiple conversations, to identify real-time growth opportunities, create indexed knowledge from conversations, and drive productivity.

Meetings & UCaaS

Applying primarily to unified communication and collaboration platforms (UCaaS), you can add real-time recommendations of action items and next steps as part of your existing workflow. This would meaningfully improve meeting productivity by surfacing the things that matter, as the meeting occurs. Beyond real-time action items, take advantage of automated meetings summaries delivered to your preferred channel, like email, chat, Slack, calendar, etc.

Use real-time contextual recommendations to enable participants to drive efficiencies in their note-taking, save time and focus more on the meeting itself. Action items are surfaced contextually and in real-time and can be automated to trigger your existing workflows.

Post-meeting summaries are helpful for users that like to get more involved in the conversation as it happens, and prefer re-visiting information and action items post-meeting.

Benefits:

Customer Care & CCaaS

As we understand it, customer care performance can be measured by 3 proxy metrics: customer satisfaction, time spent on call, and the number of calls serviced.

What if the introduction of a real-time, passive conversation intelligence service into each call were to improve all 3 metrics at once? Real-time contextual understanding leads to suggested actions that a customer care agent can act upon during the call, enabling the agent to:

  1. Focus on the human connection with the customer.
  2. Come to a swifter resolution thanks to task automation.
  3. Serve more customers with an elevated experience during a shift.

Further, the Symbl.ai platform is also capable of automating post-call data collection. This enables analysis of support conversations over time, agents, shifts, and groups, which leads to a better understanding of pain points, topics of customer support conversations, etc.

Benefits: Support Organization

Sales Enablement & CRM

Digital communication platforms used for sales engagements and customer interactions need to capture conversational data for benchmarking performance, improving net sales, and identifying and replicating the best-performing sales scripts.

Use Symbl.ai to identify top-performing pitches by leveraging real-time insights. Accelerate the sales cycle by automating suggested action items in real-time, such as scheduling tasks and follow-ups via outbound work tool integrations. Keep your CRM up to date by automating the post-call entry with useful summaries.

Benefits: Sales Agent

Benefits: Sales Enablement / VP of Sales

Social Media Conversations

Customers interact heavily with brands on social media and other digital channels. These interactions include feedback, reviews, complaints, and many other mentions. This is valuable data which, used properly, can be turned into insights for the business.

Symbl's APIs can be used along with social listening tools to extract and categorize all of this into actionable insights. For example, topics can be very helpful in abstracting data from product reviews, threads of conversation, and social media comments. Questions and requests from social interactions and forums can be identified to build a knowledge base and direct customer conversations to the right resources.

With the right integrations to CRM tools and knowledge bases, insights from social conversations can lead to a better understanding of customer sentiment toward the brand and more efficient customer service on social channels.

Benefits for Brands

Next: Learn more about the capabilities of the platform

Capabilities

Transcript

The platform provides a searchable transcript with word level timestamps and speaker information. The transcript is a refined output of the speech-to-text conversion.

The transcript is one of the easiest ways to navigate through the entire conversation. It can be sorted using speaker-specific or topic-specific filters. Additionally, each insight or action item can also lead to related parts of the transcript.
Transcripts can be generated in both a real-time and an asynchronous manner for voice and video conversations. They can also be accessed through the pre-built post-conversation summary UI.

The pre-built UI enables editing, copying and sharing of transcripts from the conversation and can be enhanced to support the needs of the desired user experience.

Summary Topics

Summary topics provide a quick overview of the key things that were talked about in the conversation.

Action Items

An action item is a specific outcome recognized in the conversation that requires one or more people in the conversation to take a specific action, e.g. set up a meeting, share a file, complete a task, etc.

Action Item Features

Tasks

Definitive action items that are not follow-ups are categorized as tasks.

Example:
"I will complete the presentation that needs to be presented to the management by the end of today". Here, a person is really committed to completing the presentation (task) by the end of today.

Follow Ups

The platform can recognize whether an action item has a follow-up connotation, i.e. it requires following up in general or by someone in particular.

Examples:

Follow-ups can also be non-definitive

Example:

Type Follow Up Non-Follow Up
Non-Definitive Follow Up (no defined date) Idea/Opinion
Definitive Follow Up (defined date) Task

Other Insight Types

Questions

Any explicit question or request for information that comes up during the conversation, whether answered or not, is recognized as a question.

Example:

Suggestive Actions

For each of the action items identified from the conversation, certain suggestive actions are recommended based on available worktool integrations. Action phrases within the action items can also be used to map to specific actions or trigger workflows based on custom business rules.

Example:

Outbound Work Tool Integrations

The platform currently offers email as an out-of-the-box integration with the NodeJS SDK and calendar integration on the pre-built post-conversation summary UI. However, this can be extended to any work tool using extensible webhooks, where the actionable insights need to be pushed to enhance productivity and reduce the time users spend manually entering information from conversations. The same integrations can be enabled as suggestive actions to make this even quicker.

Some examples of work tools that can be integrated using the extensible webhooks are:

Reusable and Customizable UI Components

The UI components can be broadly divided into two areas:
1. Symbl JS Elements
2. Prebuilt Summary UI

Symbl JS Elements

Symbl JS elements help developers embed customizable JS elements for transcription, insights and action items for both real-time and post-conversation experiences. These are customizable, embeddable components that can be used to simplify the process of building the experience with the desired branding, as applicable.

The Symbl JS elements will be released in preview shortly; please send an email to devrelations@symbl.ai to get early access to the Symbl JS elements for:

Prebuilt Summary UI

Symbl provides a few prebuilt summary UIs that can be used to present an understanding of the conversation after it has been processed. The pre-built summary UI is available as a URL that can be shared via email with all (or selected) participants or embedded as a link within the conversation history of your application.

The prebuilt summary UI includes the following capabilities: