Streaming API

The Symbl.ai Streaming API is designed for real-time use cases, such as providing conversation intelligence for live audio streamed from a web application. It uses the WebSocket protocol to provide interactive, two-way communication with the Symbl.ai servers.

The Streaming API is ideal for situations where real-time conversation is occurring and low-latency results are required. By leveraging the WebSocket protocol, there is no need to poll the server for updates — events are streamed directly to your client as Symbl.ai processes the real-time conversation.

This reference document describes the JSON message formats and responses for the Streaming API. For more general information and to get started with the Streaming API, see Streaming API.


Authentication

To authenticate with Symbl.ai, the Streaming API requires an access token. The access token is a JSON Web Token (JWT) that you generate using your Symbl.ai app ID and secret, which are available on the Symbl.ai platform.

For more information about obtaining an access token, see Authenticate.

For reference information about the request to get a token, see Generate token.


Connection IDs

The connection ID is an arbitrary unique value used to identify the active connection to the Streaming API. Use the connection ID to establish a new conversation, or to join or subscribe to an ongoing conversation if the connection ID already exists.

To connect to the Streaming API, you must provide a connection ID in the WebSocket URL. The following list includes examples of valid values you can use as connection IDs:

  • UUIDs: 63617458-40df-11ed-b878-0242ac120002, 29fd7978-7104-4533-a294-055828cb861f
  • Base64 strings: ZXhhbXBsZXN0cmluZw==, -wAmKVF9z8gEKXuVl28dOQ
  • Hex strings: f3046f4c099e7dbb9175d598a7a34bcd, d6706e655d90faa43a3f94fe5dea03b9

After a conversation is established, use the connection ID to connect additional users to the same real-time conversation. The connection ID is also required to use the Subscribe API.


Connect to the Streaming API

To connect to the Streaming API, open a WebSocket connection to the Symbl.ai servers. You must provide an access token when you open the WebSocket connection. Opening the WebSocket connection does not start capturing your real-time conversation; that step is performed later by sending a start_request message.

Streaming API URL

Use the following URL when you open a WebSocket connection to the Streaming API:

wss://api.symbl.ai/v1/streaming/{CONNECTION_ID}?access_token={ACCESS_TOKEN}

Where:

  • ACCESS_TOKEN is an access token that you generate with your app ID and secret.
  • CONNECTION_ID is an arbitrary value that you provide.
    • If the connection ID does not correspond to any existing Streaming API connection for your account, a new WebSocket connection is established. After a start request is sent over the connection, a unique conversation ID is also provided.
    • If the connection ID does correspond to an existing Streaming API connection for your account, you are connected to the existing WebSocket connection as a new speaker. If you want to connect to an existing conversation to receive events only, do not connect as a speaker; use the Subscribe API instead.

Refer to the WebSocket specification (RFC 6455) for a technical explanation of the WebSocket handshake process.
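As a sketch of this step, the following Python snippet builds the WebSocket URL from a connection ID and access token and opens the connection. It uses the third-party websockets library as an illustrative client; the helper names and token values are placeholders, not part of the Symbl.ai SDK.

```python
import json
import uuid

SYMBL_STREAMING_ENDPOINT = "wss://api.symbl.ai/v1/streaming"

def build_streaming_url(connection_id: str, access_token: str) -> str:
    """Build the Streaming API WebSocket URL from a connection ID and access token."""
    return f"{SYMBL_STREAMING_ENDPOINT}/{connection_id}?access_token={access_token}"

async def connect(access_token: str):
    import websockets  # third-party: pip install websockets

    # Any arbitrary unique value works as a connection ID; a UUID is convenient.
    connection_id = str(uuid.uuid4())
    async with websockets.connect(build_streaming_url(connection_id, access_token)) as ws:
        # Opening the socket does not start the conversation; a start_request
        # message (described in the message reference below) must be sent first.
        await ws.send(json.dumps({"type": "start_request"}))
        print(await ws.recv())  # first response from the Streaming API
```

Run with `asyncio.run(connect("<ACCESS_TOKEN>"))` once you have generated a token.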


Message reference

The WebSocket protocol supports sending text messages in JSON format and binary messages for communicating various data. The Streaming API utilizes text messages to deliver conversation intelligence and transcription, and binary messages to receive and process audio. This section describes the following:

  • The formats and responses of the JSON text messages that can be exchanged with the Streaming API.
  • The binary messages that can be sent to the Symbl.ai servers in order to stream audio from real-time conversations.

The Streaming API accepts the following messages:

start_request

Use the start_request message to start or join a conversation. The message includes configuration information for the conversation and speaker, such as the name of the meeting and the encoding for the speaker's audio stream.

{
  "type": "start_request",
  "config": {
    "confidenceThreshold": number,
    "detectEntities": boolean,
    "languageCode": string,
    "meetingTitle": string,
    "sentiment": boolean,
    "trackers": {
      "enableAllTrackers": boolean,
      "interimResults": boolean
    },
    "speechRecognition": {
      "encoding": string,
      "sampleRateHertz": number
    }
  },
  "actions": array,
  "customVocabulary": array,
  "disconnectOnStopRequest": boolean,
  "disconnectOnStopRequestTimeout": number,
  "enableAllTrackers": boolean,
  "insightTypes": array,
  "noConnectionTimeout": number,
  "speaker": {
    "name": string,
    "userId": string
  }
}

The following list describes the message fields.

type: Required string.

The value must be start_request.

config: Optional object. If not included, the default values for the object fields are used.

Stores fields that are used to configure the conversation. The config object includes the following fields:

  • confidenceThreshold
  • detectEntities
  • languageCode
  • meetingTitle
  • sentiment
  • speechRecognition
  • trackers

The fields are described below.
config.confidenceThreshold: Optional number. The default value is 0.5.

Minimum confidence score that must be met for a detected insight (questions, topics, action items) to be considered valid and returned by the Streaming API. The value must be a number from 0.5 to 1.0.

Generally, a higher confidence value means the insights from the Streaming API are higher quality, but fewer are returned.

config.detectEntities: Optional boolean. The default value is false.

Enables the detection of custom and managed entities, such as PII, PCI, and PHI data. For more information about the Entity Detection feature, see Entity Detection. To enable the Entity Detection feature, the value must be true.

If an entity is detected, the Streaming API sends an entity_response message. For more information, see Response reference.

config.languageCode: Optional string. The default value is en-US.

The language code that corresponds to the language of the real-time conversation. Valid values are en-US and es-ES.

config.meetingTitle: Optional string. If not set, the default value is the conversation ID that is generated for the real-time conversation.

A name for the conversation.

config.sentiment: Optional boolean. The default value is false.

Enables sentiment analysis during the real-time conversation.

config.speechRecognition: Optional object. If not included, the default values for the object fields are used.

Stores fields that are used to configure the audio connection for the speaker in the real-time conversation. The speechRecognition object includes the following fields:

  • encoding
  • sampleRateHertz

The fields are described below.

config.speechRecognition.encoding: Optional string. The default value is LINEAR16.

The encoding of the audio stream for the speaker in the real-time conversation. Valid values are LINEAR16, FLAC, MULAW, and OPUS.

config.speechRecognition.sampleRateHertz: Optional number. The default value is 16000.

The sample rate of the audio stream for the speaker in the real-time conversation. Valid values depend on the audio encoding:

  • LINEAR16: 8000 to 48000.
  • FLAC: 16000 or greater.
  • MULAW: 8000 only.
  • OPUS: 16000 to 48000.
customVocabulary: Optional array. No default value.

An array of strings containing vocabulary specific to your company, products, or phrases.

disconnectOnStopRequest: Optional boolean. The default value is true.

Overrides the normal behavior of stop_request. Normally, when you send a stop_request message, the Streaming API ends the real-time conversation.

When disconnectOnStopRequest is false and you send a stop_request message, only audio recognition is stopped. The conversation remains active for a default of 1800 seconds. Send a start_request to restart audio recognition.

When this behavior is enabled, you can stop and start Streaming API processing without dropping the WebSocket connection, allowing you to pause and resume processing in the middle of a call and optimize your Streaming API usage costs.

disconnectOnStopRequestTimeout: Optional number. The default value is 1800.

Overrides the idle timeout for the conversation. Normally, if a WebSocket connection to the Streaming API is idle for 30 minutes (1800 seconds), the connection is automatically closed.

Valid values are from 0 to 1800 seconds. If the value is 0, the WebSocket connection is dropped when stop_request is received.

insightTypes: Optional array. No default value.

The types of insights to identify. Valid values are question and action_item. For example: "insightTypes": ["question", "action_item"].

If an insight is detected, the Streaming API sends an insight_response message. For more information, see Response reference.

If insightTypes is not included in the start_request message, insights are not identified and no insight events are returned by the Streaming API.

noConnectionTimeout: Optional number. The default value is 0.

The buffer time (in seconds) during which the WebSocket connection stays open even if no Streaming API connection is active for that duration. This allows a speaker to reconnect to the same meeting with the same subscribers after losing the connection.

For example, if noConnectionTimeout is set to 600 seconds and the only WebSocket connection closes without an explicit stop_request message, the connectionId and conversationId remain valid for 600 seconds before the connection is finalized. After that, the connectionId is no longer available to subscribe to, and the conversationId retains all of the last known information associated with it.
speaker: Optional object.

Stores fields that are used to configure the speaker. The speaker object includes the following fields:

  • name
  • userId

The fields are described below.

speaker.name: Optional string. No default value.

The name of the speaker starting or connecting to the real-time conversation.

speaker.userId: Optional string. No default value.

A unique identifier for the speaker starting or connecting to the real-time conversation, such as an email address or UUID.

trackers: Optional object. If not included, the default values for the object fields are used.

Stores fields that are used to configure tracker detection in the real-time conversation. The trackers object includes the following fields:

  • enableAllTrackers
  • interimResults

The fields are described below.

trackers.enableAllTrackers: Optional boolean. The default value is true.

Enables all of the trackers defined for your account. If a tracker is detected, the Streaming API sends a tracker_response message. For more information, see Response reference.

trackers.interimResults: Optional boolean. The default value is false.

Enables tracker detection on interim (non-final) recognition results.

actions: Optional array. The default value is [].

An array of action objects. Each action object has name and parameters fields. Symbl.ai supports two actions:

  1. generateCallScore: Triggers call score processing after the stop_request is sent.
  2. generateInsightsUI: Triggers Insights UI processing after the stop_request is sent. Refer to the example start_request for a code sample.

Example message

{
    "type": "start_request",
    "config": {
      "confidenceThreshold": 0.7,
      "detectEntities": true,
      "languageCode": "en-US",
      "meetingTitle": "Streaming API Meeting",
      "sentiment": true,
      "speechRecognition": {
        "encoding": "LINEAR16",
        "sampleRateHertz": 16000
      },
      "trackers": {
        "enableAllTrackers": true,
        "interimResults": true
      }
    },
    "customVocabulary": ["example brand", "example product"],
    "disconnectOnStopRequest": false,
    "disconnectOnStopRequestTimeout": 600,
    "insightTypes": ["question", "action_item"],
    "noConnectionTimeout": 600,
    "speaker": {
      "name": "Example User",
      "userId": "[email protected]"
    },
    "actions": [
      {
        "name": "generateCallScore",
        "parameters": {
           "conversationType": "string",
           "salesStage": "string",
           "scorecardId": "string",
           "prospectName": "string",
           "callScoreWebhookUrl": "string"
        }
      },
      {
        "name": "generateInsightsUI",
        "parameters": {
           "prospectName": "string"
        }
      }
    ]
}

stop_request

Use the stop_request message to stop audio recognition and end a real-time conversation. When the conversation ends after a stop_request is received, the Streaming API returns a conversation_completed message. For more information, see Response reference.

{
  "type": "stop_request"
}

The following list describes the message fields.

type: Required string.

The value must be stop_request.

🚧

The stop_request message does not automatically close your WebSocket connection. You should close the WebSocket connection after receiving a successful conversation_completed response from the Streaming API.

The default behavior of stop_request and the Streaming API can be overridden using the disconnectOnStopRequest, disconnectOnStopRequestTimeout, and noConnectionTimeout fields of start_request.

Normally, when the Streaming API receives a stop_request message, audio recognition ends and then the conversation stops. When disconnectOnStopRequest is set to false in the start_request, only recognition is stopped. The conversation persists and recognition can be restarted with another start_request.
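The pause-and-resume flow described above can be sketched as a pair of messages. This is a minimal Python sketch; the helper names are illustrative, and it assumes disconnectOnStopRequest was set to false in the original start_request.

```python
import json

def pause_recognition_message() -> str:
    # With disconnectOnStopRequest set to false in the original start_request,
    # this stops audio recognition only; the conversation stays active.
    return json.dumps({"type": "stop_request"})

def resume_recognition_message(sample_rate_hertz: int = 16000) -> str:
    # Sending another start_request on the same WebSocket connection
    # resumes audio recognition for the existing conversation.
    return json.dumps({
        "type": "start_request",
        "config": {
            "speechRecognition": {
                "encoding": "LINEAR16",
                "sampleRateHertz": sample_rate_hertz,
            }
        },
    })
```

Each string would be sent as a text frame over the open WebSocket connection.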

modify_request

Use the modify_request message to update the configuration of a real-time conversation that is already in progress. Any of the fields in the config object of the start_request can be changed using a modify_request.

{
  "type": "modify_request",
  "confidenceThreshold": number,
  "detectEntities": boolean,
  "languageCode": string,
  "meetingTitle": string,
  "sentiment": boolean,
  "speechRecognition": {
    "encoding": string,
    "sampleRateHertz": number
  }
}

The following list describes the message fields. All fields are optional except the required type field. If a field is not included, the current value for that field remains unchanged.

type: Required string.

The value must be modify_request.

confidenceThreshold: Optional number.

Minimum confidence score that must be met for a detected insight to be considered valid and returned by the Streaming API. The value must be a number from 0.5 to 1.0.

Generally, a higher confidence value means the insights from the Streaming API are higher quality, but fewer are returned.

detectEntities: Optional boolean.

Enables the detection of custom and managed entities, such as PII, PCI, and PHI data. For more information about the Entity Detection feature, see Entity Detection. To enable the Entity Detection feature, the value must be true.

If an entity is detected, the Streaming API sends an entity_response message. For more information, see Response reference.

languageCode: Optional string.

The language code that corresponds to the language of the real-time conversation. Valid values are en-US and es-ES.

meetingTitle: Optional string.

A name for the conversation.

sentiment: Optional boolean.

Enables sentiment analysis during the real-time conversation.

speechRecognition: Optional object.

Stores fields that are used to configure the audio connection for the speaker in the real-time conversation. The speechRecognition object includes the following fields:

  • encoding
  • sampleRateHertz

The fields are described below.

speechRecognition.encoding: Optional string.

The encoding of the audio stream for the speaker in the real-time conversation. Valid values are LINEAR16, FLAC, MULAW, and OPUS.

speechRecognition.sampleRateHertz: Optional number.

The sample rate of the audio stream for the speaker in the real-time conversation. Valid values depend on the audio encoding:

  • LINEAR16: 8000 to 48000.
  • FLAC: 16000 or greater.
  • MULAW: 8000 only.
  • OPUS: 16000 to 48000.

Example message

{
  "type": "modify_request",
  "confidenceThreshold": 0.5,
  "detectEntities": false,
  "languageCode": "en-US",
  "meetingTitle": "Updated Streaming API Meeting",
  "sentiment": false,
  "speechRecognition": {
    "encoding": "FLAC",
    "sampleRateHertz": 44100
  }
}

bookmark_request

Use the bookmark_request message to create, update, or delete a bookmark during a real-time conversation. The format of the JSON message depends on the bookmark operation that you want to perform:

bookmark_request: create

{
  "type": "bookmark_request",
  "operation": "create",
  "label": string,
  "description": string,
  "user": {
    "name": string,
    "userId": string,
    "email": string
  },
  "beginTimeOffset": number,
  "duration": number
}

The following list describes the message fields. With the exception of description, all fields are required.

type: Required string.

The value must be bookmark_request.

operation: Required string.

The value must be create to create a bookmark. Other possible operations are update and delete.

label: Required string.

Short label for a bookmark. Can be the same value as other bookmarks.

description: Optional string.

Description of the contents of the bookmark.

user: Required object.

Describes the user that creates the bookmark.

user.name: Required string.

Name of the user that creates the bookmark.

user.userId: Required string.

Unique ID for the user that creates the bookmark. For example, the user’s email address.

user.email: Required string.

Email address of the user that creates the bookmark.

beginTimeOffset: Required number.

In seconds, the amount of time from the beginning of the conversation before the bookmark starts including messages. Only messages that fall after the offset are included in the bookmark.

duration: Required number.

In seconds, the amount of time from the offset. Messages that occur during the duration are included in the bookmark.
Example message
{
  "type": "bookmark_request",
  "operation": "create",
  "label": "pain point",
  "description": "Customer found the interface difficult to use.",
  "user": {
    "name": "natalie",
    "userId": "[email protected]",
    "email": "[email protected]"
  },
  "beginTimeOffset": 10,
  "duration": 15
}

bookmark_request: update

{
  "type": "bookmark_request",
  "operation": "update",
  "id": string,
  "label": string,
  "description": string,
  "user": {
    "name": string,
    "userId": string,
    "email": string
  },
  "beginTimeOffset": number,
  "duration": number
}

The following list describes the message fields. With the exception of description, all fields are required. However, if description is empty, any existing description is discarded when the bookmark is updated.

type: Required string.

The value must be bookmark_request.

operation: Required string.

The value must be update to update an existing bookmark. Other possible operations are create and delete.

id: Required string.

ID of the bookmark to update.

label: Required string.

Short label for a bookmark. Can be the same value as other bookmarks.

description: Optional string.

Description of the contents of the bookmark.

user: Required object.

Describes the user that creates the bookmark.

user.name: Required string.

Name of the user that creates the bookmark.

user.userId: Required string.

Unique ID for the user that creates the bookmark. For example, the user’s email address.

user.email: Required string.

Email address of the user that creates the bookmark.

beginTimeOffset: Required number.

In seconds, the amount of time from the beginning of the conversation before the bookmark starts including messages. Only messages that fall after the offset are included in the bookmark.

duration: Required number.

In seconds, the amount of time from the offset. Messages that occur during the duration are included in the bookmark.
Example message
{
  "type": "bookmark_request",
  "operation": "update",
  "id": "6428584355823616",
  "label": "pain point",
  "description": "Customer found the interface difficult to use.",
  "user": {
    "name": "natalie",
    "userId": "[email protected]",
    "email": "[email protected]"
  },
  "beginTimeOffset": 10,
  "duration": 15
}

bookmark_request: delete

{
  "type": "bookmark_request",
  "operation": "delete",
  "id": string
}

The following list describes the message fields. All fields are required.

type: Required string.

The value must be bookmark_request.

operation: Required string.

The value must be delete to delete a bookmark. Other possible operations are create and update.

id: Required string.

ID of the bookmark to delete.
Example message
{
  "type": "bookmark_request",
  "operation": "delete",
  "id": "6428584355823616"
}

Binary messages with audio

The client needs to send data to the audio service by converting the audio stream into a series of audio chunks. Each chunk of audio carries a segment of audio that needs to be processed. The maximum size of a single audio chunk is 8,192 bytes.
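The chunking requirement can be met with a simple generator. The following Python sketch (the helper names are illustrative) splits a byte buffer into chunks no larger than 8,192 bytes, each of which would be sent as one binary WebSocket message:

```python
from typing import Iterator

MAX_CHUNK_BYTES = 8192  # maximum size of a single audio chunk

def audio_chunks(audio: bytes, chunk_size: int = MAX_CHUNK_BYTES) -> Iterator[bytes]:
    """Yield successive chunks of raw audio, each at most chunk_size bytes."""
    for offset in range(0, len(audio), chunk_size):
        yield audio[offset:offset + chunk_size]

# Each chunk would then be sent as a binary WebSocket message, for example:
# for chunk in audio_chunks(pcm_data):
#     await ws.send(chunk)
```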

Set up trackers

To detect trackers on a connection, you have two options:

  1. Enable all trackers
    To enable all trackers that you have defined in your account, set enableAllTrackers flag to true in the start_request message.

  2. Select specific trackers
    To enable specific trackers on a connection, include the trackers array in the start_request message. You can add existing tracker IDs or define new trackers within this array. Please note that the new trackers you define will only apply to the current connection and won't be persisted in your account. It's recommended to define trackers using the Management API and include IDs in the trackers array.

Example message
{
  "type": "start_request",
  "config": {
    "confidenceThreshold": 0.7,
    "detectEntities": true,
    "languageCode": "en-US",
    "meetingTitle": "Streaming API Meeting",
    "sentiment": true,
    "speechRecognition": {
      "encoding": "LINEAR16",
      "sampleRateHertz": 16000
    }
  },
  "trackers": [
    {
      "id": "123456"
    },
    {
      "name": "Denial",
      "vocabulary": [
        "No",
        "never agreed to",
        "not interested"
      ]
    },
    {
      "name": "covid",
      "vocabulary": [
        "wear mask",
        "coughing",
        "fever",
        "cold",
        "trouble breathing"
      ]
    }
  ],
  "speaker": {
    "name": "Example User",
    "userId": "[email protected]"
  }
}

Real-time interim transcript

Interim results are intermediate transcript predictions retrieved during a real-time conversation that are likely to change before the automatic speech recognition (ASR) engine returns its final results.

During a streaming conversation, recognition_result messages continuously add recognized text and data to the transcript.

You can also use the conversation ID to view real-time interim results of the transcript. During a conversation, use the Get messages operation to view interim transcript results. Use the same operation to view the final transcript after a conversation.


Response reference

After you open a WebSocket connection and send a start_request, the Streaming API begins sending various responses. The Streaming API sends messages as responses to events such as when:

  • The conversation starts and ends.
  • Audio recognition starts and stops.
  • The system detects trackers, entities, or insights.

Every response from the Streaming API includes a type field and one or more additional objects. The format of the additional objects depends on the type field. This section is organized by type. Some response types, such as message, have additional subtypes in which the format may vary. The response formats are described in their respective sections.
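Because every response carries a top-level type field, and message responses carry a nested message.type, a client typically routes incoming frames on those two values. A minimal sketch, with a hypothetical helper name:

```python
import json

def route_response(raw: str) -> str:
    """Return a routing key for a Streaming API response frame."""
    response = json.loads(raw)
    response_type = response["type"]
    if response_type == "message":
        # The nested message.type determines the format of the message object.
        return f"message/{response['message']['type']}"
    # e.g. message_response, insight_response, tracker_response, ...
    return response_type

# route_response('{"type": "message", "message": {"type": "started_listening"}}')
# returns "message/started_listening"
```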

The following types are returned by the Streaming API:

message

When the response type is message, the response body includes a message object. The message object has a type field (referred to in this section as message.type). The message.type field determines the format of the rest of the message object.

{
  "type": "message",
  "message": {
    "type": string,
    ...
  }
}

The following values for message.type are returned by the Streaming API:

message: started_listening

The started_listening message indicates that the Streaming API is ready to start a real-time conversation and receive audio streams. This is the first response from the Streaming API after you send a start_request.

{
  "type": "message",
  "message": {
    "type": "started_listening"
  }
}

The following list describes the response fields.

type: The value is message. This value indicates that the response also includes a message object.

Other possible values are message_response, entity_response, tracker_response, topic_response, bookmark_response, insight_response, and request_modified.

message: The message object is included in the response when the response type is message.

message.type: The value is started_listening. This value indicates that a start_request has been received by the Streaming API and a new real-time conversation is being started. If the start_request is valid and the real-time conversation is started successfully, this response is followed by a conversation_created response.

Other possible values are conversation_created, recognition_started, recognition_result, recognition_stopped, and conversation_completed.

message: conversation_created

The conversation_created message indicates that a real-time conversation has started. This response usually follows a started_listening response. The response includes the conversation ID for the real-time conversation. The conversation ID can be used to manage the conversation and obtain conversation intelligence after the conversation is completed.

{
  "type": "message",
  "message": {
    "type": "conversation_created",
    "data": {
      "conversationId": string
    }
  }
}

The following list describes the response fields.

type: The value is message, which indicates this response also includes a message object.

Other possible values are message_response, entity_response, tracker_response, topic_response, bookmark_response, insight_response, and request_modified.

message: The message object is included in the response when the response type is message.

message.type: The value is conversation_created. This value indicates that the Streaming API has successfully created a real-time conversation that can be joined or subscribed to.

Other possible values are started_listening, recognition_started, recognition_result, recognition_stopped, and conversation_completed.

message.data: Provides a field for the conversation ID. The data object is included in the response when message.type is conversation_created.

message.data.conversationId: The conversation ID for the real-time conversation.
Example response
{
  "type": "message",
  "message": {
    "type": "conversation_created",
    "data": {
      "conversationId": "5059468837519360"
    }
  }
}

message: recognition_started

The recognition_started message indicates that the Streaming API is now ready to process audio for the real-time conversation. After you receive this response, you can send binary messages that contain supported audio data. When you send binary messages, the Streaming API processes the audio data to provide transcription and conversation intelligence.

{
  "type": "message",
  "message": {
    "type": "recognition_started",
    "data": {
      "conversationId": string
    }
  }
}

The following list describes the response fields.

type: The value is message, which indicates this response also includes a message object.

Other possible values are message_response, entity_response, tracker_response, topic_response, bookmark_response, insight_response, and request_modified.

message: The message object is included in the response when the response type is message.

message.type: The value is recognition_started. This value indicates that the Streaming API is ready to receive and process audio for the real-time conversation.

Other possible values are started_listening, conversation_created, recognition_result, recognition_stopped, and conversation_completed.

message.data: Provides a field for the conversation ID.

message.data.conversationId: The conversation ID for the real-time conversation.
Example response
{
  "type": "message",
  "message": {
    "type": "recognition_started",
    "data": {
      "conversationId": "5059468837519360"
    }
  }
}

message: recognition_result

The recognition_result message is returned when audio is processed by the Streaming API. The message contains speaker information, fragments of text, and other data related to the stream. The message represents a partial transcription result. As audio is processed, the data is concatenated and further recognition_result messages containing partial results are sent until a complete sentence is identified. For responses that contain complete sentences, see message_response.

{
  "type": "message",
  "message": {
    "type": "recognition_result",
    "isFinal": boolean,
    "payload": {
      "raw": {
        "alternatives": [
          {
            "words": [
              {
                "word": string,
                "startTime": {
                  "seconds": string,
                  "nanos": string
                },
                "endTime": {
                  "seconds": string,
                  "nanos": string
                }
              },
              ...,
            ],
            "transcript": string,
            "confidence": number
          }
        ]
      }
    },
    "punctuated": {
      "transcript": string
    },
    "user": {
      "userId": string,
      "name": string,
      "id": string
    }
  },
  "timeOffset": number
}
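Because word timings are split into seconds and nanos fields, a client reconstructs the absolute offsets by combining them. A small sketch, assuming the fields arrive as strings as shown in the schema above:

```python
def to_seconds(time_obj: dict) -> float:
    """Combine the seconds and nanos fields into a single float offset in seconds."""
    return int(time_obj["seconds"]) + int(time_obj["nanos"]) / 1e9

# Illustrative word entry from a recognition_result payload:
word = {
    "word": "hello",
    "startTime": {"seconds": "4", "nanos": "500000000"},
    "endTime": {"seconds": "4", "nanos": "900000000"},
}

# Spoken duration of the word, within floating-point tolerance:
duration = to_seconds(word["endTime"]) - to_seconds(word["startTime"])
```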

The following list describes the response fields.

type: The value is message, which indicates this response also includes a message object.

Other possible values are message_response, entity_response, tracker_response, topic_response, bookmark_response, insight_response, and request_modified.

message: The message object is included in the response when the response type is message.

message.type: The value is recognition_result. This value indicates that the Streaming API has successfully received and processed your audio data.

Other possible values are started_listening, conversation_created, recognition_started, recognition_stopped, and conversation_completed.

message.isFinal: When this value is false, the Streaming API is still processing incoming audio, and audio data will continue to be added to this segment. When this value is true, the response contains the final processed text, which is also delivered in a message_response.

message.payload: Contains the raw object.

message.payload.raw: Contains the alternatives array.

message.payload.raw.alternatives: Contains fields that detail the raw content currently being processed from audio by the Streaming API.

message.payload.raw.alternatives.words: Contains words and start and end times for determining the audio length of the corresponding words. The words array only contains items if message.isFinal is true.

message.payload.raw.alternatives.words.word: An individual word in the processed audio.

message.payload.raw.alternatives.words.startTime: Contains the start time, split between seconds and nanoseconds. The exact start time is startTime.seconds + startTime.nanos.

message.payload.raw.alternatives.words.startTime.seconds: The start time of the spoken word, given as an offset from the beginning of the conversation in seconds.

message.payload.raw.alternatives.words.startTime.nanos: The additional offset from startTime.seconds in nanoseconds.

message.payload.raw.alternatives.words.endTime: Contains the end time, split between seconds and nanoseconds. The exact end time is endTime.seconds + endTime.nanos.

message.payload.raw.alternatives.words.endTime.seconds: The end time of the spoken word, given as an offset from the beginning of the conversation in seconds.

message.payload.raw.alternatives.words.endTime.nanos: The additional offset from endTime.seconds in nanoseconds.

message.payload.raw.alternatives.transcript: The text transcript of the processed audio.

message.payload.raw.alternatives.confidence: The confidence of the Streaming API that the transcript is an accurate text representation of the audio.

message.punctuated: Contains the transcript field.

message.punctuated.transcript: A punctuated version of the text transcript of the audio.

message.user: Contains fields that describe the user who was the speaker.

message.user.userId: The user ID that was provided for the speaker.

message.user.name: The name that was provided for the speaker.

message.user.id: An ID generated by the Streaming API to uniquely identify the user.

timeOffset: In milliseconds, the offset of the message from the beginning of the conversation. For example, a timeOffset of 60000 indicates the phrase was spoken one minute into the conversation.
Example response
{
  "type": "message",
  "message": {
    "type": "recognition_result",
    "isFinal": true,
    "payload": {
      "raw": {
        "alternatives": [
          {
            "words": [
              {
                "word": "I",
                "startTime": {
                  "seconds": "98",
                  "nanos": "628000000"
                },
                "endTime": {
                  "seconds": "98",
                  "nanos": "728000000"
                }
              },
              {
                "word": "have",
                "startTime": {
                  "seconds": "98",
                  "nanos": "728000000"
                },
                "endTime": {
                  "seconds": "98",
                  "nanos": "928000000"
                }
              },
              {
                "word": "a",
                "startTime": {
                  "seconds": "98",
                  "nanos": "928000000"
                },
                "endTime": {
                  "seconds": "99",
                  "nanos": "027999999"
                }
              },
              {
                "word": "question.",
                "startTime": {
                  "seconds": "99",
                  "nanos": "027999999"
                },
                "endTime": {
                  "seconds": "99",
                  "nanos": "628000000"
                }
              },
              {
                "word": "What's",
                "startTime": {
                  "seconds": "99",
                  "nanos": "728000000"
                },
                "endTime": {
                  "seconds": "99",
                  "nanos": "928000000"
                }
              },
              {
                "word": "your",
                "startTime": {
                  "seconds": "99",
                  "nanos": "928000000"
                },
                "endTime": {
                  "seconds": "100",
                  "nanos": "027999999"
                }
              },
              {
                "word": "name?",
                "startTime": {
                  "seconds": "100",
                  "nanos": "027999999"
                },
                "endTime": {
                  "seconds": "100",
                  "nanos": "527999999"
                }
              }
            ],
            "transcript": "I have a question. What's your name?",
            "confidence": 0.9094458818435669
          }
        ]
      }
    },
    "punctuated": {
      "transcript": "I have a question. What's your name?"
    },
    "user": {
      "userId": "[email protected]",
      "name": "Tony Stark",
      "id": "b421075a-992f-41d6-bd55-edeb8aad5c22"
    }
  },
  "timeOffset": 104301
}
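Each startTime and endTime splits the timestamp into string-valued seconds and nanos fields, so the exact position of a word has to be computed from both. A minimal TypeScript sketch (the helper names here are ours, not part of the API):

```typescript
// Timing object as delivered in message.payload.raw.alternatives[].words[].
interface WordTime {
  seconds: string; // whole seconds from the start of the conversation
  nanos: string;   // additional nanoseconds past `seconds`
}

// Combine the split timestamp into a single offset in milliseconds.
function wordTimeToMs(t: WordTime): number {
  return Number(t.seconds) * 1000 + Number(t.nanos) / 1e6;
}

// Length of a spoken word in milliseconds.
function wordDurationMs(start: WordTime, end: WordTime): number {
  return wordTimeToMs(end) - wordTimeToMs(start);
}
```

For the word "I" in the example above, wordTimeToMs({ seconds: "98", nanos: "628000000" }) is 98628 ms, and the word lasts 100 ms.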

message: recognition_stopped

The recognition_stopped message indicates that the Streaming API is no longer processing binary messages that contain audio data. This is usually the first response from the Streaming API after you send a stop_request.

{
  "type": "message",
  "message": {
    "type": "recognition_stopped"
  }
}

The following list describes the response fields.

type: The value is message, which indicates that this response also includes a message object. Other possible values are message_response, entity_response, tracker_response, topic_response, bookmark_response, insight_response, and request_modified.

message: The message object is included in the response when the response type is message.

message.type: The value is recognition_stopped, which indicates that the Streaming API has received a stop_request and that audio recognition for the speaker provided in the original start_request has stopped. By default, this response is followed by a conversation_completed response. Other possible values are started_listening, conversation_created, recognition_started, recognition_result, and conversation_completed.

message: conversation_completed

The conversation_completed message indicates that the real-time conversation created with the Streaming API is now closed. It contains the conversation ID of the real-time conversation and a URL to an automatically generated Summary UI for the conversation. This is the last response from the Streaming API after you send a stop_request.

{
  "type": "message",
  "message": {
    "summaryUrl": string,
    "conversationId": string,
    "type": "conversation_completed"
  }
}

The following list describes the response fields.

type: The value is message, which indicates that this response also includes a message object. Other possible values are message_response, entity_response, tracker_response, topic_response, bookmark_response, insight_response, and request_modified.

message: The message object is included in the response when the response type is message.

message.summaryUrl: A URL for a Summary UI generated for the real-time conversation.

message.conversationId: The conversation ID for the real-time conversation.

message.type: The value is conversation_completed, which indicates that the Streaming API has received a stop_request and that the current real-time conversation has ended. Speakers can no longer connect to the conversation, and the conversation can no longer be subscribed to with the Subscribe API. Other possible values are started_listening, conversation_created, recognition_started, recognition_result, and recognition_stopped.
Example response
{
  "type": "message",
  "message": {
    "summaryUrl": "https://meetinginsights.symbl.ai/meeting/#/eyJ1c2iOiJ0b2N0YXJraW5kdXN0cmllcy5jb20iLCJuYW1lIjoiVG9ueSBTdGFyayIsInNlc3Npb25JZCI6IjU3MjMzNDI4MTI3NDE2MzIifQ==?o=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZCI6IjU3MjMzNDI4MTI3NDE2MzIiLCJpYXQiOjE2NjczMedF8TCF4h_SNy8gmweA4W9qsbQ95iyai4c4",
    "conversationId": "5723342812741632",
    "type": "conversation_completed"
  }
}
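The message responses above trace the lifecycle of a streaming session. A client typically routes them by switching on message.type; the following TypeScript sketch (the function name and description strings are ours) illustrates the idea:

```typescript
// Minimal shape of a Streaming API response of type "message".
interface MessageResponse {
  type: "message";
  message: { type: string; [key: string]: unknown };
}

// Return a short description of what each lifecycle message means.
function describeLifecycle(res: MessageResponse): string {
  switch (res.message.type) {
    case "started_listening":      return "connection open; API is listening";
    case "conversation_created":   return "conversation created";
    case "recognition_started":    return "start_request accepted";
    case "recognition_result":     return "interim or final transcript segment";
    case "recognition_stopped":    return "stop_request received; audio processing ended";
    case "conversation_completed": return "conversation closed; summary URL available";
    default:                       return `unknown message type: ${res.message.type}`;
  }
}
```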

message_response

When the type value is message_response, the response contains the final transcript, delivered as a messages array. Unlike responses of type message, a message_response does not nest its content in a message object.

{
  "type": "message_response",
  "messages": [
    {
      "from": {
        "id": string,
        "name": string,
        "userId": string
      },
      "payload": {
        "content": string,
        "contentType": "text/plain"
      },
      "id": string,
      "channel": {
        "id": "realtime-api"
      },
      "metadata": {
        "disablePunctuation": true,
        "originalContent": string,
        "words": string,
        "originalMessageId": string
      },
      "dismissed": false,
      "duration": {
        "startTime": string,
        "endTime": string,
        "timeOffset": number,
        "duration": number
      },
      "entities": []
    },
    ...
  ],
  "sequenceNumber": number
}

Example response with sentiment analysis

{
  "type": "message_response",
  "messages": [
    {
      "from": {
        "id": "b421075a-992f-41d6-bd55-edeb8aad5c22",
        "name": "Tony Stark",
        "userId": "[email protected]"
      },
      "payload": {
        "content": "Let us see if the sentiments are working correctly or not.",
        "contentType": "text/plain"
      },
      "id": "f34f7b34-68ac-43ef-b2ab-42d0c5e4c44c",
      "channel": {
        "id": "realtime-api"
      },
      "metadata": {
        "disablePunctuation": true,
        "timezoneOffset": 480,
        "originalContent": "Let us see if the sentiments are working correctly or not.",
        "words": "[{\"word\":\"Let\",\"startTime\":\"2022-12-15T09:42:13.742Z\",\"endTime\":\"2022-12-15T09:42:14.042Z\",\"timeOffset\":null,\"duration\":null},{\"word\":\"us\",\"startTime\":\"2022-12-15T09:42:14.042Z\",\"endTime\":\"2022-12-15T09:42:14.042Z\",\"timeOffset\":null,\"duration\":null},{\"word\":\"see\",\"startTime\":\"2022-12-15T09:42:14.042Z\",\"endTime\":\"2022-12-15T09:42:14.142Z\",\"timeOffset\":null,\"duration\":null},{\"word\":\"if\",\"startTime\":\"2022-12-15T09:42:14.142Z\",\"endTime\":\"2022-12-15T09:42:14.242Z\",\"timeOffset\":null,\"duration\":null},{\"word\":\"the\",\"startTime\":\"2022-12-15T09:42:14.242Z\",\"endTime\":\"2022-12-15T09:42:14.341Z\",\"timeOffset\":null,\"duration\":null},{\"word\":\"sentiments\",\"startTime\":\"2022-12-15T09:42:14.341Z\",\"endTime\":\"2022-12-15T09:42:14.842Z\",\"timeOffset\":null,\"duration\":null},{\"word\":\"are\",\"startTime\":\"2022-12-15T09:42:14.842Z\",\"endTime\":\"2022-12-15T09:42:14.942Z\",\"timeOffset\":null,\"duration\":null},{\"word\":\"working\",\"startTime\":\"2022-12-15T09:42:14.942Z\",\"endTime\":\"2022-12-15T09:42:15.142Z\",\"timeOffset\":null,\"duration\":null},{\"word\":\"correctly\",\"startTime\":\"2022-12-15T09:42:15.142Z\",\"endTime\":\"2022-12-15T09:42:15.542Z\",\"timeOffset\":null,\"duration\":null},{\"word\":\"or\",\"startTime\":\"2022-12-15T09:42:15.542Z\",\"endTime\":\"2022-12-15T09:42:15.642Z\",\"timeOffset\":null,\"duration\":null},{\"word\":\"not.\",\"startTime\":\"2022-12-15T09:42:15.642Z\",\"endTime\":\"2022-12-15T09:42:15.942Z\",\"timeOffset\":null,\"duration\":null}]",
        "originalMessageId": "4b2b861c-b7d9-4826-9502-2162906d0e7b"
      },
      "dismissed": false,
      "duration": {
        "startTime": "2022-12-15T09:42:13.742Z",
        "endTime": "2022-12-15T09:42:15.942Z"
      },
      "sentiment": {
        "polarity": {
          "score": -0.3
        },
        "suggested": "neutral"
      }
    }
  ],
  "sequenceNumber": 0
}

Example response without sentiment analysis

{
  "type": "message_response",
  "messages": [
    {
      "from": {
        "id": "b421075a-992f-41d6-bd55-edeb8aad5c22",
        "name": "Tony Stark",
        "userId": "[email protected]"
      },
      "payload": {
        "content": "I have a question.",
        "contentType": "text/plain"
      },
      "id": "f34f7b34-68ac-43ef-b2ab-42d0c5e4c44c",
      "channel": {
        "id": "realtime-api"
      },
      "metadata": {
        "disablePunctuation": true,
        "originalContent": "I have a question.",
        "words": "[{\"word\":\"I\",\"startTime\":\"2022-11-01T21:46:48.664Z\",\"endTime\":\"2022-11-01T21:46:48.764Z\",\"timeOffset\":98.63,\"duration\":0.1},{\"word\":\"have\",\"startTime\":\"2022-11-01T21:46:48.764Z\",\"endTime\":\"2022-11-01T21:46:48.964Z\",\"timeOffset\":98.73,\"duration\":0.2},{\"word\":\"a\",\"startTime\":\"2022-11-01T21:46:48.964Z\",\"endTime\":\"2022-11-01T21:46:49.063Z\",\"timeOffset\":98.93,\"duration\":0.1},{\"word\":\"question.\",\"startTime\":\"2022-11-01T21:46:49.063Z\",\"endTime\":\"2022-11-01T21:46:49.664Z\",\"timeOffset\":99.03,\"duration\":0.6}]",
        "originalMessageId": "f34f7b34-68ac-43ef-b2ab-42d0c5e4c44c"
      },
      "dismissed": false,
      "duration": {
        "startTime": "2022-11-01T21:46:48.664Z",
        "endTime": "2022-11-01T21:46:49.664Z",
        "timeOffset": 98.63,
        "duration": 1
      },
      "entities": []
    },
    {
      "from": {
        "id": "b421075a-992f-41d6-bd55-edeb8aad5c22",
        "name": "Tony Stark",
        "userId": "[email protected]"
      },
      "payload": {
        "content": "What's your name?",
        "contentType": "text/plain"
      },
      "id": "5697898c-e102-4bf5-b4d3-9bd591089fb2",
      "channel": {
        "id": "realtime-api"
      },
      "metadata": {
        "disablePunctuation": true,
        "originalContent": "What's your name?",
        "words": "[{\"word\":\"What's\",\"startTime\":\"2022-11-01T21:46:49.764Z\",\"endTime\":\"2022-11-01T21:46:49.964Z\",\"timeOffset\":99.73,\"duration\":0.2},{\"word\":\"your\",\"startTime\":\"2022-11-01T21:46:49.964Z\",\"endTime\":\"2022-11-01T21:46:50.063Z\",\"timeOffset\":99.93,\"duration\":0.1},{\"word\":\"name?\",\"startTime\":\"2022-11-01T21:46:50.063Z\",\"endTime\":\"2022-11-01T21:46:50.563Z\",\"timeOffset\":100.03,\"duration\":0.5}]",
        "originalMessageId": "5697898c-e102-4bf5-b4d3-9bd591089fb2"
      },
      "dismissed": false,
      "duration": {
        "startTime": "2022-11-01T21:46:49.764Z",
        "endTime": "2022-11-01T21:46:50.563Z",
        "timeOffset": 99.73,
        "duration": 0.8
      },
      "entities": []
    }
  ],
  "sequenceNumber": 8
}
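Note that metadata.words is delivered as a JSON-encoded string rather than an array, so it must be parsed before the word timings can be used. An illustrative TypeScript helper (the names are ours):

```typescript
// One entry of the JSON-encoded `metadata.words` string.
interface MessageWord {
  word: string;
  startTime: string;         // ISO 8601 timestamp
  endTime: string;           // ISO 8601 timestamp
  timeOffset: number | null; // seconds from the start of the conversation; may be null
  duration: number | null;   // seconds; may be null
}

// Decode the stringified words array from a message_response.
function parseWords(words: string): MessageWord[] {
  return JSON.parse(words) as MessageWord[];
}
```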

entity_response

When the type value is entity_response, the response includes the entities detected in the conversation.

{
  "type": "entity_response",
  "entities": [
    {
      "type": string,
      "subType": string,
      "category": string,
      "matches": [
        {
          "detectedValue": string,
          "messageRefs": [
            {
              "id": string,
              "startTime": string,
              "endTime": string,
              "text": string,
              "offset": number
            }
          ]
        }
      ]
    }
  ],
  "sequenceNumber": number
}

Example response

{
  "type": "entity_response",
  "entities": [
    {
      "type": "General",
      "subType": "Time",
      "category": "Managed",
      "matches": [
        {
          "detectedValue": "9 AM",
          "messageRefs": [
            {
              "id": "05f92ef1-6ce9-4061-bde6-ad4e92efabce",
              "startTime": "2022-11-01T21:46:42.964Z",
              "endTime": "2022-11-01T21:46:45.063Z",
              "text": "Let us follow up tomorrow at 9 AM.",
              "offset": 29
            }
          ]
        }
      ]
    }
  ],
  "sequenceNumber": 7
}
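Within each match, messageRefs.offset gives the character position of detectedValue in the referenced message text. A TypeScript sketch (the helper name is ours) that recovers the matched substring:

```typescript
// A single match reference from an entity_response.
interface MessageRef { id: string; text: string; offset: number }

// Extract the matched substring from the referenced message text.
function matchedText(ref: MessageRef, detectedValue: string): string {
  return ref.text.slice(ref.offset, ref.offset + detectedValue.length);
}
```

Applied to the example above, offset 29 in "Let us follow up tomorrow at 9 AM." yields "9 AM".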

tracker_response

When the type value is tracker_response, the response includes the trackers detected in the conversation.

{
  "type": "tracker_response",
  "isFinal": boolean,
  "trackers": [
    {
      "name": string,
      "matches": [
        {
          "type": "vocabulary",
          "value": string,
          "messageRefs": [
            {
              "id": string,
              "text": string,
              "offset": number
            }
          ],
          "insightRefs": []
        }
      ]
    }
  ],
  "sequenceNumber": number
}

Example response

{
  "type": "tracker_response",
  "isFinal": true,
  "trackers": [
    {
      "name": "test",
      "matches": [
        {
          "type": "vocabulary",
          "value": "test",
          "messageRefs": [
            {
              "id": "7c4b04c6-cc86-4ef8-8a97-a26c1eb9a21f",
              "text": "It was a test.",
              "offset": 9
            }
          ],
          "insightRefs": []
        }
      ]
    }
  ],
  "sequenceNumber": 1
}
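A tracker_response groups matches under each tracker name. To list which messages triggered which tracker, the nested structure can be flattened; a TypeScript sketch (the type and function names are ours):

```typescript
// Simplified shapes from a tracker_response.
interface TrackerMatchRef { id: string; text: string; offset: number }
interface TrackerMatch { type: string; value: string; messageRefs: TrackerMatchRef[] }
interface Tracker { name: string; matches: TrackerMatch[] }

// Flatten a tracker_response into (tracker name, matched message text) pairs.
function trackerHits(trackers: Tracker[]): Array<[string, string]> {
  return trackers.flatMap((t) =>
    t.matches.flatMap((m) =>
      m.messageRefs.map((r): [string, string] => [t.name, r.text])
    )
  );
}
```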

topic_response

When the type value is topic_response, the response includes the topics detected in the conversation.

{
  "type": "topic_response",
  "topics": [
    {
      "id": string,
      "messageReferences": [
        {
          "id": string
        }
      ],
      "phrases": string,
      "rootWords": [
        {
          "text": string
        }
      ],
      "score": number,
      "type": "topic",
      "messageIndex": number
    },
    ...
  ]
}

Example response

{
  "type": "topic_response",
  "topics": [
    {
      "id": "89dceeba-5a2e-11ed-9ce3-3e4ff3583f87",
      "messageReferences": [
        {
          "id": "d988a904-6435-44f7-8466-85c9027f1d03"
        }
      ],
      "phrases": "fast",
      "rootWords": [
        {
          "text": "Fast"
        }
      ],
      "score": 0.54,
      "type": "topic",
      "messageIndex": 0
    },
    {
      "id": "89dd3712-5a2e-11ed-9ce3-3e4ff3583f87",
      "messageReferences": [
        {
          "id": "7c4b04c6-cc86-4ef8-8a97-a26c1eb9a21f"
        }
      ],
      "phrases": "test",
      "rootWords": [
        {
          "text": "test"
        }
      ],
      "score": 0.09,
      "type": "topic",
      "messageIndex": 3
    },
    {
      "id": "89dc991a-5a2e-11ed-9ce3-3e4ff3583f87",
      "messageReferences": [
        {
          "id": "b0870702-2eb1-4ca1-8a29-e1ff2265a203"
        }
      ],
      "phrases": "bookmark",
      "rootWords": [
        {
          "text": "bookmark"
        }
      ],
      "score": 0.9,
      "type": "topic",
      "messageIndex": 5
    },
    {
      "id": "ac18f014-5a2e-11ed-800a-fe9e07a0b21f",
      "messageReferences": [
        {
          "id": "5a12fe88-c436-4b23-a68a-296bfe524cb4"
        }
      ],
      "phrases": "expand",
      "rootWords": [
        {
          "text": "Expand"
        }
      ],
      "score": 0.9,
      "type": "topic",
      "messageIndex": 7
    },
    {
      "id": "ac191814-5a2e-11ed-800a-fe9e07a0b21f",
      "messageReferences": [
        {
          "id": "09e94c7c-aea6-4b0b-989b-1d7b8d24c28e"
        }
      ],
      "phrases": "action item",
      "rootWords": [
        {
          "text": "action"
        }
      ],
      "score": 0.54,
      "type": "topic",
      "messageIndex": 8
    }
  ]
}
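The score field reflects how strongly each topic is represented in the conversation. A common pattern is to keep only topics above a threshold, strongest first; a TypeScript sketch (the 0.5 cutoff is an arbitrary choice of ours):

```typescript
// Simplified topic shape from a topic_response.
interface Topic { id: string; phrases: string; score: number; messageIndex: number }

// Keep topics at or above a confidence threshold, strongest first.
function topTopics(topics: Topic[], minScore = 0.5): Topic[] {
  return topics
    .filter((t) => t.score >= minScore)
    .sort((a, b) => b.score - a.score);
}
```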

bookmark_response

When the type value is bookmark_response, the response describes a bookmark operation in the conversation.

{
  "type": "bookmark_response",
  "operation": string,
  "id": string,
  "label": string,
  "description": string,
  "user": {
    "name": string,
    "email": string,
    "userId": string
  },
  "beginTimeOffset": number,
  "duration": number
}

Example response

{
  "type": "bookmark_response",
  "operation": "create",
  "id": "4829667240968192",
  "label": "TestBookmark",
  "description": "one1",
  "user": {
    "name": "Tony Stark",
    "email": "[email protected]",
    "userId": "[email protected]"
  },
  "beginTimeOffset": 10.8,
  "duration": 12.3
}

insight_response

When the type value is insight_response, the response includes the insights detected in the conversation.

{
  "type": "insight_response",
  "insights": [
    {
      "id": string,
      "confidence": number,
      "hints": [
        {
          "key": "definitive",
          "value": string
        },
        {
          "key": "addressedTo",
          "value": string
        },
        {
          "key": "informationScore",
          "value": string
        },
        {
          "key": "confidenceScore",
          "value": string
        },
        {
          "key": "comprehensionScore",
          "value": string
        }
      ],
      "type": "action_item",
      "assignee": {
        "id": string,
        "name": string,
        "userId": string
      },
      "dueBy": {
        "value": string
      },
      "tags": [
        {
          "type": string,
          "text": string,
          "beginOffset": number,
          "value": {
            "value": {
              "datetime": string
            }
          }
        }
      ],
      "dismissed": false,
      "payload": {
        "content": string,
        "contentType": "text/plain"
      },
      "from": {
        "id": string,
        "name": string,
        "userId": string
      },
      "entities": null,
      "messageReference": {
        "id": string
      }
    }
  ],
  "sequenceNumber": number
}

Example response

{
  "type": "insight_response",
  "insights": [
    {
      "id": "05f92ef1-6ce9-4061-bde6-ad4e92efabce",
      "confidence": 0.9308446925249736,
      "hints": [
        {
          "key": "definitive",
          "value": "true"
        },
        {
          "key": "addressedTo",
          "value": "[\"first_person_plural\"]"
        },
        {
          "key": "informationScore",
          "value": "0.8354166666666667"
        },
        {
          "key": "confidenceScore",
          "value": "0.99999975"
        },
        {
          "key": "comprehensionScore",
          "value": "0.9833906292915344"
        }
      ],
      "type": "action_item",
      "assignee": {
        "id": "b421075a-992f-41d6-bd55-edeb8aad5c22",
        "name": "Tony Stark",
        "userId": "[email protected]"
      },
      "dueBy": {
        "value": "2022-11-02T09:00:00-07:00"
      },
      "tags": [
        {
          "type": "datetime",
          "text": "tomorrow at 9 am",
          "beginOffset": 21,
          "value": {
            "value": {
              "datetime": "2022-11-02 09:00:00"
            }
          }
        }
      ],
      "dismissed": false,
      "payload": {
        "content": "We need to follow up tomorrow at 9 AM.",
        "contentType": "text/plain"
      },
      "from": {
        "id": "b421075a-992f-41d6-bd55-edeb8aad5c22",
        "name": "Tony Stark",
        "userId": "[email protected]"
      },
      "entities": null,
      "messageReference": {
        "id": "05f92ef1-6ce9-4061-bde6-ad4e92efabce"
      }
    }
  ],
  "sequenceNumber": 7
}
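Every hint value arrives as a string, even when it encodes a number or a JSON array. A TypeScript sketch (the helper name is ours) that folds the hints into a keyed object, converting numeric strings where possible:

```typescript
// One entry of the hints array from an insight_response.
interface Hint { key: string; value: string }

// Fold the hints array into a plain object keyed by hint name,
// converting numeric strings (e.g. "0.99999975") to numbers.
function hintsToObject(hints: Hint[]): Record<string, string | number> {
  const out: Record<string, string | number> = {};
  for (const h of hints) {
    const n = Number(h.value);
    out[h.key] = Number.isNaN(n) ? h.value : n;
  }
  return out;
}
```

Non-numeric values such as "true" or the JSON-encoded addressedTo array remain strings.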

request_modified

When the type value is request_modified, it indicates that a request was made to modify the original request. The payload contains the requested changes. The following example requests changes to trackers.

{
  "type": "request_modified",
  "payload": {
    "trackers": [
      {
        "name": "A",
        "vocabulary": [
          "one",
          "two"
        ]
      },
      {
        "name": "test",
        "vocabulary": [
          "test"
        ]
      },
      {
        "name": "Promotion Mention Tracker",
        "vocabulary": [
          "We have a special promotion going on if you book this before",
          "I can offer you a discount of 10 20 percent you being a new customer for us",
          "We have our month special this month",
          "We have a sale right now on"
        ]
      },
      {
        "name": "Test",
        "vocabulary": [
          "one",
          "two",
          "three",
          "four",
          "five"
        ]
      },
      {
        "name": "B",
        "vocabulary": [
          "one",
          "two"
        ]
      }
    ]
  }
}