Code snippets - Streaming API

Subscribe to real-time events

The Symbl.ai JavaScript SDK also lets you subscribe to real-time events when you connect to one of the Endpoints specified in the above sections. These include:

  • Real-Time Transcription
  • Real-Time Insights
  • Real-Time Messages
  • Real-Time Intents

The below example shows how to achieve this:

Initialize SDK

const {sdk, SpeakerEvent} = require("@symblai/symbl-js");

sdk.init({
    // APP_ID and APP_SECRET come from the Symbl Platform: https://platform.symbl.ai
    appId: APP_ID,
    appSecret: APP_SECRET
}).then(async () => {
    console.log('SDK initialized.');
    try {
      // Your code goes here.
    } catch (e) {
        console.error(e);
    }
}).catch(err => console.error('Error in SDK initialization.', err));

Add the above lines to import and initialize the SDK. Replace APP_ID and APP_SECRET in the code; you can find them by signing up on the Symbl.ai platform.

Make a phone call

const connection = await sdk.startEndpoint({
    endpoint: {
        type: 'pstn', // when making a regular phone call
        phoneNumber: 'PHONE_NUMBER' // include country code
    }
});
const {connectionId} = connection;
console.log('Successfully connected. Connection Id: ', connectionId);

The above snippet makes a phone call by calling startEndpoint with type set to pstn and a valid US/Canada phone number.
You can also call in via type sip (see the sketch below); the steps that follow remain the same.
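
For reference, a minimal sketch of the same call over SIP might look like the following; the uri value is a placeholder, and the exact SIP fields may vary depending on your setup:

const connection = await sdk.startEndpoint({
    endpoint: {
        type: 'sip', // when dialing in over SIP instead of a regular phone call
        uri: 'sip:user@your-sip-domain.example' // placeholder SIP URI
    }
});
const {connectionId} = connection;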

Subscribe to the Live Events

// Subscribe to connection using connectionId.
sdk.subscribeToConnection(connectionId, (data) => {
  const {type} = data;
  if (type === 'transcript_response') {
      const {payload} = data;

      // You get live transcription here!!
      process.stdout.write('Live: ' + (payload && payload.content) + '\r');

  } else if (type === 'message_response') {
      const {messages} = data;

      // You get processed messages in the transcript here!!! Real-time but not live! :)
      messages.forEach(message => {
          process.stdout.write('Message: ' + message.payload.content + '\n');
      });
  } else if (type === 'insight_response') {
      const {insights} = data;
      // See <link here> for more details on Insights
      // You get any insights here!!!
      insights.forEach(insight => {
          process.stdout.write(`Insight: ${insight.type} - ${insight.text} \n\n`);
      });
  }
});

The above snippet calls `subscribeToConnection`, which requires the `connectionId` of the call and, as the second argument, a callback function that is invoked whenever any of the above events are available to be consumed.

The `data` received will contain the `type` of the event, which can be one of `transcript_response`, `message_response`, or `insight_response`.

Let's go over them one by one:

1. `transcript_response` : This contains the real-time transcription data, which is available as soon as it is detected.

2. `message_response` : This will contain an array of transcripts from all the speakers, logically separated by punctuation, or by speaker if [Active Speaker Events](/docs/javascript-sdk/code-snippets/active-speaker-events) are pushed.

3. `insight_response` : This will contain the array of all the insights detected in real-time. These can be [Action Items](/docs/conversation-api/action-items) or questions.

There is also a fourth type of event, `intent_response`, which is covered in a separate example.

End the call

// Stop the call automatically after 60 seconds.
setTimeout(async () => {
  const connection = await sdk.stopEndpoint({ connectionId });
  console.log('Stopped the connection');
  console.log('Conversation ID:', connection.conversationId);
}, 60000); // Change 60000 to a higher value if you want the call to continue for longer.

To end the call gracefully, we call stopEndpoint to stop the call.
The code snippet above simply stops the call after 60 seconds.

And we're done! That's how you can consume real-time events using the Javascript SDK!

The complete code for the example above can be found here.
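
For quick reference, here is a condensed sketch that assembles the snippets above into a single file; APP_ID, APP_SECRET and PHONE_NUMBER are placeholders you need to replace:

const {sdk} = require('@symblai/symbl-js');

sdk.init({
    appId: APP_ID,
    appSecret: APP_SECRET
}).then(async () => {
    try {
        // Start a phone call over PSTN
        const connection = await sdk.startEndpoint({
            endpoint: {type: 'pstn', phoneNumber: 'PHONE_NUMBER'}
        });
        const {connectionId} = connection;
        console.log('Successfully connected. Connection Id: ', connectionId);

        // Subscribe to live events on this connection
        sdk.subscribeToConnection(connectionId, (data) => {
            console.log(data.type, JSON.stringify(data));
        });

        // Stop the call after 60 seconds
        setTimeout(async () => {
            const stopped = await sdk.stopEndpoint({connectionId});
            console.log('Conversation ID:', stopped.conversationId);
        }, 60000);
    } catch (e) {
        console.error(e);
    }
}).catch(err => console.error('Error in SDK initialization.', err));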

Testing

Create a JavaScript file named app.js and copy this code into it. Fill in the placeholder values with the proper values. Use npm to install the required library: npm install @symblai/symbl-js. Then, in the terminal, run

$ node app.js

If successful you should receive a response in the console.

📘

If you have any questions or concerns about our API, you can join our Support Slack or send us an email at [email protected]

Streaming Audio in Real-Time

This section covers streaming audio in real-time using the Javascript SDK. You can use this API to pass in audio via a single stream or via multiple isolated streams, each of which can contain one or more speakers' audio data.

🚧

If you plan to use multiple audio streams, we recommend using a separate stream for each speaker involved to get the most accurate transcription and speaker separation.

You can also consume the processed results in real-time, which include:

  • Real-time transcription
  • Real-time insights (Action Items and Questions)
  • When using multiple audio streams (one stream per speaker), you also get access to speaker-separated data (including transcription and messages)

Example with Single Stream

The example below utilises the mic package to stream audio in real-time. This will be a single stream of audio obtained through mic, which may contain one or more speakers' audio.

Import required packages

const {sdk} = require('@symblai/symbl-js');
const uuid = require('uuid').v4;
// For demo purposes, we're using mic to simply get audio from microphone and pass it on to websocket connection
const mic = require('mic');

In the above snippet we import the sdk, uuid and mic npm packages. The uuid package is used to generate a unique ID to represent this stream, and using it is strongly recommended.
The mic package is used to obtain the audio stream in real-time to pass to the SDK.

Initialise an instance of mic

const sampleRateHertz = 16000;
const micInstance = mic({
    rate: sampleRateHertz,
    channels: '1',
    debug: false,
    exitOnSilence: 6
});

We now declare the sampleRateHertz variable to specify the sample rate of the audio obtained from the mic.
It is imperative to use the same sample rate for initialising the mic package and for passing in to the startRealtimeRequest of the Javascript SDK, as we will see below.
Otherwise the transcription will be completely inaccurate.

We also initialise mic with channels: '1' (mono channel) audio, as currently only mono channel audio data is supported.

Initialise the Javascript SDK

// Initialize the SDK
await sdk.init({
    // APP_ID and APP_SECRET come from the Symbl Platform: https://platform.symbl.ai
    appId: APP_ID,
    appSecret: APP_SECRET,
    basePath: 'https://api.symbl.ai'
});
// Need unique Id
const id = uuid();

Next we initialise a helper function to execute our code in the async/await style; the following code snippets (including the one just above) will be part of the same function, as sketched below.
We now initialise the Javascript SDK with the init call, passing in appId and appSecret, which you can obtain by signing up on the Symbl.ai platform.
We also initialise the variable id with the uuid function to get the unique ID required for this stream, as mentioned above in the import section snippet.
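
The wrapper function itself isn't shown in the snippets, so here is a minimal sketch of one, assuming an immediately invoked async function; the remaining snippets of this example go where the comment indicates:

(async () => {
    try {
        // Initialize the SDK
        await sdk.init({
            appId: APP_ID,
            appSecret: APP_SECRET,
            basePath: 'https://api.symbl.ai'
        });

        // Need unique Id
        const id = uuid();

        // ... the startRealtimeRequest and mic snippets that follow go here ...
    } catch (e) {
        console.error(e);
    }
})();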

Call the startRealtimeRequest

// Start Real-time Request (Uses Real-time WebSocket API behind the scenes)
const connection = await sdk.startRealtimeRequest({
    id,
    insightTypes: ["action_item", "question"],
    config: {
        meetingTitle: 'My Test Meeting',
        confidenceThreshold: 0.7,
        timezoneOffset: 480, // Offset in minutes from UTC
        languageCode: "en-US",
        sampleRateHertz
    },
    speaker: {
        // Optional
        userId: 'user-identifier',
        name: 'My name'
      },
    handlers: {
        'onSpeechDetected': (data) => {
            console.log(JSON.stringify(data));
            // For live transcription
            if (data) {
                const {punctuated} = data;
                console.log('Live: ', punctuated && punctuated.transcript);
            }
        },
        'onMessageResponse': (data) => {
            // When a processed message is available
            console.log('onMessageResponse', JSON.stringify(data));
        },
        'onInsightResponse': (data) => {
            // When an insight is detected
            console.log('onInsightResponse', JSON.stringify(data));
        }
    }
});

The next call is made to startRealtimeRequest of the Javascript SDK, passing in various parameters.

Let's break down the configuration and take a look at the parameters one by one.

  1. id: The unique ID that represents this stream. (This needs to be unique, which is why we are using uuid)

  2. insightTypes: This array represents the type of insights that are to be detected. Today the supported ones are action_item and question.

  3. config: This configuration object encapsulates the properties which directly relate to the conversation generated by the audio being passed.

    a. meetingTitle : This optional parameter specifies the name of the conversation generated. You can get more info on conversations here

    b. confidenceThreshold : This optional parameter specifies the confidence threshold for detecting the insights. Only insights with a confidenceScore higher than this value will be returned.

    c. timezoneOffset : This specifies the actual timezoneOffset used for detecting the time/date related entities.

    d. languageCode : It specifies the language to be used for transcribing the audio in BCP-47 format. (Needs to be the same as the language in which the audio is spoken.)

    e. sampleRateHertz : It specifies the sampleRate for this audio stream.

  4. speaker: Optionally specify the details of the speaker whose data is being passed in the stream. This enables an e-mail with the Summary UI URL to be sent after the end of the stream.

  5. handlers: This object has the callback functions for different events

    a. onSpeechDetected: To retrieve the real-time transcription results as soon as they are detected. We can use this callback to render live transcription which is specific to the speaker of this audio stream.

    b. onMessageResponse: This callback function contains the "finalized" transcription data for this speaker and if used with multiple streams with other speakers this callback would also provide their messages.
    The "finalized" messages mean that the ASR has finalized the state of this part of transcription and has declared it "final".

    c. onInsightResponse: This callback provides any of the detected insights in real-time as they are detected. As with onMessageResponse above, this also returns every speaker's insights in the case of multiple streams.

Retrieve audio data from mic

console.log('Successfully connected.');

const micInputStream = micInstance.getAudioStream();
micInputStream.on('data', (data) => {
    // Push audio from Microphone to websocket connection
    connection.sendAudio(data);
});

console.log('Started listening to Microphone.');

When startRealtimeRequest returns successfully, it signifies that the connection has been established with the passed configuration.
In the above snippet we obtain the audio data from micInputStream and, as it is received, relay it to the active connection instance we now have with the Javascript SDK.

Stop the stream

setTimeout(async () => {
    // Stop listening to microphone
    micInstance.stop();
    console.log('Stopped listening to Microphone.');
    try {
        // Stop connection
        const conversationData = await connection.stop();
        console.log('Conversation ID: ' + conversationData.conversationId);
        console.log('Connection Stopped.');
    } catch (e) {
        console.error('Error while stopping the connection.', e);
    }
}, 60 * 1000); // Stop connection after 1 minute i.e. 60 secs

For the purpose of demoing a continuous audio stream, we simulate a stop on the above stream after 60 seconds.
The connection.stop() call closes the active connection and triggers the optional email if the speaker config is included.
Here the conversationData variable includes the conversationId, which you can use with the Conversation API to retrieve this conversation's data.
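
As an illustration only, a hedged sketch of retrieving this conversation's messages via the Conversation API could look like the following, run from the same async context; ACCESS_TOKEN is a placeholder for a valid Symbl access token and the node-fetch package is assumed to be installed:

// Hypothetical sketch: fetch the conversation's messages from the Conversation API.
const fetch = require('node-fetch');

const response = await fetch(
    `https://api.symbl.ai/v1/conversations/${conversationData.conversationId}/messages`,
    {headers: {Authorization: `Bearer ${ACCESS_TOKEN}`}} // ACCESS_TOKEN is a placeholder
);
const {messages} = await response.json();
messages.forEach(message => console.log(message.text));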

And that's it! This marks the completion of streaming audio in real-time (Single Audio Stream) with the Javascript SDK.
The complete code for the example explained above can be found here.

With Multiple Streams

The same example explained above can be deployed on multiple machines, each with one speaker, to simulate the multiple-streams use-case.
The only thing that needs to be common is the unique ID created in the above example, which is used to initialize the startRealtimeRequest request.

Having this unique ID in common across all the different streams ensures that the audio streams of all the speakers are bound to the context of a single conversation.
This conversation can be retrieved by its conversationId via the Conversation API and will include the data of all the speakers connecting with the same common ID.
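
As a sketch of that change, assuming the common ID is shared with each machine through an environment variable named MEETING_ID (a hypothetical name), the only difference from the single-stream example is how id is obtained:

// Hypothetical: each participant's machine reads the same pre-shared ID
// instead of generating a fresh uuid per process.
const id = process.env.MEETING_ID; // identical value on every machine

const connection = await sdk.startRealtimeRequest({
    id, // the common ID binds all the streams to one conversation
    insightTypes: ['action_item', 'question'],
    config: {
        meetingTitle: 'My Test Meeting',
        confidenceThreshold: 0.7,
        timezoneOffset: 480,
        languageCode: 'en-US',
        sampleRateHertz
    },
    speaker: {userId: 'user-identifier', name: 'My name'},
    handlers: {/* same handlers as in the single-stream example */}
});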

Set Language When Connecting to a Web Socket

Getting Started

This snippet shows how to use languages other than English and how to set the timezone to the one in which the conversation is taking place.

🚧

Currently, we only support the English language in the Streaming & Telephony API.
We support languages other than English only on our enterprise plan.
Please feel free to reach out to us at [email protected] for any queries.

Utilizing other languages

The Javascript SDK allows you to work with audio from multiple different languages.

❗️

  1. If the language is not specified, then en-US (English - United States) is used as the default language.
  2. Insights like action items, follow-ups, topics, etc. are detected for the English language only.

Code Snippet

Configuration Snippet

Here you set the languageCode key to Japanese: "languageCode": "ja-JP".

{
  "type": "start_request",
  "meetingTitle": "Websockets How-to", // Conversation name
  "insightTypes": ["question", "action_item"], // Will enable insight generation
  "config": {
    "confidenceThreshold": 0.5,
    "languageCode": "ja-JP",
    "speechRecognition": {
      "encoding": "LINEAR16",
      "sampleRateHertz": 44100,
    }
  },
  "speaker": {
    "userId": "[email protected]",
    "name": "Example Sample",
  }
}

The equivalent configuration is passed to the startRealtimeRequest function during initialization, as you can see in the full code snippet below:

Full Snippet

const {sdk} = require('@symblai/symbl-js');
const uuid = require('uuid').v4;

(async () => {
  try {
    // Initialize the SDK
    await sdk.init({
      appId: appId,
      appSecret: appSecret,
      basePath: 'https://api.symbl.ai',
    })

    // Need unique Id
    const id = uuid();

    // Start Real-time Request (Uses Real-time WebSocket API behind the scenes)
    const connection = await sdk.startRealtimeRequest({
      id,
      insightTypes: ['action_item', 'question'],
      config: {
        meetingTitle: 'My Test Meeting',
        confidenceThreshold: 0.7,
        timezoneOffset: 480, // Offset in minutes from UTC
        languageCode: 'ja-JP',
        sampleRateHertz: 44100,
      },
      speaker: {
        // Optional, if not specified, will simply not send an email in the end.
        userId: 'emailAddress', // Update with valid email
        name: 'My name'
      },
      handlers: {
        /**
         * This will return live speech-to-text transcription of the call.
         */
        onSpeechDetected: (data) => {
          console.log(JSON.stringify(data))
          if (data) {
            const {punctuated} = data
            console.log('Live: ', punctuated && punctuated.transcript)
          }
        },
        /**
         * When processed messages are available, this callback will be called.
         */
        onMessageResponse: (data) => {
          console.log('onMessageResponse', JSON.stringify(data, null, 2))
        },
        /**
         * When Symbl detects an insight, this callback will be called.
         */
        onInsightResponse: (data) => {
          console.log('onInsightResponse', JSON.stringify(data, null, 2))
        },
        /**
         * When Symbl detects a topic, this callback will be called.
         */
        onTopicResponse: (data) => {
          console.log('onTopicResponse', JSON.stringify(data, null, 2))
        }
      }
    });
  } catch (e) {
    console.error(e);
  }
})();

Testing

Create a JavaScript file named app.js and copy this code into it. Fill in the placeholder values with the proper values. Use npm to install the required library: npm install @symblai/symbl-js. Then, in the terminal, run

$ node app.js

If successful you should receive a response in the console.

📘

If you have any questions or concerns about our API, you can join our Support Slack or send us an email at [email protected]