Code snippets - Streaming API
Subscribe to real-time events
The Symbl.ai JavaScript SDK also lets you subscribe to real-time events when you connect to one of the Endpoints specified in the above sections. These include:
- Real-Time Transcription
- Real-Time Insights
- Real-Time Messages
- Real-Time Intents
The below example shows how to achieve this:
Initialize SDK
const {sdk, SpeakerEvent} = require("@symblai/symbl-js");
sdk.init({
// APP_ID and APP_SECRET come from the Symbl Platform: https://platform.symbl.ai
appId: APP_ID,
appSecret: APP_SECRET
}).then(async () => {
console.log('SDK initialized.');
try {
// Your code goes here.
} catch (e) {
console.error(e);
}
}).catch(err => console.error('Error in SDK initialization.', err));
Add the above lines to import and initialize the SDK. Replace `APP_ID` and `APP_SECRET` in the code with your own credentials. You can find them by signing up on the Symbl.ai platform.
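One common option, purely an illustrative sketch and not something required by the SDK, is to keep the credentials out of the source file and read them from environment variables:

```js
// Illustrative sketch: the variable names below are our choice, not mandated by the SDK.
const APP_ID = process.env.SYMBL_APP_ID;
const APP_SECRET = process.env.SYMBL_APP_SECRET;
```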
Make a phone call
const connection = await sdk.startEndpoint({
endpoint: {
type: 'pstn', // when making a regular phone call
phoneNumber: 'PHONE_NUMBER' // include country code
}
});
const {connectionId} = connection;
console.log('Successfully connected. Connection Id: ', connectionId);
The above snippet makes a phone call by calling `startEndpoint` with `type` set to `pstn` and a valid US/Canada phone number. You can also call in via `type` set to `sip`, with the steps below remaining the same (see the sketch that follows).
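A minimal sketch of the `sip` variant; the URI below is a placeholder, so check the exact endpoint fields against the Telephony API reference:

```js
// A minimal sketch: dialing in over SIP instead of PSTN.
const connection = await sdk.startEndpoint({
  endpoint: {
    type: 'sip',
    uri: 'sip:555@your-sip-domain.com' // placeholder SIP URI
  }
});
```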
Subscribe to the Live Events
// Subscribe to connection using connectionId.
sdk.subscribeToConnection(connectionId, (data) => {
const {type} = data;
if (type === 'transcript_response') {
const {payload} = data;
// You get live transcription here!!
process.stdout.write('Live: ' + (payload && payload.content) + '\r');
} else if (type === 'message_response') {
const {messages} = data;
// You get processed messages in the transcript here!!! Real-time but not live! :)
messages.forEach(message => {
process.stdout.write('Message: ' + message.payload.content + '\n');
});
} else if (type === 'insight_response') {
const {insights} = data;
// See <link here> for more details on Insights
// You get any insights here!!!
insights.forEach(insight => {
process.stdout.write(`Insight: ${insight.type} - ${insight.text} \n\n`);
});
}
});
The above snippet calls `subscribeToConnection`, which takes the `connectionId` of the call and, as its second argument, a callback function that will be invoked when any of the above events are available to be consumed.
The `data` received will contain the `type` of the event. It can be one of `transcript_response`, `message_response`, or `insight_response`.
Let's go over them one by one:
1. `transcript_response`: This contains the real-time transcription data, which is available as soon as it is detected.
2. `message_response`: This contains an array of the transcripts of all the speakers, which are logically separated by punctuation or by speaker if [Active Speaker Events](/docs/javascript-sdk/code-snippets/active-speaker-events) are pushed.
3. `insight_response`: This contains an array of all the insights detected in real-time. These can be [Action Items](/docs/conversation-api/action-items) or questions.
There is also a fourth type of event, `intent_response`, which is covered in a separate example.
### End the call
```js Node.js
// Automatically stop the call after 60 seconds.
setTimeout(async () => {
const connection = await sdk.stopEndpoint({ connectionId });
console.log('Stopped the connection');
console.log('Conversation ID:', connection.conversationId);
}, 60000); // Increase 60000 if you want the call to continue for longer.
```
To end the call gracefully, we call `stopEndpoint` to stop the call. The code snippet above simply stops the call after 60 seconds.
And we're done! That's how you can consume real-time events using the JavaScript SDK!
The complete code for the example above can be found here.
Testing
Create a JavaScript file named `app.js` and copy this code into the file. Fill in the placeholder values with the proper values. Use npm to install the required libraries: `npm install @symblai/symbl-js`. Now run it in the terminal:
$ node app.js
If successful, you should receive a response in the console.
If you have any questions or concerns about our API, you can join our Support Slack or send us an email at [email protected]
Streaming Audio in Real-Time
This section covers streaming audio in real-time using the JavaScript SDK. You can use this API to pass in audio either as a single stream or as multiple isolated streams, each of which can contain one or more speakers' audio data.
If you plan to use multiple audio streams, we recommend using a single stream per speaker to get the most accurate transcription and speaker separation.
You can also consume the processed results in real-time, which include:
- Real-time transcription
- Real-time insights (Action Items and Questions)
- When using multiple audio streams (one stream per speaker) you also get access to speaker-separated data (including transcription and messages)
Example with Single Stream
The example below utilises the `mic` package to stream audio in real-time. This will be a single stream of audio obtained through `mic`, which may contain one or more speakers' audio.
Import required packages
const {sdk} = require('@symblai/symbl-js');
const uuid = require('uuid').v4;
// For demo purposes, we're using mic to simply get audio from microphone and pass it on to websocket connection
const mic = require('mic');
In the above snippet we import the `sdk`, `uuid` and `mic` npm packages. The `uuid` package is used to generate a unique ID to represent this stream, and it's strongly recommended to use it. The `mic` package is used to obtain the audio stream in real-time to pass to the SDK.
Initialise an instance of mic
const sampleRateHertz = 16000;
const micInstance = mic({
rate: sampleRateHertz,
channels: '1',
debug: false,
exitOnSilence: 6
});
We now declare the `sampleRateHertz` variable to specify the sample rate of the audio obtained from the mic.
It is imperative to use the same sample rate for initialising the `mic` package and for passing in to the `startRealtimeRequest` of the JavaScript SDK, as we will see below. Otherwise the transcription will be completely inaccurate.
We also initialise `mic` with `channels: '1'` (mono channel) audio, as currently only mono channel audio data is supported.
Initialise the Javascript SDK
// Initialize the SDK
await sdk.init({
// APP_ID and APP_SECRET come from the Symbl Platform: https://platform.symbl.ai
appId: APP_ID,
appSecret: APP_SECRET,
basePath: 'https://api.symbl.ai'
});
// Need unique Id
const id = uuid();
Next we initialise a helper function to execute our code in the `async/await` style. The following code snippets (including the one just above) will be part of the same function.
We now initialise the JavaScript SDK with the `init` call, passing in `appId` and `appSecret`, which you can obtain by signing up on the Symbl.ai platform.
We also initialise the variable `id` with the `uuid` function to get the unique ID required for this stream, as mentioned above in the import snippet.
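For reference, a minimal sketch of that wrapper, the same async IIFE pattern used in the full snippet further down this page:

```js
// All of the following snippets live inside this async function
// so that `await` can be used with the SDK calls.
(async () => {
  try {
    // sdk.init(), startRealtimeRequest(), etc. go here
  } catch (e) {
    console.error(e);
  }
})();
```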
Call the startRealtimeRequest
// Start Real-time Request (Uses Real-time WebSocket API behind the scenes)
const connection = await sdk.startRealtimeRequest({
id,
insightTypes: ["action_item", "question"],
config: {
meetingTitle: 'My Test Meeting',
confidenceThreshold: 0.7,
timezoneOffset: 480, // Offset in minutes from UTC
languageCode: "en-US",
sampleRateHertz
},
speaker: {
// Optional
userId: 'user-identifier',
name: 'My name'
},
handlers: {
'onSpeechDetected': (data) => {
console.log(JSON.stringify(data));
// For live transcription
if (data) {
const {punctuated} = data;
console.log('Live: ', punctuated && punctuated.transcript);
}
},
'onMessageResponse': (data) => {
// When a processed message is available
console.log('onMessageResponse', JSON.stringify(data));
},
'onInsightResponse': (data) => {
// When an insight is detected
console.log('onInsightResponse', JSON.stringify(data));
}
}
});
The next call is made to `startRealtimeRequest` of the JavaScript SDK and includes the various parameters passed in.
Let's break down the configuration and take a look at the parameters one by one.
- `id`: The unique ID that represents this stream. (This needs to be unique, which is why we are using `uuid`.)
- `insightTypes`: This array represents the types of insights to be detected. Today the supported ones are `action_item` and `question`.
- `config`: This configuration object encapsulates the properties which directly relate to the conversation generated by the audio being passed.
  a. `meetingTitle`: This optional parameter specifies the name of the conversation generated. You can get more info on conversations here.
  b. `confidenceThreshold`: This optional parameter specifies the confidence threshold for detecting the insights. Only insights that have a `confidenceScore` greater than this value will be returned.
  c. `timezoneOffset`: This specifies the actual timezone offset used for detecting time/date-related entities.
  d. `languageCode`: This specifies the language to be used for transcribing the audio, in BCP-47 format. (It needs to be the same as the language in which the audio is spoken.)
  e. `sampleRateHertz`: This specifies the sample rate for this audio stream.
- `speaker`: Optionally specify the details of the speaker whose data is being passed in the stream. This enables an email with the Summary UI URL to be sent after the end of the stream.
- `handlers`: This object has the callback functions for the different events.
  a. `onSpeechDetected`: To retrieve the real-time transcription results as soon as they are detected. You can use this callback to render live transcription which is specific to the speaker of this audio stream.
  b. `onMessageResponse`: This callback contains the "finalized" transcription data for this speaker, and, if used with multiple streams with other speakers, also provides their messages. "Finalized" means that the ASR has finalized the state of this part of the transcription and declared it "final".
  c. `onInsightResponse`: This callback provides any of the detected insights in real-time as they are detected. As with `onMessageResponse` above, this also returns every speaker's insights in the case of multiple streams.
Retrieve audio data from mic
console.log('Successfully connected.');
const micInputStream = micInstance.getAudioStream();
micInputStream.on('data', (data) => {
// Push audio from Microphone to websocket connection
connection.sendAudio(data);
});
console.log('Started listening to Microphone.');
After `startRealtimeRequest` returns successfully, it signifies that the connection has been established with the passed configuration.
In the above snippet we obtain the audio data from `micInputStream` and, as it's received, relay it to the active connection instance we now have with the JavaScript SDK.
Stop the stream
setTimeout(async () => {
// Stop listening to microphone
micInstance.stop();
console.log('Stopped listening to Microphone.');
try {
// Stop connection
const conversationData = await connection.stop();
console.log('Conversation ID: ' + conversationData.conversationId);
console.log('Connection Stopped.');
} catch (e) {
console.error('Error while stopping the connection.', e);
}
}, 60 * 1000); // Stop connection after 1 minute i.e. 60 secs
For the purpose of demoing a continuous audio stream, we simulate a `stop` on the above stream after 60 seconds.
The `connection.stop()` call closes the active connection and triggers the optional email if the `speaker` config was included.
Here the `conversationData` variable includes the `conversationId`, which you can use with the Conversation API to retrieve this conversation's data, as in the sketch below.
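For example, a minimal sketch of fetching the processed messages via the Conversation API. Assumptions: `node-fetch` is installed, `accessToken` holds a valid Symbl access token obtained separately, and the `Authorization: Bearer` header is accepted; check the Conversation API reference for the exact request details.

```js
const fetch = require('node-fetch');

// Assumption: accessToken is a valid Symbl access token obtained separately.
const response = await fetch(
  `https://api.symbl.ai/v1/conversations/${conversationData.conversationId}/messages`,
  { headers: { Authorization: `Bearer ${accessToken}` } }
);
const { messages } = await response.json();
messages.forEach(message => console.log(message.text));
```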
And that's it! This marks the completion of streaming audio in real-time (single audio stream) with the JavaScript SDK.
The complete code for the example explained above can be found here.
With Multiple Streams
The same example explained above can be deployed on multiple machines, each with one speaker, to simulate the multiple-streams use-case.
The only thing that needs to be common is the unique ID created in the above example, which is passed to the `startRealtimeRequest` request. Having this unique ID in common across all the different streams ensures that the audio streams of all the speakers are bound to the context of a single conversation, as in the sketch below.
This conversation can then be retrieved by its `conversationId` via the Conversation API and will include the data of all the speakers connected using the same common ID.
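A minimal sketch of the idea, assuming the setup from the single-stream example above; how you generate and distribute the shared ID, and the speaker details shown, are placeholders of your choosing:

```js
// Same value on every participating machine; generate it once (e.g. with uuid)
// and distribute it yourself.
const SHARED_STREAM_ID = 'shared-id-generated-once';

// Each machine runs this with its own speaker details but the shared ID,
// so every audio stream is bound to the same conversation.
const connection = await sdk.startRealtimeRequest({
  id: SHARED_STREAM_ID,
  insightTypes: ['action_item', 'question'],
  config: {
    meetingTitle: 'My Test Meeting',
    languageCode: 'en-US',
    sampleRateHertz: 16000
  },
  speaker: {
    userId: 'speaker-1@example.com', // unique per machine
    name: 'Speaker One'
  },
  handlers: {
    // same handlers as in the single-stream example above
  }
});
```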
Set Language When Connecting to a Web Socket
Getting Started
This snippet shows how to use languages other than English, and also how to set the timezone to the one in which the conversation is taking place.
Currently, we only support the English language in the Streaming & Telephony APIs.
We support languages other than English only for our enterprise plan.
Please feel free to reach out to us at [email protected] for any queries.
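If you want to derive the timezone offset programmatically rather than hard-coding it, one illustrative option is shown below; this assumes Symbl's `timezoneOffset` follows the same minutes-behind-UTC convention as JavaScript's `getTimezoneOffset()` (e.g. 480 for US Pacific Standard Time), so verify against the API reference before relying on it.

```js
// Illustrative sketch: take the offset from the machine's local timezone.
// Assumption: Symbl's timezoneOffset uses the same sign convention as
// Date.prototype.getTimezoneOffset() (minutes behind UTC, e.g. 480 for UTC-8).
const timezoneOffset = new Date().getTimezoneOffset();
console.log(timezoneOffset);
```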
Utilizing other languages
The JavaScript SDK allows you to work with audio from multiple different languages.
- If the language is not specified then `en-US` (English - United States) is used as the default language.
- Insights like action items, follow-ups, topics, etc. are detected for the English language only.
Code Snippet
Configuration Snippet
Here you set the language key to Japanese: `"languageCode": "ja-JP"`.
{
"type": "start_request",
"meetingTitle": "Websockets How-to", // Conversation name
"insightTypes": ["question", "action_item"], // Will enable insight generation
"config": {
"confidenceThreshold": 0.5,
"languageCode": "ja-JP",
"speechRecognition": {
"encoding": "LINEAR16",
"sampleRateHertz": 44100,
}
},
"speaker": {
"userId": "[email protected]",
"name": "Example Sample",
}
}
This configuration will be passed to the `startRealtimeRequest` function during initialization, which you can see in the full code snippet below:
Full Snippet
const {sdk} = require('@symblai/symbl-js');
const uuid = require('uuid').v4;
(async () => {
try {
// Initialize the SDK
await sdk.init({
appId: appId,
appSecret: appSecret,
basePath: 'https://api.symbl.ai',
})
// Need unique Id
const id = uuid();
// Start Real-time Request (Uses Real-time WebSocket API behind the scenes)
const connection = await sdk.startRealtimeRequest({
id,
insightTypes: ['action_item', 'question'],
config: {
meetingTitle: 'My Test Meeting',
confidenceThreshold: 0.7,
timezoneOffset: 480, // Offset in minutes from UTC
languageCode: 'ja-JP',
sampleRateHertz: 44100,
},
speaker: {
// Optional; if not specified, an email will simply not be sent at the end.
userId: 'emailAddress', // Update with valid email
name: 'My name'
},
handlers: {
/**
* This will return live speech-to-text transcription of the call.
*/
onSpeechDetected: (data) => {
console.log(JSON.stringify(data))
if (data) {
const {punctuated} = data
console.log('Live: ', punctuated && punctuated.transcript)
}
},
/**
* When processed messages are available, this callback will be called.
*/
onMessageResponse: (data) => {
console.log('onMessageResponse', JSON.stringify(data, null, 2))
},
/**
* When Symbl detects an insight, this callback will be called.
*/
onInsightResponse: (data) => {
console.log('onInsightResponse', JSON.stringify(data, null, 2))
},
/**
* When Symbl detects a topic, this callback will be called.
*/
onTopicResponse: (data) => {
console.log('onTopicResponse', JSON.stringify(data, null, 2))
}
}
});
} catch (e) {
console.error(e);
}
})();
Testing
Create a JavaScript file named `app.js` and copy this code into the file. Fill in the placeholder values with the proper values. Use npm to install the required libraries: `npm install @symblai/symbl-js`. Now run it in the terminal:
$ node app.js
If successful, you should receive a response in the console.
If you have any questions or concerns about our API, you can join our Support Slack or send us an email at [email protected]