This is a request to start processing once the connection is established. The audio should be streamed immediately after this message is sent; any binary audio streamed before this message is received will be ignored.
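A minimal sketch of sending the start request as the first message on the socket. The exact message schema (a JSON object with a "type" field of "start_request") is an assumption; consult the service's API reference for the real field names.

```javascript
// Send the start request. This must be the first message on the socket,
// since the service ignores any binary audio received before it.
// NOTE: the message shape is an assumption, not a confirmed schema.
function sendStartRequest(ws) {
  ws.send(JSON.stringify({ type: "start_request" }));
}

// Usage with a live socket might look like:
// const ws = new WebSocket("wss://example.com/v1/stream"); // hypothetical URL
// ws.addEventListener("open", () => sendStartRequest(ws));
```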
To get direct access to the microphone, we're going to use getUserMedia(), a browser API defined in the Media Capture and Streams specification (part of the WebRTC family of standards).
Once the code is running, start speaking; you should see the message_response and insight_response messages printed to the console.
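A sketch of capturing microphone audio and forwarding it over the socket. getUserMedia() requires a secure context (HTTPS or localhost). MediaRecorder is used here for brevity as an assumption; a service that expects raw PCM would instead need an AudioContext-based capture path.

```javascript
// Hedged sketch: capture mic audio in the browser and stream it as binary
// WebSocket frames. MediaRecorder emits containerized audio (e.g. WebM/Opus);
// whether the service accepts that encoding is an assumption.
async function streamMic(ws) {
  // Prompt the user for mic access (Media Capture and Streams API).
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  recorder.ondataavailable = async (event) => {
    if (ws.readyState === WebSocket.OPEN) {
      // Forward each captured segment as a binary frame.
      ws.send(await event.data.arrayBuffer());
    }
  };
  // Emit a chunk of audio roughly every 250 ms.
  recorder.start(250);
  return recorder;
}
```

Calling `streamMic(ws)` after the start request has been sent keeps the required message ordering: start request first, binary audio after.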
This is a request to stop processing. Upon receiving this message, the service stops all processing and closes the WebSocket connection.
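The stop request can be sketched the same way as the start request. As before, the message shape (a "type" field of "stop_request") is an assumption about the wire format.

```javascript
// Ask the service to stop processing; the service closes the
// WebSocket connection after receiving this message.
// NOTE: the message shape is an assumption, not a confirmed schema.
function sendStopRequest(ws) {
  ws.send(JSON.stringify({ type: "stop_request" }));
}
```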
The client sends audio to Service by converting the audio stream into a series of audio chunks. Each chunk carries a segment of audio to be processed. The maximum size of a single audio chunk is 8,192 bytes.
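The chunking rule above can be sketched as a small helper that splits an arbitrary audio buffer into pieces the service will accept:

```javascript
// Maximum bytes the service accepts in a single audio chunk.
const MAX_CHUNK_BYTES = 8192;

// Split a raw audio ArrayBuffer into chunks of at most maxBytes each.
function chunkAudio(buffer, maxBytes = MAX_CHUNK_BYTES) {
  const chunks = [];
  for (let offset = 0; offset < buffer.byteLength; offset += maxBytes) {
    chunks.push(buffer.slice(offset, offset + maxBytes));
  }
  return chunks;
}

// Each chunk can then be sent as its own binary frame:
// for (const chunk of chunkAudio(audioBuffer)) ws.send(chunk);
```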
This section describes the messages that originate in Service and are sent to the client.
Service sends two main types of messages (message_response and insight_response) to the client as soon as they become available.
During processing of a continuous audio stream, the message_response contains the processed messages as soon as they are ready. This message does not contain any insights.
The insight_response contains insights from the ongoing conversation as soon as they become available. This message does not contain any messages.
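The two response types above can be routed with a small dispatcher. The assumption here is that each incoming JSON payload carries a "type" field naming the response kind; the handler names are hypothetical.

```javascript
// Hedged sketch: route an incoming service payload to the right handler.
// The "type" field on the payload is an assumption about the wire format.
function routeServiceMessage(payload, handlers) {
  const msg = typeof payload === "string" ? JSON.parse(payload) : payload;
  if (msg.type === "message_response") {
    handlers.onMessage(msg); // processed messages, no insights
  } else if (msg.type === "insight_response") {
    handlers.onInsight(msg); // insights, no messages
  }
  return msg.type;
}

// Usage on a live socket might look like:
// ws.onmessage = (e) => routeServiceMessage(e.data, {
//   onMessage: (m) => console.log("message:", m),
//   onInsight: (m) => console.log("insight:", m),
// });
```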
Example of the message_response object
Example of the insight_response object