Symbl's WebSocket-based real-time API is the most direct, fastest, and most accurate of Symbl's interfaces for pushing an audio stream in real time and getting results back as soon as they are available.
Because this is a WebSocket endpoint, the connection starts as an HTTP request whose headers indicate the client's desire to upgrade the connection to a WebSocket instead of using HTTP semantics. The server indicates its willingness to participate in the WebSocket connection by returning an HTTP 101 Switching Protocols response. After this handshake, both client and service keep the socket open and begin using a message-based protocol to send and receive information. Refer to the WebSocket specification, RFC 6455, for a more in-depth understanding of the handshake process.
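As a rough illustration of the handshake described above, the sketch below computes the Sec-WebSocket-Accept header value that RFC 6455 requires the server to return alongside the 101 response. The function name computeAcceptKey is made up for this example, and client libraries such as the websocket npm package normally perform this check for you, so it is shown only to make the exchange concrete.

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for deriving the accept value.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// computeAcceptKey is a hypothetical helper name used for illustration.
// It derives the Sec-WebSocket-Accept value from the client's
// Sec-WebSocket-Key: base64(SHA-1(key + GUID)).
function computeAcceptKey(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Sample key taken from the example handshake in RFC 6455.
console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ=='));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

A client that receives a 101 response with a different accept value should treat the handshake as failed.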
Both client and server can send messages once the connection is established. According to RFC 6455, WebSocket messages can have either a text or a binary encoding. The two encodings use different on-the-wire formats. Each format is optimized for efficient encoding, transmission, and decoding of the message payload.
Text messages over WebSocket must use UTF-8 encoding. Each text message is a serialized JSON object with a type field that specifies the type or purpose of the message.
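The text message rules above could be enforced with a pair of small helpers, sketched here under stated assumptions. The names encodeTextMessage and decodeTextMessage are hypothetical and not part of any Symbl SDK; the only field assumed is the type field mentioned above.

```javascript
// Hypothetical helper: serialize a message object into a text payload.
// Rejects objects that lack the required "type" field.
function encodeTextMessage(message) {
  if (typeof message.type !== 'string' || message.type.length === 0) {
    throw new TypeError('text messages must carry a non-empty "type" field');
  }
  return JSON.stringify(message);
}

// Hypothetical helper: parse a received text payload back into an object
// and check that it declares its type.
function decodeTextMessage(payload) {
  const message = JSON.parse(payload);
  if (message === null || typeof message !== 'object' || typeof message.type !== 'string') {
    throw new TypeError('received text message has no "type" field');
  }
  return message;
}

const wirePayload = encodeTextMessage({ type: 'stop_request' });
console.log(wirePayload);                          // prints {"type":"stop_request"}
console.log(decodeTextMessage(wirePayload).type);  // prints stop_request
```

JavaScript strings are converted to UTF-8 by the WebSocket client when the payload is sent as a text message, so no manual encoding step appears here.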
Binary WebSocket messages carry a binary payload. In the Real-time API, audio is transmitted to the service as binary messages. All other messages are text messages.
This section describes the messages that originate from the client and are sent to the service. The client sends three types of messages: start_request, stop_request, and binary messages containing audio.
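To make the three client message types concrete, here is a minimal sketch that lays out the frames a client might send during one session. The helper buildClientFrames and the frame shape { binary, payload } are assumptions made for illustration, as is the ordering of start_request first, audio next, and stop_request last. A real start_request may carry additional configuration fields, which are omitted here.

```javascript
// Hypothetical helper: build the ordered list of frames for one session.
// Text frames hold serialized JSON; binary frames hold raw audio bytes.
function buildClientFrames(audioChunks) {
  const text = (type) => ({ binary: false, payload: JSON.stringify({ type }) });
  const audio = (chunk) => ({ binary: true, payload: Buffer.from(chunk) });
  return [text('start_request'), ...audioChunks.map(audio), text('stop_request')];
}

// Two placeholder chunks of silence stand in for microphone audio.
const frames = buildClientFrames([Buffer.alloc(320), Buffer.alloc(320)]);
for (const frame of frames) {
  console.log(frame.binary
    ? `binary (${frame.payload.length} bytes)`
    : `text ${frame.payload}`);
}
```

Keeping the binary flag next to each payload lets the sending code choose between a text send and a binary send without inspecting the payload itself.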
The example below uses the websocket npm package as the WebSocket client and the mic package to capture raw audio from the microphone.