
Speech to Text

Symbl offers state-of-the-art Speech to Text capability. You can convert audio and video conversations into text, a process also known as transcription.


Key Features of Speech to Text

  1. Transcribe your content in real time or from stored files.

  2. Domain-specific Speech to Text models for mobile and video calls deliver state-of-the-art accuracy.

  3. We support 20+ languages, including English, Russian, French, Italian, Hindi, Japanese, and Spanish. We also support models for different accents. For example, American and British English are spoken differently, so we provide Speech Recognition Models that are fine-tuned to each accent.
    Languages Supported

  4. We support Custom Vocabulary, which helps Speech-to-Text recognize specific words or phrases more frequently than other options that might otherwise be suggested. For example, suppose that your audio data often includes the word "sell". When Speech-to-Text encounters the word "sell," you want it to transcribe the word as "sell" more often than "cell." In this case, you might use speech adaptation to bias Speech-to-Text toward recognizing "sell."

  5. Speech-to-Text accurately punctuates transcriptions (e.g., commas, question marks, and periods).

  6. Know who said what by receiving automatic predictions about which speaker in a conversation spoke each utterance. This is called Speaker Diarization. The process is fairly accurate, but not 100% accurate. If you want near-100% accuracy on who said what, use audio streams and pass the audio in separate channels.

Where can I find the Speech to Text API?


Each continuous sentence spoken by a speaker in a conversation is referred to as a Message. Hence, we named our Speech to Text API the Messages API. The Messages API returns a list of the messages in a conversation.
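
To make the idea of a Message concrete, here is a minimal Python sketch that models a conversation as an ordered list of messages and renders it as a transcript. The `Message` class and its `speaker` and `text` fields are hypothetical names chosen for illustration; they are not taken from the Messages API response, whose actual shape is described in the Messages API reference.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Message:
    """One continuous sentence spoken by one speaker.

    Field names are assumptions for illustration only.
    """
    speaker: str
    text: str


def format_transcript(messages: Iterable[Message]) -> str:
    """Render messages as 'Speaker: text' lines, in the order given."""
    return "\n".join(f"{m.speaker}: {m.text}" for m in messages)


# A made-up two-message conversation.
conversation = [
    Message(speaker="Speaker 1", text="Can we sell this by Friday?"),
    Message(speaker="Speaker 2", text="Yes, I think so."),
]
print(format_transcript(conversation))
```

Keeping one entry per sentence per speaker is what lets a transcript show who said what, rather than a single block of text.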

To see the Messages API in action, you first need to process a conversation using Symbl. After you process a conversation, you'll receive a Conversation ID. The Conversation ID is the key to retrieving conversational insights from that conversation. As an example, here's a simple API call that grabs the speech-to-text transcription from the conversation.

👉 Messages API

Grab speech-to-text transcription

Remember to replace the conversationId in the API call with the Conversation ID you get from the previous API call.

curl "{conversationId}/messages" \
-H "Authorization: Bearer $AUTH_TOKEN"
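
The same request can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not an official client: the helper names are hypothetical, and `base_url` is a placeholder for the part of the endpoint that precedes `{conversationId}/messages`, which you should take from the Messages API reference. The response is assumed to be JSON.

```python
import json
import urllib.request
from urllib.parse import quote


def build_messages_request(base_url: str, conversation_id: str,
                           auth_token: str) -> urllib.request.Request:
    """Build a GET request for <base_url>/<conversation_id>/messages.

    base_url is an assumption: supply the endpoint prefix from the API docs.
    """
    url = f"{base_url.rstrip('/')}/{quote(conversation_id, safe='')}/messages"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {auth_token}"},
        method="GET",
    )


def fetch_messages(request: urllib.request.Request) -> dict:
    """Send the request and decode the body, assuming a JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder values; nothing is sent until fetch_messages() is called.
request = build_messages_request(
    "https://example.invalid/conversations", "abc123", "YOUR_AUTH_TOKEN"
)
print(request.full_url)
```

Separating request construction from sending makes it easy to inspect the URL and headers before making a call with a real token.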

Our customers love our Speech to Text! ❤️