Enable Voice-based Interactions in your Retool Application

Introduction:

When users run into problems or have questions, they usually reach out to the company’s support team by chatting, texting, calling, or sending voice messages. Calls between users and the support team are usually recorded for quality and training purposes. But how do companies manage all these audio files? How easy is it to find a specific recording, search a user's previous discussions or messages, and keep track of the audio records that pile up?

Now, consider a solution that converts these audio recordings and messages into text and automatically adds them to support tickets or organizes them in a database. How cool would that be? Such a solution would streamline the process and make it easier for support teams to address customer concerns promptly. I built something like that with Retool, specifically with a Retool Workflow.

Since Retool apps can query data from a variety of sources, it's much more efficient to prepare data outside of the front-end application. This is where building a Workflow helps.

This tutorial explains how to build an Audio to Text Conversion Workflow that:

  • Takes an audio (.wav) file URL.
  • Sends the audio file URL to ApyHub’s Speech-to-Text service.
  • Returns the text response.

Prerequisites:

This tutorial uses the following resources to demonstrate a real-world use case of Retool Workflows:

1. Create an ApyHub Account.

To create your account, go to the ApyHub Signup Page. You also have the option to sign up using your Google Account or GitHub Account.

Apyhub Signup Page

2. Create a new workflow

Sign in to Retool, select the Workflows tab in the navigation bar, click Create New, and select +Workflow. Next, name the workflow Audio to Text Conversion Workflow.

new-workflow-retool

3. Configure Start Block

The workflow initially contains two blocks: the startTrigger Block and the code1 Block. You can delete the code1 block since we don’t need it.

This workflow should run every time your Retool app triggers it. To configure it:

  1. Select the block and click on Edit triggers to expand its settings.
  2. Set the Trigger to Webhook.
  3. Enter the following JSON in Test JSON Parameters:

     { "incomingAudioMessageURL": "https://assets.apyhub.com/samples/sampleEnglishVoice.wav" }

Note: Here we’re assuming that your app will send the URL of a .wav file to this workflow. For this tutorial, we are using a sample .wav file URL.

starterBlog
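
As a rough illustration of the payload the Start block expects, here is a minimal sketch of how a caller might build the webhook body. The helper name buildWebhookPayload and the .wav check are assumptions made for illustration; only the incomingAudioMessageURL key and the sample URL come from the steps above.

```javascript
// Hypothetical helper: builds the JSON body for the workflow's webhook trigger.
// Only the "incomingAudioMessageURL" key comes from the tutorial; the
// validation rules are illustrative assumptions.
function buildWebhookPayload(audioUrl) {
  const parsed = new URL(audioUrl); // throws a TypeError on malformed URLs
  if (!parsed.pathname.toLowerCase().endsWith(".wav")) {
    throw new Error("Expected a URL that points to a .wav file");
  }
  return JSON.stringify({ incomingAudioMessageURL: parsed.href });
}

const payload = buildWebhookPayload(
  "https://assets.apyhub.com/samples/sampleEnglishVoice.wav"
);
console.log(payload);
```

Checking the URL before the workflow runs is a design choice, not a requirement: it simply fails early instead of waiting for the speech-to-text call to reject the input.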

The connecting lines between blocks in a workflow represent the control flow. Workflow blocks connect to perform specific actions in sequential order, beginning with the Start block. Once a block has completed its action, it triggers the next block in the chain, and so on.
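
To make the idea of sequential control flow concrete, the sketch below models a chain of blocks in plain JavaScript. It is an illustrative model only, not Retool's implementation; runBlocksInOrder and the two demo blocks are hypothetical names.

```javascript
// Illustrative model of sequential control flow, not Retool's implementation.
// Each block is a named async function that receives the results of all
// blocks that ran before it.
async function runBlocksInOrder(blocks, triggerData) {
  const results = { startTrigger: { data: triggerData } };
  for (const [name, run] of blocks) {
    // A block starts only after the previous one has finished.
    results[name] = { data: await run(results) };
  }
  return results;
}

// Stand-in blocks: the second one reads the output of the first.
const demoBlocks = [
  ["toUpper", async (r) => r.startTrigger.data.message.toUpperCase()],
  ["addSuffix", async (r) => `${r.toUpper.data}!`],
];

runBlocksInOrder(demoBlocks, { message: "hello" }).then((results) => {
  console.log(results.addSuffix.data); // HELLO!
});
```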

4. Send the Audio File to ApyHub’s Speech-to-Text Service

Now that we have an audio file, let’s convert it to text. For this task, we will use a Resource query Block.

The Resource query block in this workflow reads the audio file URL from the startTrigger block and makes an API call to ApyHub’s Speech-to-Text service.

apyhub-speech-to-text

To configure it:

  1. Add the Resource Query block and connect it with the startTrigger block.

  2. Set the Resource Type to RESTQuery (restapi).

  3. Set the method: POST

  4. Set the URL to https://api.apyhub.com/stt/url (you can find this value in the API documentation).

  5. Click on + More Options

  6. Provide the following header details:

     Content-Type    application/json
     apy-token       APY**************************************** (your secret token)

  7. Set the Body to JSON and provide the following values:

     url         {{startTrigger.data.incomingAudioMessageURL}}
     language    en-IN

  8. Click to run the query and rename the block to convertVoiceToText.

run
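
For readers who want to see the same request outside the Retool UI, here is a minimal sketch of what the Resource query block is configured to send. The URL, header names, and body keys mirror the steps above; the function name buildSpeechToTextRequest and the token placeholder are assumptions.

```javascript
// Sketch of the request the Resource query block is configured to send.
// The URL, headers, and body keys mirror the steps above; the function name
// and the token placeholder are assumptions.
function buildSpeechToTextRequest(apyToken, audioUrl, language = "en-IN") {
  return {
    url: "https://api.apyhub.com/stt/url",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "apy-token": apyToken,
      },
      body: JSON.stringify({ url: audioUrl, language }),
    },
  };
}

const request = buildSpeechToTextRequest(
  "YOUR_SECRET_TOKEN",
  "https://assets.apyhub.com/samples/sampleEnglishVoice.wav"
);
console.log(request.options.body);
// To send it outside Retool: fetch(request.url, request.options)
```

Building the request separately from sending it keeps the sketch runnable without a network connection or a real token.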

Additional step: depending on the size of the audio file, the query might time out. If you see the error shown below:

error

Here’s how to resolve it:

  1. Open the Settings tab.
  2. Set Timeout After (ms): 100000
  3. Click again to run the query.

fail-rerun
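
The Timeout After (ms) setting caps how long the block waits for a response. Outside Retool, the same idea can be sketched with Promise.race. The withTimeout helper below is a hypothetical illustration, not how Retool implements the setting.

```javascript
// Hypothetical helper: rejects if `task` does not settle within `timeoutMs`.
// This only illustrates the idea behind a timeout setting.
function withTimeout(task, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  // Whichever promise settles first wins; the timer is cleared either way.
  return Promise.race([task, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in for a slow speech-to-text call that takes 50 ms.
const slowCall = () =>
  new Promise((resolve) => setTimeout(() => resolve("transcript"), 50));

withTimeout(slowCall(), 10).catch((err) => console.log(err.message));
withTimeout(slowCall(), 1000).then((value) => console.log(value));
```

A larger limit, such as the 100000 ms used above, gives long audio files enough time to finish processing.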

5. Retrieve Text

Finally, it’s time to retrieve the converted response (the text). For this, we will use a Code block.

  1. Add the Code block and connect it with the convertVoiceToText block

  2. Select JavaScript as the programming language.

  3. Delete the existing return statement and insert this code:

     return convertVoiceToText.data.data;

  4. Click to run the query and rename the block to returnText.

final-workflow
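
If you want the Code block to fail loudly when no text comes back, a guard like the following could wrap the return value. This is a minimal sketch: extractTranscript is a hypothetical helper, the response shape is inferred from convertVoiceToText.data.data, and the sample transcript is a placeholder, not real API output.

```javascript
// Hypothetical guard around the value returned by the returnText block.
// Assumes the block result has the shape { data: { data: "<transcript>" } },
// as implied by `convertVoiceToText.data.data` in the tutorial.
function extractTranscript(blockResult) {
  const text = blockResult?.data?.data;
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("Speech-to-text response did not contain any text");
  }
  return text.trim();
}

// Placeholder result for illustration only.
const sampleResult = { data: { data: "  Hello, this is a test.  " } };
console.log(extractTranscript(sampleResult)); // Hello, this is a test.
```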

6. Test and Run the Workflow

Now that the workflow is complete, you can manually run the workflow by clicking Run on the right of the Workflow Editor toolbar.

Workflows are not triggered automatically by default. After verifying that the workflow runs as expected, toggle Enable. This activates the Start block's trigger, so the workflow runs whenever its webhook is called.

Wrap up

By using ApyHub, you have now created a workflow that takes an audio file URL from your Retool app, converts the voice message to text, and sends the text back.

This is just one of the use cases that we have explored, demonstrating the flexibility and power of integrating ApyHub with Retool.

ApyHub's Catalog offers more than 75 utility services that you can use to enhance the functionality of your Retool app. Have any questions, concerns, or ideas? Share them in the comment section and I would be happy to answer!

This is great. We can improve the workflow by doing a sentiment analysis of the text and assigning a category like "Serious Issue", etc.