
This is not a new capability. Since version 4.2 (maybe earlier), FusionPBX has been able to transcribe voicemail: when a caller leaves a voicemail message, FusionPBX can convert it to text as well. This is really handy, since not everybody has the patience to listen to the audio, and sometimes reading is faster. The service was made possible by the Bing API from Microsoft.

Since 2018 (I am not sure exactly when), Microsoft has deprecated the Bing API in favour of Azure. Legacy users may still use it for a while (we don't know yet when it will be dropped completely), but new users who want the transcription service won't be able to get it the Bing way.

Here I will explain how I figured out how to make it work with Azure.

Sign up for Azure Account

You will need an Azure account. Don't worry, it is free. Azure will give you a 260-dollar credit if you want to try the full potential of the platform. However, there is also a free plan for the speech service. If I recall correctly, the free plan only allows one transcription at a time (no simultaneous requests). So, let's assume you have signed up and you understand Azure's billing plans.

The next thing you will need to do is to create a resource.

azure creating resource

The important thing here is to remember the location you select. The keys you get are tied to that location: if you use a key created for an Asia region against a North America (East US) server, it won't work. I learned this the hard way.
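
To illustrate why the key and the location travel together, here is a minimal Python sketch (the patch itself is a Lua script, and this is not its code). It assumes the token endpoint pattern and the `Ocp-Apim-Subscription-Key` header that Azure documents for the Speech service; the function names `token_url` and `build_token_request` are hypothetical.

```python
from urllib.request import Request


def token_url(region: str) -> str:
    """Build the token endpoint for one Azure region, e.g. "eastus".

    Assumption: the host pattern below is the one Azure documents for
    issuing Speech service access tokens.
    """
    return f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"


def build_token_request(region: str, key: str) -> Request:
    """Prepare (but do not send) the POST that trades a key for a token.

    The region is part of the host name, so a key created in another
    region is presented to a server that does not know it.
    """
    return Request(
        token_url(region),
        data=b"",  # empty body; the key travels in a header
        method="POST",
        headers={"Ocp-Apim-Subscription-Key": key},
    )


if __name__ == "__main__":
    request = build_token_request("eastus", "YOUR-KEY-1")
    print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose, so the sketch runs without a real key.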

Once you have created the resource, take note of key number one (Key 1). This is the only one my patch uses.

The Patch for Azure Voicemail Transcribing on FusionPBX

I have sent pull requests #4067 and #4068. Let's hope they are accepted. I will update the article when I get an answer regarding this.

UPDATE: The pull requests have been accepted.

I will now explain what the patch does:

  • For legacy Bing users, it adds more fault tolerance. Bing always answers with a JSON payload, but the transcription does not always succeed. So my patch doesn't assume it worked; it inspects the JSON payload to find out whether the transcription was successful.
  • For new Azure users, here is the fun part:
    Use the following default settings:

    Category  | Subcategory         | Type    | Value               | Description
    Voicemail | azure_key1          | text    | (Azure key)         | The Azure key from the resource.
    Voicemail | azure_server_region | text    | (Azure region host) | The Azure region host. If the region is East US, the value should be "eastus". Check the full list of Azure regions.
    Voicemail | transcribe_enabled  | boolean | true                | Enable it in the system. You will need to enable it in the voicemail settings as well.
    Voicemail | transcribe_language | text    | en-US               | The language you will use. Although FusionPBX supports default settings per domain, the Lua script doesn't check the domain (I need to read that part of the code further). So for now, assume it is system-wide.
    Voicemail | transcribe_provider | text    | azure               | Set this value to use my patch.

    With these settings, my patch requests the transcription and digs into the JSON payload to extract the text.

    My patch also adds Memcached support. I read somewhere in the Azure documentation that the access token you negotiate with the key is valid for ten minutes. Therefore, we can save time and resources by caching it. I have hardcoded the cache lifetime to seven minutes. If you are running FusionPBX as a personal PBX you may not find this useful, but if you are in the VoIP business with a PBX that handles more than 1000 calls a day with voicemail transcription, you will find it really handy.
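
The two ideas above, checking the JSON payload instead of trusting it and reusing the access token for seven minutes, can be sketched in Python as follows. This is not the patch's Lua code. It assumes the `RecognitionStatus` and `DisplayText` fields of Azure's simple response format, and it uses a small in-process cache as a stand-in for Memcached; `extract_transcription` and `TokenCache` are hypothetical names.

```python
import json
import time
from typing import Callable, Optional


def extract_transcription(body: str) -> Optional[str]:
    """Return the transcribed text, or None if the payload is not a success.

    Assumption: the payload follows Azure's simple format, with a
    "RecognitionStatus" field and the text in "DisplayText".
    """
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return None  # not JSON at all
    if not isinstance(payload, dict):
        return None
    if payload.get("RecognitionStatus") != "Success":
        return None  # JSON arrived, but the transcription failed
    text = payload.get("DisplayText")
    return text if isinstance(text, str) and text else None


class TokenCache:
    """Keep one access token and refresh it after ttl_seconds.

    In-memory stand-in for Memcached. The default of seven minutes
    stays below the ten-minute validity mentioned above.
    """

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float = 7 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            # Only here do we pay for a round trip to the token endpoint.
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token


if __name__ == "__main__":
    cache = TokenCache(fetch=lambda: "fake-token")  # replace with a real request
    print(cache.get())
    print(extract_transcription('{"RecognitionStatus": "Success", "DisplayText": "Call me back."}'))
```

On a busy PBX the saving comes from `get()`: every voicemail inside the seven-minute window reuses the stored token instead of negotiating a new one.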

You are ready to go.

Good luck!
