
First of all, the numbers: over 120 conference attendees tested our voice recognition software, and our mailing list is nearing 700 people. We chose the noisy Stage Expo hall to test just how well the voice recognition software could understand voices over the background din. And it passed with flying colors! Two choices really boosted the recognition accuracy:
First, use a headset with a noise-cancelling microphone. Noise cancellation at the hardware level really helps so that the software doesn't have to weed out all the background noise. We used a Logitech G35, but there are more affordable options as well.
Second, restrict what the simulator listens for. In early tests, the simulator thought it heard all sorts of words because it was searching through the entire dictionary for matches. But as much as we like to think otherwise, we stage managers have a pretty limited vocabulary while we are calling cues. When we rewrote the voice recognition software to listen only for calling terms, the accuracy climbed to the low 90s. For USITT, we went one step further and combined common phrases into single units of speech: "standby lights 101" was one word as far as the simulator was concerned, and so was "standby lights 102." Now that each 'word' had multiple syllables, the accuracy climbed above 95%, even in a tradeshow hall the size of an airplane hangar.
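For anyone curious what a restricted vocabulary looks like in practice, here is a minimal sketch of the same idea using the open-source Vosk recognizer, which accepts a fixed list of phrases as a grammar. This is not the code Leighton wrote; the phrase list and file names below are purely illustrative, and numbers have to be spelled out the way they are actually spoken.

```python
import json
import wave

from vosk import Model, KaldiRecognizer

# Illustrative cue-calling grammar: each multi-word phrase is one unit of
# speech, so "standby lights one oh one" matches as a whole or not at all.
CALLING_PHRASES = [
    "standby lights one oh one",
    "standby lights one oh two",
    "lights one oh one go",
    "lights one oh two go",
    "[unk]",  # catch-all so stray chatter isn't forced onto a cue phrase
]

model = Model("model")  # path to a downloaded Vosk acoustic model
recognizer = KaldiRecognizer(model, 16000, json.dumps(CALLING_PHRASES))

# "headset_test.wav" stands in for any 16 kHz, 16-bit mono recording.
with wave.open("headset_test.wav", "rb") as wf:
    while True:
        data = wf.readframes(4000)
        if not data:
            break
        if recognizer.AcceptWaveform(data):
            print(json.loads(recognizer.Result())["text"])
    print(json.loads(recognizer.FinalResult())["text"])
```

Shrinking the search space this way is the whole trick: instead of weighing every word in the dictionary, the recognizer only has to choose between a handful of long, distinctive phrases, which is why the accuracy held up even on a noisy show floor.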
Our second test, unbeknownst to the conference attendees, was to check the stability of the simulator when it was repeatedly restarted mid-simulation. Our lead programmer was concerned that, if the simulation was interrupted several times, it might need a full program restart. In 120+ tests, we needed to restart the full program fewer than ten times, and the load time for this program is quite short. Leighton was working on a loading screen until we found that the simulator loaded faster than users could read it. Great news all around!