Today, more and more IVR speech recognition applications begin with an open-ended prompt supported by a large statistical language model (SLM) grammar. The prompt invites callers to speak a short phrase describing what they want. For example, “Thank you for calling State Bank. How can I help you? You can say things like, ‘What’s my balance?’ or ‘Where can I find an ATM?’ Now, tell me what you’re calling about.” At West, we find that responses fall into four general categories:
The Perplexed Caller
These callers aren’t sure what to do. If they realize they are talking to a machine, they think that the prompt was a menu of options. In that case, a caller might say, “None of those.” Or, they might respond with something general like, “I’ve got a question,” if they get out anything at all. The best thing to do for these folks is to drop them into a more structured environment where they are given specific options to choose from.
The Detailed Caller
These callers may or may not realize they are speaking to a machine as they launch into a detailed description of their situation and needs. The SLM may be able to pick out a few key words that are sufficient to route the call correctly, but given so much material it is just as likely to be confused, not unlike a call center operator on their first day on the job.
If the SLM does come up with something, it’s best to confirm it before proceeding. If the SLM is not sure, ask again but emphasize the need for a simple, short response.
The Experienced Caller
These callers have used the system, or one like it, and know how to get through it efficiently. Their responses are often short and specific, consisting of just key words and leaving out any words that are not semantically necessary. This is sometimes called telegraphic speech, because it resembles the way a telegram would have been written years ago. Perhaps they’ve learned that the recognizer works better with this type of input than with a grammatically complete response.
When the bulk of the caller audience responds this way, the SLM should evolve to support them while still retaining the ability to handle full-sentence responses from new callers and the more loquacious.
Finally, we have the callers who, for a variety of reasons, choose not to give the recognizer a chance and immediately request an agent or press the “0” key. It could be that the caller is familiar with the application and knows that it cannot handle the issue they have today. More often, these callers have an aversion to automated phone systems in general.
Winning over these folks is virtually impossible, but it is best to keep them in automation just a little longer, if for no other reason than to find out what they are calling about so they are routed to the correct department. A caller who waits on hold for an agent, only to be told they need to be transferred to another queue, will come away with an even stronger aversion to automated phone systems.
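For readers who build these applications, the handling described for the four response types can be sketched as a small decision function. This is only an illustration in Python, not how any particular platform works: the `Recognition` fields, the intent labels, the word-count cutoff, and the confidence threshold are all assumptions made up for the example, and real values would come from tuning against live call data.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    ROUTE = "route"                    # send the call to the matched destination
    CONFIRM = "confirm"                # read the match back before proceeding
    REPROMPT_SHORT = "reprompt_short"  # ask again, stressing a short, simple answer
    OFFER_MENU = "offer_menu"          # fall back to a structured list of options
    ASK_REASON = "ask_reason"          # agent requested: ask what the call is about


@dataclass
class Recognition:
    """Hypothetical recognizer result; the field names are illustrative."""
    text: str                  # recognized words ("" if nothing was heard)
    intent: Optional[str]      # destination picked by the SLM, if any
    confidence: float          # 0.0 to 1.0
    pressed_zero: bool = False


# Assumed values for illustration only.
CONFIDENCE_THRESHOLD = 0.5
LONG_UTTERANCE_WORDS = 12
VAGUE_INTENTS = {"none_of_those", "general_question"}
AGENT_INTENT = "agent"


def next_action(result: Recognition) -> Action:
    # Callers who opt out immediately: keep them just long enough
    # to learn the call reason so they reach the right department.
    if result.pressed_zero or result.intent == AGENT_INTENT:
        return Action.ASK_REASON

    # Perplexed callers: silence or a vague reply gets specific options.
    if not result.text.strip() or result.intent in VAGUE_INTENTS:
        return Action.OFFER_MENU

    matched = (result.intent is not None
               and result.confidence >= CONFIDENCE_THRESHOLD)

    # Detailed callers: confirm whatever the SLM found, otherwise
    # ask again and emphasize a short response.
    if len(result.text.split()) >= LONG_UTTERANCE_WORDS:
        return Action.CONFIRM if matched else Action.REPROMPT_SHORT

    # Experienced callers: a short, confident match is routed directly.
    # Sending a short non-match to the menu is an assumption of this sketch.
    return Action.ROUTE if matched else Action.OFFER_MENU
```

With these assumed settings, a two-word reply such as "account balance" that matches confidently is routed right away, while a long story that still yields a match is confirmed first.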
The better a speech application accommodates different types of responses, the faster customer needs can be taken care of. That means more satisfied and loyal customers, which is what every company wants.