When building voice-enabled chatbots with Amazon Lex, one of the biggest challenges is accurately capturing user speech input for slot values. For example, when a user needs to provide their account number or confirmation code, speech recognition accuracy becomes crucial. This is where transcription confidence scores come in to help ensure reliable slot filling.
Transcription confidence scores indicate how confident Amazon Lex is in converting speech to text for slot values. These scores range from 0.0 to 1.0 and are separate from intent and entity recognition scores. For each spoken slot value, Amazon Lex provides a confidence score that you can use to decide how to handle the captured input: accept the value when confidence is high, ask the caller to confirm it when confidence is moderate, or re-prompt when confidence is low.
The post includes an AWS CloudFormation template that demonstrates these patterns, which you can deploy in your AWS account.
Using confidence scores strategically helps create more robust slot filling experiences. By implementing these patterns, you can significantly improve the accuracy and reliability of slot filling in your Lex voice bots while maintaining a natural conversation flow.
The following diagram illustrates the overall architecture for the solution.
We use an Amazon Lex bot integrated with an Amazon Connect contact flow to deliver the conversational experience.
Amazon Lex asks the caller “In a few words, what is the reason for your call today?” and reviews the transcription confidence score of the captured slot values to influence the confirmation routine.
The following are different customer experiences a caller may have depending on what they say and the transcription confidence scores returned by Amazon Lex:
Conversation 1: a low-confidence capture triggers a retry.
Virtual Agent:
Thank you for calling Acme Travel.
Virtual Agent:
…In a few words, what is the reason for your call today?
Customer:
I’d like to check my booking.
Virtual Agent:
That’s great, please say your 5 character booking reference, you will find it at the top of the information pack we sent.
Customer:
1 9 A K B (Recognized with low transcription confidence: 0.4)
Virtual Agent:
Ok, let’s try again. What is your booking reference?
Customer:
1 9 A A B (more clearly pronounced; recognized with a transcription confidence of 0.8)
Virtual Agent:
Your booking 1 9 A A B is currently being progressed.
Conversation 2: the booking reference is captured in the caller’s first utterance and processed without confirmation.
Virtual Agent:
Thank you for calling Acme Travel.
Virtual Agent:
…In a few words, what is the reason for your call today?
Customer:
I’d like to check my booking 1 9 A Z Y.
Virtual Agent:
Your booking 1 9 A Z Y is currently being progressed.
Conversation 3: a medium-confidence capture triggers a confirmation.
Virtual Agent:
Thank you for calling Acme Travel.
Virtual Agent:
…In a few words, what is the reason for your call today?
Customer:
I’d like to check my booking.
Virtual Agent:
That’s great, please say your 5 character booking reference, you will find it at the top of the information pack we sent.
Customer:
1 9 A Z Y (recognized with a transcription confidence of 0.6)
Virtual Agent:
Did you say 1 9 A Z Y? Please say yes or no.
Customer:
Yes
Virtual Agent:
Your booking 1 9 A Z Y is currently being progressed.
In the example conversations, the IVR requests the booking reference from the customer. Once the slot value is captured, the transcription confidence score is evaluated using conditional branching in Amazon Lex based on speech confidence scores. These conditions check the value against specific thresholds. If the transcription confidence score exceeds the high threshold (for example, greater than 0.7), the conversation progresses to the next state. If the score falls in the medium confidence range (for example, between 0.4–0.7), the user is asked to confirm the interpreted input. Finally, if the score falls below a minimum threshold (for example, lower than 0.4), the user is prompted to retry and provide the information again. This approach optimizes the conversation flow based on the quality of the captured input and prevents erroneous or redundant slot capturing, leading to an improved user experience and increased self-service containment rates.
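The solution in this post configures this branching directly in the Visual conversation builder, with no code required. For illustration only, the following is a minimal Python sketch of how the same thresholds could be applied in an Amazon Lex V2 Lambda dialog code hook, assuming the Lambda input event carries a transcriptions list with a transcriptionConfidence value for audio input; the 0.4 and 0.7 thresholds and the BookingRef slot name mirror the example above.

```python
HIGH_THRESHOLD = 0.7   # accept without confirmation above this score
LOW_THRESHOLD = 0.4    # re-prompt below this score


def lambda_handler(event, context):
    """Dialog code hook sketch: branch on the audio transcription confidence score."""
    session_state = event["sessionState"]
    intent = session_state["intent"]
    slot = intent["slots"].get("BookingRef")

    # "transcriptions" is only populated for voice input; the first entry is the
    # top candidate and carries its transcription confidence score.
    transcriptions = event.get("transcriptions", [])
    confidence = transcriptions[0].get("transcriptionConfidence") if transcriptions else None

    if slot is None or confidence is None or confidence >= HIGH_THRESHOLD:
        # Text input, no slot captured yet, or high confidence: let the bot's
        # configured flow continue unchanged.
        dialog_action = {"type": "Delegate"}
        messages = []
    elif confidence >= LOW_THRESHOLD:
        # Medium confidence: read the captured value back and ask for a yes/no.
        booking_ref = slot["value"]["interpretedValue"]
        dialog_action = {"type": "ConfirmIntent"}
        messages = [{
            "contentType": "PlainText",
            "content": f"Did you say {' '.join(booking_ref)}? Please say yes or no.",
        }]
    else:
        # Low confidence: discard the capture and ask the caller to repeat it.
        intent["slots"]["BookingRef"] = None
        dialog_action = {"type": "ElicitSlot", "slotToElicit": "BookingRef"}
        messages = [{
            "contentType": "PlainText",
            "content": "Ok, let's try again. What is your booking reference?",
        }]

    return {
        "sessionState": {**session_state, "dialogAction": dialog_action, "intent": intent},
        "messages": messages,
    }
```

In this sketch, Delegate hands control back to the bot’s configured flow, so the Lambda function only intervenes when confidence is medium or low.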
You need to have an AWS account and an AWS Identity and Access Management (IAM) role and user with permissions to create and manage the necessary resources and components for this application. If you don’t have an AWS account, see How do I create and activate a new Amazon Web Services account?
Additionally, you need an Amazon Connect instance—you use the instance Amazon Resource Name (ARN) in a later step.
To create the sample bot and configure the runtime phrase hints, perform the following steps. For this example, we create an Amazon Lex bot called disambiguation-bot, one intent (CheckBooking), and one slot type (BookingRef). The sample resources for this post include contact-center-transcription-confidence-scores and a sample flow (lex-check-booking-sample-flow). After you create your intent (CheckBooking), you can use the Visual conversation builder to configure your transcription confidence score logic.
The following figure is an example of how we add logic to the intent. Highlighted in red is the branch condition where we use the transcription confidence score to dynamically change the customer experience and improve accuracy.
If you choose the node, you’re presented with the following configuration options, where you can configure the branch condition.
To test the solution, we examine a conversation with words that might not be clearly understood.
Amazon Connect will ask “Thank you for calling Acme Travel. In a few words, what is the reason for your call today?”
The test checks the confidence score of the captured booking reference: the bot either says “Your booking 1 9 A Z Y is currently being progressed” or asks you to confirm “1 9 A Z Y”.
Audio transcription confidence scores are available only in the English (GB) (en_GB) and English (US) (en_US) languages. Confidence scores are supported only for 8 kHz audio input. Transcription confidence scores aren’t provided for audio input from the test window on the Amazon Lex V2 console because it uses 16 kHz audio input.
To remove the infrastructure created by the CloudFormation template, open the AWS CloudFormation console and delete the stack. This will remove the services and configuration installed as part of this deployment process.
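If you prefer to script the cleanup instead of using the console, a minimal boto3 sketch like the following does the same thing; the stack name shown is a placeholder for whatever name you chose when you deployed the template.

```python
import boto3

# Placeholder stack name: replace with the name you used when deploying the template.
STACK_NAME = "contact-center-transcription-confidence-scores"

cloudformation = boto3.client("cloudformation")
cloudformation.delete_stack(StackName=STACK_NAME)

# Optionally block until the deletion finishes.
waiter = cloudformation.get_waiter("stack_delete_complete")
waiter.wait(StackName=STACK_NAME)
```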
Optimizing the user experience is at the forefront of any Amazon Lex conversational designer’s priority list, and so is capturing information accurately. This new feature empowers designers to have choices around confirmation routines that drive a more natural dialog between the customer and the bot. Although confirming each input can slow down the user experience and cause frustration, failing to confirm when transcription confidence is low can risk accuracy. These improvements enable you to create a more natural and performant experience.
For more information about how to build effective conversations on Amazon Lex with intent confidence scores, see Build more effective conversations on Amazon Lex with confidence scores and increased accuracy.