Enterprises often need to communicate effectively with a large base of customers, partners, and stakeholders across several different languages. They need to translate and localize content such as marketing materials, product content assets, operational manuals, and legal documents. Each business unit in the enterprise has different translation workloads and often manages its own translation requirements and vendors. While this distributed approach may give business units translation autonomy and flexibility, it makes it difficult to maintain translation consistency across the enterprise.
Amazon Translate is a neural machine translation service that delivers fast, high-quality, affordable, and customizable language translation. Today, Amazon Translate supports scalable language translation for over 5,500 language pairs in batch and real time. You can use it to build solutions that help enterprises with multiple business units accelerate multilingual workflows while still supporting customization.
For example, the BMW Group needed a unified translation solution to help their business units, such as Sales and Manufacturing, use translation technology at scale and remove common mistranslation issues across the enterprise. Their solution with Amazon Translate reduces translation time by over 75% while simultaneously giving each business unit the ability to customize the output to address their specific translation requirements.
In this blog post, we demonstrate how to build a unified translation solution with customization features using Amazon Translate and other AWS services. We also show you how to deploy and test the solution, and how to serve users with a customizable and scalable translation solution that matches their department’s localization needs.
The solution uses Amazon Translate’s native features such as real-time translation, automatic source language detection, and custom terminology. Using Amazon API Gateway, these features are exposed as one simple /translate API. Custom terminology allows you to define specific custom translation pairs. In order for custom terminology to work, you need to upload a terminology file to Amazon Translate. Therefore, another API, /customterm, is exposed.
The solution illustrates two options for translation: a standard translation and a customized translation (using the custom terminology feature). However, you can modify these options as needed to suit your business requirements. Consumers access these options using API Gateway API keys. When the API receives a translation request, an AWS Lambda authorizer function validates whether the provided API key is authorized to perform the type of translation requested. We use an Amazon DynamoDB table to store metadata information about consumers, permissions, and API keys.
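As a rough illustration of this authorization pattern, the following is a minimal sketch of what such a Lambda authorizer could look like. The table name and attribute names (apiKey, allowedActions, consumerId) are assumptions for illustration; the deployed solution may use a different schema.

```python
import os
import boto3

dynamodb = boto3.resource("dynamodb")
# Hypothetical table name; the stack in this post creates its own table.
table = dynamodb.Table(os.environ.get("METADATA_TABLE", "EnterpriseTranslateTable"))

def lambda_handler(event, context):
    """REQUEST-type Lambda authorizer: allow or deny based on the caller's API key."""
    api_key = event["headers"].get("x-api-key", "")
    requested_action = "customterm" if "/customterm" in event["path"] else "translate"

    # Hypothetical attribute names, used for illustration only.
    item = table.get_item(Key={"apiKey": api_key}).get("Item", {})
    allowed = requested_action in item.get("allowedActions", [])

    return {
        "principalId": item.get("consumerId", "anonymous"),
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": "Allow" if allowed else "Deny",
                "Resource": event["methodArn"],
            }],
        },
    }
```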
This solution caters to three persona types: a standard translation persona, a customized translation persona, and an admin persona.
The following diagram illustrates the centralized translation solution with customization architecture.
For the user translation persona, the process includes the following actions (the blue path in the preceding diagram):
1a. Call the /translate API and pass the API key in the API header. Optionally, for the customized translation persona, the user can enable custom translation by passing in an optional query string parameter (useCustomTerm).
2. API Gateway validates the API key.
3. The Lambda custom authorizer is called to validate that the supplied API key is allowed to perform the requested action. For instance, a standard translation persona can’t request a custom translation, and an admin persona can’t perform any text translation.
4. The Lambda authorizer gets the user information from the DynamoDB table and verifies it against the API key provided.
5a. After validation, another Lambda function (Translate) is invoked to call the Amazon Translate API translate_text (a sketch of this call follows these steps).
6a. The translated text is returned in the API response.
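To make step 5a concrete, here is a minimal sketch of a Lambda handler that calls translate_text via boto3. The request field names (text, targetLanguage, sourceLanguage, useCustomTerm) and the terminology name are assumptions for illustration; the deployed solution may name these differently.

```python
import json
import boto3

translate = boto3.client("translate")

def lambda_handler(event, context):
    """Translate the supplied text, optionally applying a custom terminology."""
    body = json.loads(event.get("body") or "{}")

    # Hypothetical request fields; check the GitHub repo for the exact schema.
    text = body["text"]
    target_lang = body["targetLanguage"]
    source_lang = body.get("sourceLanguage", "auto")  # auto-detect if not provided
    use_custom_term = (event.get("queryStringParameters") or {}).get("useCustomTerm") == "1"

    kwargs = {
        "Text": text,
        "SourceLanguageCode": source_lang,
        "TargetLanguageCode": target_lang,
    }
    if use_custom_term:
        # Hypothetical terminology name; the Upload function generates the real one.
        kwargs["TerminologyNames"] = ["customerA_customterm_1"]

    result = translate.translate_text(**kwargs)
    return {
        "statusCode": 200,
        "body": json.dumps({"translatedText": result["TranslatedText"]}),
    }
```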
The admin persona can upload a custom terminology file that can be used by the customized translation persona by calling the /customterm API. The workflow steps are as follows (the green path in the preceding diagram):
1b. Call the /customterm API and pass the API key in the API header.
2. API Gateway validates the API key.
3. The Lambda custom authorizer is called to validate that the supplied API key is allowed to perform the requested action. For instance, only an admin persona can upload custom terminology files.
4. The Lambda authorizer gets the user information from the DynamoDB table and verifies it against the API key provided.
5b. After the API key is validated, another Lambda function (Upload) is invoked to call the Amazon Translate API import_terminology (a sketch of this call follows these steps).
6b. The custom terminology file is uploaded to Amazon Translate with a unique name generated by the Lambda function.
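As a rough illustration of steps 5b and 6b, here is a minimal sketch of an upload handler built on import_terminology. The naming scheme and the assumption that the request body carries the CSV file content are illustrative, not the exact implementation in the repo.

```python
import base64
import uuid
import boto3

translate = boto3.client("translate")

def lambda_handler(event, context):
    """Upload a custom terminology file to Amazon Translate under a generated name."""
    # Assumes the CSV terminology file is sent as the (possibly base64-encoded) request body.
    body = event.get("body") or ""
    file_bytes = base64.b64decode(body) if event.get("isBase64Encoded") else body.encode("utf-8")

    # Hypothetical naming scheme; the real Lambda function generates its own unique name.
    terminology_name = f"customerA_customterm_{uuid.uuid4().hex[:8]}"

    translate.import_terminology(
        Name=terminology_name,
        MergeStrategy="OVERWRITE",
        TerminologyData={"File": file_bytes, "Format": "CSV"},
    )
    return {"statusCode": 200, "body": terminology_name}
```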
In the following sections, we walk through the steps to deploy and test the solution.
To deploy the solution, you need an AWS account. If you don’t already have an AWS account, you can create one. Your access to the AWS account must have AWS Identity and Access Management (IAM) permissions to launch AWS CloudFormation templates that create IAM roles.
Note that you are responsible for the cost of the AWS services used while running this sample deployment. Many of these services (such as Amazon Translate, API Gateway, and Lambda) come with a Free Tier to get you started. For full details, see the pricing pages for each AWS service that you use in this post.
Launch the provided CloudFormation template to deploy the solution in your AWS account. This stack only works in the us-east-1 or eu-west-1 Regions. If you want to deploy this solution in other Regions, refer to the GitHub repo and deploy the CloudFormation template in your Region of choice.
Region | CloudFormation Stack
N. Virginia (us-east-1) | Launch stack
Ireland (eu-west-1) | Launch stack
When launching the stack, provide the following parameters:
1. For the stack name, enter a name (for example, EnterpriseTranslate).
2. For the DynamoDB table name, enter a name (for example, EnterpriseTranslateTable).
3. For the API Gateway name, enter a name (for example, EnterpriseTranslateAPI).
You can monitor the stack creation progress on the Events tab. The stack is complete when the stack status shows as CREATE_COMPLETE.
The deployment creates the following resources (all prefixed with EntTranslate): an API Gateway API with two resources, /customterm and /translate, and three API keys to represent two translation personas and an admin persona, along with the Lambda functions and the DynamoDB table described earlier.

After the resources are deployed into your account on the AWS Cloud, you can test the solution.
Complete the following steps to collect the API keys:
1. On the Outputs tab of the CloudFormation stack, note the value of apiGatewayInvokeURL. To find the API keys created by the solution, look in the DynamoDB table you just created or navigate to the API keys page on the API Gateway console. This post uses the latter approach.
2. On the Resources tab of the stack, find EntTranslateApi for API Gateway and open the link under the Physical ID column in a new tab.
3. On the API Gateway console, choose EntTranslateCus1StandardTierKey and choose the Show link against the API key property.

Now you can test the APIs using any open-source tools of your choosing. For this post, we use the Postman API testing tool for illustration purposes only. For details on testing APIs with Postman, refer to API development overview.
To test the standard translation API, you first create a POST request in Postman:
1. Enter the apiGatewayInvokeURL value and append /translate to the URL endpoint.
2. Pass the standard translation API key in the request header x-api-key.
3. In the request body, provide the text to translate and the target language, then send the request (a sample request is sketched after these steps). sourceLanguage is an optional parameter; if you don’t provide it, the system sets it to auto for automatic detection of the source language.

The API should run successfully and return the translated text in the Body section of the response object.
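For reference, here is a hedged sketch of what such a request could look like using the Python requests library. The endpoint and header come from the steps above; the JSON field names (text, targetLanguage, sourceLanguage) are assumptions, so check the GitHub repo for the exact request schema.

```python
import requests

# Values collected in the previous steps (placeholders shown here).
api_url = "https://<api-id>.execute-api.us-east-1.amazonaws.com/<stage>"
api_key = "<standard-tier-api-key>"

# Hypothetical body field names; refer to the GitHub repo for the exact schema.
payload = {
    "text": "Amazon Translate makes enterprise translation easy.",
    "targetLanguage": "fr",
    # "sourceLanguage" is optional; omitting it defaults to automatic detection ("auto").
}

response = requests.post(f"{api_url}/translate", headers={"x-api-key": api_key}, json=payload)
print(response.status_code, response.json())
```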
To test the custom term upload functionality, we first create a PUT request in Postman:
1. Enter the apiGatewayInvokeURL value and append /customterm to the end of the URL.
2. Pass the admin API key in the request header x-api-key.
3. In the request body, upload the sample custom terminology file from the /Resources folder in the GitHub repo. The file is imported into Amazon Translate under a unique name generated by the Lambda function (ending in _customterm_1).

To test a customized translation, create another POST request:
1. Enter the apiGatewayInvokeURL value and append /translate to the URL endpoint.
2. Pass the customized translation API key in the request header x-api-key.
3. Add a query string parameter useCustomTerm with a value of 1, then send the request.

You will also notice that this time the translated text keeps the word “translate” without translating it (if you used the sample file provided). This is because the previously uploaded custom terminology file contains the word “translate,” which shows that the custom terminology modified the base output from Amazon Translate.
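To make this behavior concrete, a custom terminology file for Amazon Translate in CSV format lists the source language code followed by the target language codes on the first line, with one term pair per subsequent row. A minimal example (not the actual file from the repo) that forces the word “translate” to stay unchanged in French output might look like this:

```
en,fr
translate,translate
Amazon Translate,Amazon Translate
```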
This solution deployed one consumer (customerA) with three different API keys as part of the CloudFormation stack deployment. You can add additional consumers by creating a new usage plan in API Gateway and associating new API keys with this usage plan. For more details on how to create usage plans and API keys, refer to Creating and using usage plans with API keys. You can then add these API keys as additional entries in the DynamoDB table, as sketched below.
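The following is a rough sketch, using boto3, of how you might onboard an additional consumer: create an API key, attach it to a usage plan, and record it in the metadata table. The API stage, table name, and item attributes are assumptions for illustration.

```python
import boto3

apigw = boto3.client("apigateway")
dynamodb = boto3.resource("dynamodb")

# 1. Create a new API key for the consumer.
key = apigw.create_api_key(name="customerB-standard", enabled=True)

# 2. Create a usage plan bound to the deployed API stage (placeholder IDs shown).
plan = apigw.create_usage_plan(
    name="customerB-plan",
    apiStages=[{"apiId": "<rest-api-id>", "stage": "<stage-name>"}],
)

# 3. Associate the API key with the usage plan.
apigw.create_usage_plan_key(usagePlanId=plan["id"], keyId=key["id"], keyType="API_KEY")

# 4. Record the consumer and its permissions in the DynamoDB metadata table.
#    Table and attribute names are hypothetical; match them to the deployed table.
table = dynamodb.Table("EnterpriseTranslateTable")
table.put_item(Item={
    "apiKey": key["value"],
    "consumerId": "customerB",
    "allowedActions": ["translate"],
})
```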
To avoid incurring future charges, clean up the resources you created by deleting the CloudFormation stack (on the AWS CloudFormation console, select the stack and choose Delete).
Your stack might take some time to be deleted. You can track its progress on the Events tab. When the deletion is complete, the stack status changes from DELETE_IN_PROGRESS to DELETE_COMPLETE. It then disappears from the list.
Consider the following when using this solution:
In this post, we demonstrated how to perform real-time translation, upload custom terminology files, and run customized translations in Amazon Translate using its native APIs, and how to build a solution that supports customization with API Gateway.
You can extend the solution with customizations that are relevant to your business requirements. For instance, you can provide additional functionality such as Active Custom Translation using parallel data via another API key, or create a caching layer to work with this solution to further reduce the cost of translations and serve frequently accessed translations from a cache. You can enable API throttling and rate limiting by taking advantage of API Gateway features. The possibilities are endless, and we would love to hear how you take this solution to the next level for your organization by submitting an AWS Contact Us request. You can start customizing this solution by going to the GitHub repo for this blog.
For more information about Amazon Translate, visit Amazon Translate resources to find video resources and blog posts, and also refer to Amazon Translate FAQs. If you’re new to Amazon Translate, try it out using the Free Tier, which offers up to 2 million characters per month for free for the first 12 months, starting from your first translation request.