A critical component of business success is the ability to connect with customers. Businesses today want to connect with their customers by offering their content across multiple languages in real time. For most customers, the content creation process is disconnected from the localization effort of translating content into multiple target languages. These disconnected processes delay a business's ability to publish content in multiple languages simultaneously, which limits outreach and hurts time to market and revenue.
Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation. Now, Amazon Translate offers real-time document translation to seamlessly integrate and accelerate content creation and localization. You can submit a document from the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDK and receive the translated document in real time while maintaining the format of the original document. This feature eliminates the wait for documents to be translated in asynchronous batch mode.
Real-time document translation currently supports plain text and HTML documents. You can use other Amazon Translate features such as custom terminology, profanity masking, and formality as part of the real-time document translation.
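As a rough illustration of how these features might be combined in a single request, the following sketch assembles the keyword arguments for the Boto3 translate_document call. The helper names build_document_request and translate_with_options, and the terminology name my-terms used in the usage note, are hypothetical and not from this post. Formality availability depends on the target language, so treat the values as assumptions to verify for your language pair.

```python
from typing import Any, Dict, List, Optional


def build_document_request(
    content: bytes,
    content_type: str,
    source_lang: str,
    target_lang: str,
    terminology_names: Optional[List[str]] = None,
    formality: Optional[str] = None,
    mask_profanity: bool = False,
) -> Dict[str, Any]:
    """Assemble keyword arguments for the Translate client's translate_document call."""
    request: Dict[str, Any] = {
        "Document": {"Content": content, "ContentType": content_type},
        "SourceLanguageCode": source_lang,
        "TargetLanguageCode": target_lang,
    }
    # Custom terminology is referenced by the name it was imported under.
    if terminology_names:
        request["TerminologyNames"] = terminology_names
    settings: Dict[str, str] = {}
    if formality:
        settings["Formality"] = formality  # "FORMAL" or "INFORMAL"
    if mask_profanity:
        settings["Profanity"] = "MASK"
    # Only send Settings when at least one option was requested.
    if settings:
        request["Settings"] = settings
    return request


def translate_with_options(**kwargs: Any) -> bytes:
    """Send the request; requires boto3 and AWS credentials (hypothetical wrapper)."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("translate")
    response = client.translate_document(**build_document_request(**kwargs))
    return response["TranslatedDocument"]["Content"]
```

For example, translate_with_options(content=data, content_type="text/html", source_lang="en", target_lang="fr", terminology_names=["my-terms"], formality="FORMAL", mask_profanity=True) would request a formal French translation with profanity masked, assuming a terminology named my-terms already exists in your account.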
In this post, we will show you how to use this new feature.
This post walks you through the steps required to use real-time document translation with the console, the AWS CLI, and the Amazon Translate SDK. As an example, we translate a sample text file from English to French.
Follow these steps to try out real-time document translation on the console:
Note: Either the source or the target language must be English for real-time document translation.
Text and HTML formats are supported at the time of this writing.
For more information about Amazon Translate features, refer to the following resources:
The translated file is automatically saved to your browser's download folder, usually Downloads. The target language code is prefixed to the translated file's name. For example, if your source file name is lang.txt and your target language is French (fr), the translated file is named fr.lang.txt.
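If you script the same workflow, the constraints noted in the console steps (English as source or target, text or HTML input) and the console's file naming convention can be mirrored with a small pre-flight check. The following is a minimal sketch. The function names and the mapping from file extension to content type are assumptions made for illustration, not part of the service.

```python
from pathlib import Path

# Assumed mapping from file extension to the two content types named in this post.
SUPPORTED_CONTENT_TYPES = {
    ".txt": "text/plain",
    ".html": "text/html",
    ".htm": "text/html",
}


def check_request(source_lang: str, target_lang: str, file_name: str) -> str:
    """Validate the language pair and file type; return the content type to send."""
    if "en" not in (source_lang, target_lang):
        raise ValueError("Either the source or the target language must be English (en).")
    suffix = Path(file_name).suffix.lower()
    if suffix not in SUPPORTED_CONTENT_TYPES:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return SUPPORTED_CONTENT_TYPES[suffix]


def translated_file_name(file_name: str, target_lang: str) -> str:
    """Mirror the console convention of prefixing the target language code."""
    return f"{target_lang}.{Path(file_name).name}"


print(check_request("en", "fr", "lang.txt"))   # text/plain
print(translated_file_name("lang.txt", "fr"))  # fr.lang.txt
```

Failing early on an unsupported language pair or file type avoids a round trip to the service for a request that is expected to be rejected.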
You can translate the contents of a file using the following AWS CLI command. In this example, the contents of source-lang.txt will be translated into target-lang.txt.
aws translate translate-document --source-language-code en --target-language-code es \
    --document-content fileb://source-lang.txt \
    --document ContentType=text/plain \
    --query "TranslatedDocument.Content" \
    --output text | base64 --decode > target-lang.txt
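The command pipes its output through base64 because the AWS CLI prints the translated document content as base64-encoded text. If you capture that text in a script instead of piping it, the decode step can be sketched in Python as follows. The function name decode_cli_document is hypothetical, and the sample value is simulated rather than real service output.

```python
import base64


def decode_cli_document(encoded_text: str) -> bytes:
    """Decode the base64 text printed by the CLI into the translated document bytes."""
    return base64.b64decode(encoded_text.strip())


# Simulated CLI output: a base64-encoded UTF-8 string with a trailing newline.
sample_output = base64.b64encode("Bonjour le monde".encode("utf-8")).decode("ascii") + "\n"

print(decode_cli_document(sample_output).decode("utf-8"))  # Bonjour le monde
```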
You can use the following Python code to call the Amazon Translate API through the AWS SDK for Python (Boto3) and translate text or HTML documents synchronously:
import argparse

import boto3

# Parse the three positional arguments
parser = argparse.ArgumentParser()
parser.add_argument("SourceLanguageCode")
parser.add_argument("TargetLanguageCode")
parser.add_argument("SourceFile")
args = parser.parse_args()

translate = boto3.client('translate')

# Read the source document as bytes
localFile = args.SourceFile
with open(localFile, "rb") as file:
    data = file.read()

# Choose the content type from the file extension: HTML, otherwise plain text
contentType = "text/html" if localFile.lower().endswith((".html", ".htm")) else "text/plain"

result = translate.translate_document(
    Document={
        "Content": data,
        "ContentType": contentType
    },
    SourceLanguageCode=args.SourceLanguageCode,
    TargetLanguageCode=args.TargetLanguageCode
)

if "TranslatedDocument" in result:
    fileName = localFile.split("/")[-1]
    tmpfile = f"{args.TargetLanguageCode}-{fileName}"
    # The translated content is returned as bytes, so write it in binary mode
    with open(tmpfile, 'wb') as f:
        f.write(result["TranslatedDocument"]["Content"])
    print("Translated document ", tmpfile)
This program accepts three arguments: source language, target language, and file path. Use the following command to invoke this program:
python syncDocumentTranslation.py en es source-lang.txt
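Because each call returns immediately, the same document can be translated into several target languages in one pass. The following sketch shows one way to do that, assuming English as the source language so that every pair satisfies the language constraint. The function name translate_to_languages is hypothetical. The client is passed in as a parameter so the loop can be exercised with a stand-in object before you point it at a real Boto3 client.

```python
from typing import Any, Dict, Iterable


def translate_to_languages(
    client: Any,
    content: bytes,
    content_type: str,
    source_lang: str,
    target_langs: Iterable[str],
) -> Dict[str, bytes]:
    """Translate one document into each target language, one synchronous call per language."""
    results: Dict[str, bytes] = {}
    for target in target_langs:
        response = client.translate_document(
            Document={"Content": content, "ContentType": content_type},
            SourceLanguageCode=source_lang,
            TargetLanguageCode=target,
        )
        # The translated document content is returned as bytes.
        results[target] = response["TranslatedDocument"]["Content"]
    return results
```

With AWS credentials configured, a call such as translate_to_languages(boto3.client("translate"), data, "text/plain", "en", ["fr", "es"]) would return a dictionary keyed by language code, which you can then write out file by file.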
The real-time document translation feature in Amazon Translate can shorten time to market by letting you integrate translation directly into your content creation and localization workflows, without waiting for asynchronous batch jobs to complete.
For more information about Amazon Translate, visit Amazon Translate resources to find video resources and blog posts, and refer to the Amazon Translate FAQs.