ML 11369 image001
If you operate in a country with multiple official languages or across multiple regions, your audio files can contain different languages. Participants may be speaking entirely different languages or may switch between languages. Consider a customer service call to report a problem in an area with a substantial multi-lingual population. Although the conversation could begin in one language, it’s feasible that the customer might change to another language to describe the problem, depending on comfort level or usage preferences with other languages. In a similar vein, the customer care representative may transition between languages while conveying operating or troubleshooting instructions.
With a minimum of 3 seconds of audio, Amazon Transcribe can automatically identify and efficiently generate transcripts in the languages spoken in the audio without needing humans to specify the languages. This applies to various use cases such as transcribing customer calls, converting voicemails to text, capturing meeting interactions, tracking user forum communications, or monitoring media content production and localization workflows.
This post walks through the steps for transcribing a multi-language audio file using Amazon Transcribe. We discuss how to make audio files available to Amazon Transcribe and enable transcription of multi-lingual audio files when calling Amazon Transcribe APIs.
Amazon Transcribe is an AWS service that makes it easy for you to convert speech to text. Adding speech to text functionality to any application is simple with the help of Amazon Transcribe, an automated speech recognition (ASR) service. You can ingest audio input using Amazon Transcribe, create clear transcripts that are easy to read and review, increase accuracy with customization, and filter information to protect client privacy.
The solution also uses Amazon Simple Storage Service (Amazon S3), an object storage service built to store and retrieve any amount of data from anywhere. It’s a simple storage service that offers industry-leading durability, availability, performance, security, and virtually unlimited scalability at very low cost. When you store data in Amazon S3, you work with resources known as buckets and objects. A bucket is a container for objects. An object is a file and any metadata that describes the file.
In this post, we walk you through the following steps to implement a multi-multilingual audio transcription solution:
For this walkthrough, you should have the following prerequisites:
Amazon Transcribe provide the option to store transcribed output in either a service managed or customer managed S3 bucket. For this post, we have Amazon Transcribe write the results to a service managed S3 bucket.
Note that Amazon Transcribe is a Regional service and the Amazon Transcribe API endpoints being called need to be in the same Region as the S3 buckets.
To create your S3 bucket, complete the following steps:
Upload your multi-lingual audio file to the S3 bucket in your AWS account. For the purpose of this exercise, we use the following sample multi-lingual audio file. It captures a customer support call involving English and Spanish languages.
Your audio file will shortly be available in the S3 bucket.
With the audio file uploaded, we now create a transcription job.
When the transcription job is complete, open the transcription job.
Scroll down to the Transcription preview section. The audio transcription is displayed on the Text tab. The transcription includes both the English and Spanish portions of the conversation.
You can optionally download a copy of the transcript as a JSON file, which you could use for further post-call analytics.
To avoid incurring future charges, empty and delete the S3 bucket you created for storing the input audio source file. Make sure you have the files stored elsewhere because this will permanently remove all objects contained within the bucket. On the Amazon Transcribe console, select and delete the job previously created for the transcription.
In this post, we created an end-to-end workflow to automate identification and transcription of multi-lingual audio files, without writing any code. We used the new functionality in Amazon Transcribe to automatically identify different languages in an audio file and transcribe each language correctly.
For more information, refer to Language identification with batch transcription jobs.
Model: https://huggingface.co/silveroxides/Chroma-GGUF/blob/main/chroma-unlocked-v38-detail-calibrated/chroma-unlocked-v38-detail-calibrated-Q8_0.gguf Workflow: https://huggingface.co/lodestones/Chroma/resolve/main/simple_workflow.json Prompts used: High detail photo showing an abandoned Renaissance painter’s studio…
This post is divided into three parts; they are: • Low-Rank Approximation of Matrices •…
Pandas DataFrames are powerful and versatile data manipulation and analysis tools.
Palantir FedStart and the Path to CMMC ComplianceSecuring the Defense Industrial BaseNever has the imperative…
Time series forecasting helps businesses predict future trends based on historical data patterns, whether it’s…
MIT researchers developed SEAL, a framework that lets language models continuously learn new knowledge and…