Meeting notes are a crucial part of collaboration, yet they often fall through the cracks. Between leading discussions, listening closely, and typing notes, it’s easy for key information to slip away unrecorded. Even when notes are captured, they can be disorganized or illegible, rendering them useless.
In this post, we explore how to use Amazon Transcribe and Amazon Bedrock to automatically generate clean, concise summaries of video or audio recordings. Whether it’s an internal team meeting, conference session, or earnings call, this approach can help you distill hours of content down to salient points.
We walk through a solution to transcribe a project team meeting and summarize the key takeaways with Amazon Bedrock. We also discuss how you can customize this solution for other common scenarios like course lectures, interviews, and sales calls. Read on to simplify and automate your note-taking process.
By combining Amazon Transcribe and Amazon Bedrock, you can save time, capture insights, and enhance collaboration. Amazon Transcribe is an automatic speech recognition (ASR) service that makes it straightforward to add speech-to-text capability to applications. It uses advanced deep learning technologies to accurately transcribe audio into text. Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon with a single API, along with a broad set of capabilities you need to build generative AI applications. With Amazon Bedrock, you can easily experiment with a variety of top FMs, and privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG).
The solution presented in this post is orchestrated using an AWS Step Functions state machine that is triggered when you upload a recording to the designated Amazon Simple Storage Service (Amazon S3) bucket. Step Functions lets you create serverless workflows to orchestrate and connect components across AWS services. It handles the underlying complexity so you can focus on application logic. It’s useful for coordinating tasks, distributed processing, ETL (extract, transform, and load), and business process automation.
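The following is a minimal sketch of how an upload could start the workflow. It assumes an S3 event notification is delivered to an AWS Lambda function, which may not be how the CloudFormation template in the GitHub repo wires the trigger. The function names, the recordings/ prefix filter, and the shape of the execution input are illustrative assumptions.

```python
import json
from urllib.parse import unquote_plus


def recordings_from_s3_event(event, prefix="recordings/"):
    """Return (bucket, key) pairs for uploaded objects under the given prefix."""
    found = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event notifications (spaces arrive as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(prefix):
            found.append((bucket, key))
    return found


def start_summary_workflow(event, state_machine_arn):
    """Start one Step Functions execution per uploaded recording."""
    import boto3  # imported lazily so the helper above has no AWS dependency

    sfn = boto3.client("stepfunctions")
    for bucket, key in recordings_from_s3_event(event):
        sfn.start_execution(
            stateMachineArn=state_machine_arn,
            # Hypothetical input shape; the real state machine may expect other fields.
            input=json.dumps({"bucket": bucket, "key": key}),
        )
```

Filtering on the prefix keeps objects that the workflow itself writes to the bucket, such as transcripts, from starting new executions.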
The following diagram illustrates the high-level solution architecture.
The solution workflow includes the following steps:
This solution is supported in Regions where Anthropic Claude on Amazon Bedrock is available.
The state machine orchestrates the steps to perform the specific tasks. The following diagram illustrates the detailed process.
Amazon Bedrock users need to request access to models before they are available for use. This is a one-time action. For this solution, you’ll need to enable access to the Anthropic Claude (not Anthropic Claude Instant) model in Amazon Bedrock. For more information, refer to Model access.
The solution is deployed using an AWS CloudFormation template, found on the GitHub repo, to automatically provision the necessary resources in your AWS account. The template requires the following parameters:
After you deploy the solution using AWS CloudFormation, complete the following steps:
Note the AssetBucketName value; it will look something like summary-generator-assetbucket-xxxxxxxxxxxxx. This is where you’ll upload your recordings. Valid file formats are MP3, MP4, WAV, FLAC, AMR, OGG, and WebM.
Upload your recording to the recordings folder. Uploading recordings will automatically trigger the Step Functions state machine. For this example, we use a sample team meeting recording in the sample-recording directory of the GitHub repository.
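If you prefer to upload from a script rather than the console, the following sketch shows one way to do it with boto3. The helper names are hypothetical, and the bucket name is the AssetBucketName value from your own stack. The extension check mirrors the valid file formats listed above.

```python
from pathlib import Path

# File formats listed in this post as valid for upload.
VALID_EXTENSIONS = {".mp3", ".mp4", ".wav", ".flac", ".amr", ".ogg", ".webm"}


def recording_key(local_path, prefix="recordings/"):
    """Build the S3 object key for a recording, rejecting unsupported formats."""
    path = Path(local_path)
    if path.suffix.lower() not in VALID_EXTENSIONS:
        raise ValueError(f"Unsupported recording format: {path.suffix or '(none)'}")
    return f"{prefix}{path.name}"


def upload_recording(local_path, bucket_name):
    """Upload a local recording to the recordings folder of the asset bucket."""
    import boto3  # imported lazily so recording_key can be used without AWS access

    key = recording_key(local_path)
    boto3.client("s3").upload_file(str(local_path), bucket_name, key)
    return key
```

For example, upload_recording("sample-recording/standup.mp4", "summary-generator-assetbucket-xxxxxxxxxxxxx") would place the file under the recordings folder; the file name here is a placeholder.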
Here, you can watch the progress of the state machine as it processes the recording.
Alternatively, you can navigate to the S3 assets bucket and view the transcript there in the transcripts folder.
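Amazon Transcribe writes its output as a JSON document, with the full text under results.transcripts. The following sketch extracts that text from a transcript file you have downloaded from the transcripts folder. The function name and the sample document are illustrative only.

```python
import json


def transcript_text(transcribe_output):
    """Return the full transcript text from a parsed Amazon Transcribe output document."""
    transcripts = transcribe_output["results"]["transcripts"]
    return " ".join(t["transcript"] for t in transcripts).strip()


# A trimmed, made-up example of the output structure.
sample = json.loads(
    '{"jobName": "example-job", "results": '
    '{"transcripts": [{"transcript": "Good morning, everyone."}], "items": []}}'
)
print(transcript_text(sample))
```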
You will receive the recording summary by email at the address you provided when you created the CloudFormation stack. If you don’t receive the email in a few moments, make sure that you confirmed the Amazon SNS subscription email that you should have received after you created the stack, and then upload the recording again to trigger the summary process.
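If you are unsure whether the subscription was confirmed, you can check it programmatically. The following sketch assumes you know the ARN of the SNS topic created by the stack; the function names are hypothetical. Amazon SNS reports an unconfirmed subscription with the placeholder SubscriptionArn value PendingConfirmation.

```python
def unconfirmed_endpoints(subscriptions):
    """Return endpoints whose SNS subscription is still awaiting confirmation."""
    return [
        sub["Endpoint"]
        for sub in subscriptions
        if sub["SubscriptionArn"] == "PendingConfirmation"
    ]


def check_topic(topic_arn):
    """List unconfirmed endpoints for a topic (first page of results only)."""
    import boto3  # imported lazily so the helper above has no AWS dependency

    sns = boto3.client("sns")
    response = sns.list_subscriptions_by_topic(TopicArn=topic_arn)
    return unconfirmed_endpoints(response["Subscriptions"])
```

An empty list from check_topic suggests that the subscriptions on the first page are confirmed.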
This solution includes a mock team meeting recording that you can use to test the solution. The summary will look similar to the following example. Because of the nature of generative AI, your output will look a bit different, but the content should be close.
Here are the key points from the standup:
- Joe finished reviewing the current state for task EDU1 and created a new task to develop the future state. That new task is in the backlog to be prioritized. He’s now starting EDU2 but is blocked on resource selection.
- Rob created a tagging strategy for SLG1 based on best practices, but may need to coordinate with other teams who have created their own strategies, to align on a uniform approach. A new task was created to coordinate tagging strategies.
- Rob has made progress debugging for SLG2 but may need additional help. This task will be moved to Sprint 2 to allow time to get extra resources.
Next Steps:
- Joe to continue working on EDU2 as able until resource selection is decided
- New task to be prioritized to coordinate tagging strategies across teams
- SLG2 moved to Sprint 2
- Standups moving to Mondays starting next week
Now that you have a working solution, here are some potential ideas to customize the solution for your specific use cases:
Adjust the max_tokens_to_sample parameter to accommodate different content lengths.
To clean up the solution, delete the CloudFormation stack that you created earlier. Note that deleting the stack will not delete the asset bucket. If you no longer need the recordings or transcripts, you can delete this bucket separately. Amazon Transcribe will automatically delete transcription jobs after 90 days, but you can delete these manually before then.
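To illustrate the max_tokens_to_sample customization mentioned earlier, the following sketch builds a request for the Anthropic Claude text completions format on Amazon Bedrock. The prompt wording, the function names, the default of 1000 tokens, and the anthropic.claude-v2 model ID are assumptions for illustration and are not taken from the solution's code.

```python
import json


def build_claude_request(transcript, max_tokens_to_sample=1000):
    """Build a JSON request body for the Claude text completions API."""
    prompt = (
        "\n\nHuman: Summarize the key points and next steps from this meeting "
        f"transcript.\n\n<transcript>\n{transcript}\n</transcript>"
        "\n\nAssistant:"
    )
    return json.dumps(
        {
            "prompt": prompt,
            # Upper bound on the length of the generated summary, in tokens.
            "max_tokens_to_sample": max_tokens_to_sample,
            "temperature": 0,
        }
    )


def summarize(transcript, model_id="anthropic.claude-v2", max_tokens_to_sample=1000):
    """Invoke the model on Amazon Bedrock and return the generated summary."""
    import boto3  # imported lazily so build_claude_request works without AWS access

    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=model_id,
        body=build_claude_request(transcript, max_tokens_to_sample),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())["completion"]
```

A longer recording, such as a course lecture, may need a larger max_tokens_to_sample value so that the summary is not cut off, while a short standup can use a smaller one.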
In this post, we explored how to use Amazon Transcribe and Amazon Bedrock to automatically generate clean, concise summaries of video or audio recordings. We encourage you to continue evaluating Amazon Bedrock, Amazon Transcribe, and other AWS AI services, like Amazon Textract, Amazon Translate, and Amazon Rekognition, to see how they can help meet your business objectives.