At Google Cloud Next, we announced Jump Start Solutions. Each Jump Start Solution is a simple-to-deploy architecture and application that makes it faster to get started with Google Cloud. These pre-built solutions also include interactive tutorials and a guide to teach you all about the products used and help you learn how to modify the solution for your use case.
Let’s look deeper at one of the Jump Start Solutions: Generative AI Document Summarization. Yvonne Li from the engineering team answered questions and shared her insights into the factors that influenced the design, the challenges they faced, and what she recommends for learners who want to modify this solution.
[Interviewer] What types of use cases and problems can the Jump Start Solution address?
[Yvonne] Here is a classic scenario that this Jump Start Solution is designed to help with.
Many large enterprises have countless documents stored as PDFs. Whenever employees need to locate data, they must visually scan through files. This process can be frustrating and time-consuming for employees and costly for the company.
Generative AI Document Summarization uses the generative AI large language models (LLMs) available in Vertex AI to process and summarize documents on demand.
[Interviewer] What should folks expect to learn or be able to do once they’ve deployed this JSS?
[Yvonne] By deploying the Generative AI Document Summarization solution, you will be able to:
[Interviewer] Why did you pick the architecture, frameworks, and languages you did?
[Yvonne] We chose the Vertex AI PaLM API because it supported our use case of accepting and summarizing ad hoc user submissions.
For this Jump Start Solution, we picked Cloud Functions as the process runner over Cloud Run. Here are a few reasons why:
As for the language, we chose Python as it is a popular choice with data scientists and ML practitioners. The Python SDK made it very easy to work with the PaLM API.
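To give a feel for what working with the PaLM API from Python can look like, here is a minimal sketch. It is not the solution's actual code. The prompt wording, the helper names `build_summary_prompt` and `summarize`, the `text-bison` model name, the region, and the generation parameters are all assumptions chosen for illustration.

```python
def build_summary_prompt(document_text: str, max_chars: int = 8000) -> str:
    """Build a summarization prompt, truncating very long input.

    The prompt wording and the max_chars limit are illustrative assumptions.
    """
    excerpt = document_text.strip()[:max_chars]
    return (
        "Provide a concise summary of the following text.\n\n"
        f"Text:\n{excerpt}\n\n"
        "Summary:"
    )


def summarize(document_text: str, project_id: str, location: str = "us-central1") -> str:
    """Send the prompt to a PaLM text model on Vertex AI and return the summary."""
    # Imported lazily so the prompt helper above works without the SDK installed.
    import vertexai
    from vertexai.language_models import TextGenerationModel

    vertexai.init(project=project_id, location=location)
    # Assumed model name; use whichever text model your project has access to.
    model = TextGenerationModel.from_pretrained("text-bison")
    response = model.predict(
        build_summary_prompt(document_text),
        temperature=0.2,        # assumed: low temperature for more stable summaries
        max_output_tokens=256,  # assumed cap on summary length
    )
    return response.text
```

Calling `summarize(text, project_id="your-project-id")` requires the `google-cloud-aiplatform` package and authenticated access to a Google Cloud project with Vertex AI enabled.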
[Interviewer] Did you run into any interesting challenges? How did you overcome them?
[Yvonne] We ran into issues getting document preprocessing to handle a wide variety of input content. Data cleaning is a real concern, because passing raw optical character recognition (OCR) output directly into an LLM does not produce an informative result.
In the current solution, we assumed the input file is similar to a research paper. It has different sections in the content, and we extract and preprocess those sections using a deliberate heuristic.
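A section-based heuristic of this kind could be sketched as follows. This is an illustrative assumption, not the heuristic the solution actually uses: the heading names in `SECTION_HEADINGS`, the regular expression, and the function name `split_sections` are all hypothetical.

```python
import re

# Hypothetical heading names; a real pipeline may look for different ones.
SECTION_HEADINGS = (
    "abstract", "introduction", "methods", "results", "discussion", "conclusion",
)

# Matches a line that holds only a known heading, optionally numbered
# ("1. Introduction"), pluralized ("Conclusions"), or followed by a colon.
_HEADING_RE = re.compile(
    r"^\s*(?:\d+\.?\s+)?(" + "|".join(SECTION_HEADINGS) + r")s?\s*:?\s*$",
    re.IGNORECASE,
)


def split_sections(ocr_text: str) -> dict[str, str]:
    """Group OCR text lines under the most recent recognized heading."""
    sections: dict[str, list[str]] = {"preamble": []}
    current = "preamble"
    for line in ocr_text.splitlines():
        match = _HEADING_RE.match(line)
        if match:
            current = match.group(1).lower()
            sections.setdefault(current, [])
            continue
        sections[current].append(line.strip())
    # Join each section into one string and drop sections with no text.
    return {
        name: " ".join(part for part in lines if part)
        for name, lines in sections.items()
        if any(lines)
    }
```

Each extracted section can then be cleaned and summarized on its own instead of sending the entire OCR output to the model in one piece.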
[Interviewer] What changes would you make, and what would you add if you were going to take this solution to production?
[Yvonne] I would change the data preparation process that runs before data is ingested into PaLM 2. Currently, we assume that the input PDF is similar to a research paper, and we manually clean the data and split it into subsections such as an abstract, a conclusion, and others. In real-world scenarios, however, you may need to adapt this process to fit your specific data.
For more consistently structured input PDFs, such as forms, a better option would be to use Document AI.
[Interviewer] What pleasantly surprised you?
[Yvonne] The cost of this Jump Start Solution (including calling the PaLM model) is reasonable, considering the application’s capabilities and the resources required to run it. But be careful; the cost of this application depends on the size of the input PDF file.
[Interviewer] If folks want to learn more, are there any additional resources you recommend?
[Yvonne] People who want to learn more about Generative AI can check out the Generative AI for Developers Learning Path.
To try out the Document Summarization with Generative AI Jump Start Solution, you can deploy it from the solution catalog. You can also read the guide or look at its code on GitHub.