For several years, businesses have used Google Cloud’s Document AI to process documents faster and more accurately, improving how they handle invoices and customer forms and deliver services that rely on documents.
Generative AI is transforming enterprise document processing by letting users input natural language prompts to classify, extract, and get deeper insights from documents, all with high accuracy and limited-to-no machine learning (ML) training. We’re pleased to bring generative AI to Document AI, unlocking powerful and more efficient ways for organizations to structure, manage, and get insights from documents.
Document AI Workbench enables users to customize models for document processing tasks. In February 2023, we launched the Custom Extractor in General Availability (GA) to help users extract structured data from documents. In March 2023, we launched the Custom Classifier in GA to help automatically classify document types. In July, we launched the Custom Splitter in GA to help automatically split and classify multiple documents within a single file.
At Google Next ’23, we built on this momentum by announcing the public preview launches of two generative AI-powered features in Document AI Workbench: a version of Custom Extractor that uses foundation models, and Summarizer.
Generative AI-powered extraction can help pull data from documents with lots of free-form text (like contracts), complex layouts (such as invoices, W-2s, and bills of lading), or little to no available training data. Now that a foundation model is available in the Custom Extractor, users can call the endpoint with any document and get structured data quickly, without any configuration required.
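As a rough illustration of what "calling the endpoint" can look like, here is a minimal sketch that builds a request for the Document AI v1 REST `process` method and flattens the entities in a response. It does not send anything over the network. The project, location, and processor IDs are placeholders, the helper names `build_process_request` and `extract_entities` are hypothetical, and authentication (an OAuth bearer token) is left out.

```python
import base64
import json


def build_process_request(project_id, location, processor_id, file_bytes,
                          mime_type="application/pdf"):
    """Build (url, json_body) for a Document AI v1 `process` call.

    Hypothetical helper: it only assembles the request and does not send it.
    """
    url = (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:process"
    )
    body = {
        "rawDocument": {
            # The REST API expects the file bytes base64-encoded.
            "content": base64.b64encode(file_bytes).decode("ascii"),
            "mimeType": mime_type,
        }
    }
    return url, json.dumps(body)


def extract_entities(response):
    """Group extracted entity text by entity type from a parsed JSON response."""
    result = {}
    for entity in response.get("document", {}).get("entities", []):
        result.setdefault(entity.get("type", ""), []).append(
            entity.get("mentionText", "")
        )
    return result


if __name__ == "__main__":
    # Placeholder identifiers; substitute the values for your own processor.
    url, body = build_process_request("my-project", "us", "my-processor-id", b"%PDF-1.4")
    print(url)
```

To actually run the request, the URL and body would be sent as an authenticated POST with any HTTP client, and the parsed JSON response passed to `extract_entities`.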
Summarizer can be used out of the box without training to provide summaries for documents up to 250 pages long. Most generative AI solutions do not have context windows that can support long documents, requiring that information be broken into small chunks, but Summarizer removes these concerns, making it easy to generate custom summaries based on the user’s preferred length and format.
Here’s what some customers with early access are saying:
“Deutsche Bank (DB) divisions are digitizing high volume documents and extracting data using the Document AI Workbench Custom Extractor for simple, scalable use cases such as KYC and payment forms. Automating the content review process leads to reduced operational risk, increased capacity and a better customer experience. With the introduction of generative AI to Workbench, we hope to automate more complex documents with reduced time to train a model and explore new use cases for faster intelligence such as Q&A and summarization.”
– Inwha Huh, Managing Director – Corporate and Investment Bank Transformation, Deutsche Bank
“BBVA is committed to providing our customers with the best possible experience, and that includes using AI to automate our business processes. By using generative AI now available in Document AI Workbench, we will extract data in complex, highly dense and non-structured documents and prevent errors and potential fraud. This will allow us to provide our customers with a faster, more accurate, and more secure service.”
– Antonio Valle, Global Head of Intelligent Process Automation, BBVA
You can read more about how many other customers are using Document AI Workbench and its generative AI features in this detailed blog post. To get started, customers can visit Document AI Workbench within the Google Cloud Console or view our Custom Extractor and Summarizer demo videos online.
In October 2022, we announced the GA of Document AI Warehouse, a fully managed cloud-native service to search, store, and govern documents and their extracted data.
We’re now bringing customers the best of Enterprise Search technology from Google Cloud integrated into Document AI Warehouse, where users can retrieve documents containing answers to their natural language questions. Generative AI also helps summarize the answer from each document, which saves users hours in finding the right answer, as shown in the animated image.
Here are the new features, powered by generative AI, that help organizations better manage their documents:
More details can be found in the Feature Documentation for Trusted Tester Program members with Private Preview access.
The combination of LLMs and Optical Character Recognition (OCR) marks a significant advancement in data processing and analysis. By leveraging LLMs’ ability to understand context and OCR’s text and layout extraction capabilities, businesses can unlock valuable insights from data and streamline workflows. Enterprise Document OCR v2.0 represents the latest evolution in Document AI’s OCR technology, offering businesses a powerful extraction tool for better downstream processing.
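To make the OCR-plus-LLM pairing a little more concrete, here is a minimal sketch that pulls paragraph text out of a Document AI `Document` in its JSON form and assembles it into a prompt. The `text`, `pages`, `paragraphs`, `layout`, `textAnchor`, and `textSegments` fields follow the Document schema. The helper names and the prompt wording are hypothetical, and the sketch assumes the segment offsets index characters of `text`.

```python
def text_for_layout(full_text, layout):
    """Resolve a layout's text anchor into the substring(s) it points at."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    parts = []
    for seg in segments:
        # In JSON, offsets arrive as strings and startIndex is omitted when 0.
        start = int(seg.get("startIndex", 0))
        end = int(seg["endIndex"])
        parts.append(full_text[start:end])
    return "".join(parts)


def paragraphs(document):
    """Return OCR paragraphs in page order as a list of stripped strings."""
    full_text = document.get("text", "")
    out = []
    for page in document.get("pages", []):
        for para in page.get("paragraphs", []):
            out.append(text_for_layout(full_text, para.get("layout", {})).strip())
    return out


def build_prompt(document, question):
    """Hypothetical prompt format: numbered OCR paragraphs, then the question."""
    lines = [f"[{i}] {p}" for i, p in enumerate(paragraphs(document), start=1)]
    return "Document:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


if __name__ == "__main__":
    doc = {
        "text": "Invoice 42\nTotal: $10\n",
        "pages": [{"paragraphs": [
            {"layout": {"textAnchor": {"textSegments": [{"endIndex": "11"}]}}},
            {"layout": {"textAnchor": {"textSegments": [
                {"startIndex": "11", "endIndex": "22"}]}}},
        ]}],
    }
    print(build_prompt(doc, "What is the total?"))
```

Keeping paragraphs as separate, numbered units preserves some of the layout information that OCR recovers, which a single flattened text blob would lose.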
With Enterprise Document OCR v2.0, users can take advantage of:
On top of this, Enterprise Document OCR v2.0 now offers premium OCR add-ons, which users can enable based on their desired processing or quality requirements. These include:
The versatility of Enterprise Document OCR v2.0 provides a strong foundation for LLM-driven applications, ensuring rich, secure, and highly accurate text and layout extraction. In LLM-powered applications, high-quality OCR is paramount. Ryan Walker, Chief Technology Officer at Casetext, attests to the importance of OCR quality:
“As a creator of legal AI solutions—most recently our AI legal assistant, CoCounsel—we build products that must correctly process large, complex collections of legal documents. These might be thousands of pages long, contain images, or be poorly scanned. Missing even a single word can make the difference between winning or losing a case. Google’s OCR accurately extracts text from files far better than every other system we’ve evaluated. Incorporating this technology into our products lets us deliver the highest-quality answers for the lawyers who rely on us, which in turn means they’re able to deliver the best possible service and results for their clients.”
Explore the potential of Enterprise Document OCR v2.0 to streamline your document understanding workflows.
We’re very excited about what the future holds for Document AI as a platform for businesses to simplify document automation. Learn more about all these exciting developments in our session at Next ’23 or try out one of our offerings today.