
Processing W2 & Payslips is now even simpler with Document AI

Documents like payslips and W2s are crucial to processes such as employment and income verification for mortgage loans, personal loans, personal finance, and benefits processing. Unfortunately, efficiently extracting data from these documents at scale can be challenging and time-consuming, with many organizations relying on manual examination of documents or automated approaches that don’t adequately capture the document data needed for given tasks. Google Cloud built Document AI to remove these barriers, empowering customers to deploy powerful machine learning models to more quickly process documents, save money, and discover insights. We’re excited to expand Document AI’s capabilities with the recent release of improved pre-trained models for W2s and payslips, built on Document AI Workbench.

Pre-trained models let developers focus on core application logic and leave the complex task of information extraction from the documents to Google’s AI technology. In many cases, the primary driver for automated data extraction is operational efficiency and cost savings, but Document AI can also open new possibilities. For example, a financial services company might use Document AI to enable fully self-serve loan applications on mobile devices, helping the organization to differentiate itself with simple, fast customer experiences.

We’ve heard from customers that more granular entity extraction from W2 and payslip documents is particularly important, with organizations requiring support for a wider variety of layouts and formats. The recent stable release of these pre-trained models addresses these requests.

Here is what is new with the W2 parser:

  • The parser improves accuracy and entity specificity thanks to the ability to break down long entities such as addresses into fine-grained sub-entities like StreetAddressOrPostalBox, AdditionalStreetAddressOrPostalBox, City, State, and ZIP code. 

  • It can handle a wider variety of W2 forms, including multi-copy layouts (2-, 3-, and 4-ups) issued by various payroll vendors. The model is not limited to specific tax years, so it should be able to process W2s for 2022 and beyond, provided the format does not change significantly.

  • It introduces eight new entities for Box 12 that represent both codes and values, enriching understanding of the various taxable and non-taxable components of the W2 recipient’s income.
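As a rough illustration of how the fine-grained address sub-entities might be used downstream, here is a minimal Python sketch. It assumes the parser output has already been flattened into a list of dictionaries with `type` and `text` keys. That shape, the `Zip` type string, and the `format_address` helper are hypothetical and are not part of the Document AI client API.

```python
# Sub-entity names follow the list in this post; the exact type strings
# returned by the parser may differ (assumption), especially "Zip".
STREET_PARTS = ("StreetAddressOrPostalBox", "AdditionalStreetAddressOrPostalBox")
REGION_PARTS = ("State", "Zip")


def format_address(sub_entities):
    """Join address sub-entities into a single mailing-address string.

    `sub_entities` is a hypothetical list of {"type": ..., "text": ...}
    dicts, not the real Document AI response object.
    """
    by_type = {e["type"]: e["text"].strip() for e in sub_entities}
    street = " ".join(by_type[k] for k in STREET_PARTS if by_type.get(k))
    region = " ".join(by_type[k] for k in REGION_PARTS if by_type.get(k))
    parts = [street, by_type.get("City", ""), region]
    # Skip any part that was not extracted so no stray commas appear.
    return ", ".join(p for p in parts if p)


example = [
    {"type": "StreetAddressOrPostalBox", "text": "100 Main St"},
    {"type": "AdditionalStreetAddressOrPostalBox", "text": "Suite 4"},
    {"type": "City", "text": "Springfield"},
    {"type": "State", "text": "IL"},
    {"type": "Zip", "text": "62701"},
]
print(format_address(example))
```

Because each component arrives as its own entity, an application can validate or normalize individual fields (for example, the state code) before reassembling the address.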

Here is what is new with the Payslip parser:

  • Bonus, commissions, holiday, overtime, regular pay, and vacation are now part of earning_item/earning_this_period and earning_item/earning_ytd. The parser captures types of earnings beyond those categories, and maps them to their respective earning rates, hours, and pay (both for the period and year-to-date). This helps build a more detailed understanding of the components of the payslip recipient’s income.

  • The parser now returns year-to-date and current-period taxes and deductions.

  • Direct deposits are linked to corresponding bank account numbers.

  • The parser now returns page numbers, state and federal tax exemptions, and filing statuses.
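To make the earning_item structure more concrete, the following Python sketch totals current-period and year-to-date pay per earning type. It assumes each earning_item has been flattened into a dictionary keyed by sub-entity name. The `earning_type` key, the dictionary shape, and the helper names are assumptions for illustration, not the Document AI response format.

```python
from collections import defaultdict
from decimal import Decimal


def parse_money(text):
    """Parse a currency string such as "$1,250.00"; missing values count as 0."""
    if not text:
        return Decimal("0")
    return Decimal(text.replace("$", "").replace(",", "").strip())


def summarize_earnings(earning_items):
    """Total current-period and year-to-date pay per earning type.

    Each item is a hypothetical dict using the sub-entity names from this
    post; "earning_type" is an assumed key for the category label.
    """
    totals = defaultdict(lambda: {"this_period": Decimal("0"), "ytd": Decimal("0")})
    for item in earning_items:
        kind = item.get("earning_type", "unknown").lower()
        totals[kind]["this_period"] += parse_money(item.get("earning_this_period"))
        totals[kind]["ytd"] += parse_money(item.get("earning_ytd"))
    return dict(totals)


items = [
    {"earning_type": "Regular", "earning_this_period": "$2,000.00", "earning_ytd": "$24,000.00"},
    {"earning_type": "Overtime", "earning_this_period": "150.50", "earning_ytd": "$1,200.00"},
    {"earning_type": "Bonus", "earning_ytd": "$500.00"},
]
summary = summarize_earnings(items)
for kind, amounts in summary.items():
    print(kind, amounts["this_period"], amounts["ytd"])
```

Using `Decimal` rather than floating point avoids rounding drift when summing currency amounts, which matters for income verification.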

These parsers are now more useful out of the box, and with this release the ability to uptrain makes them easy to adapt as new needs arise. Uptraining lets developers further improve the accuracy of these models and extract additional fields with minimal development work. It also lets developers customize existing parsers to support new, similar document types. For example, the payslip parser is trained on U.S. data and could be uptrained to create a payslip parser for the U.K.

We’re pleased that parsers are already making a difference for customers. Bryan Jackson, CTO at lending automation firm Gateless, said, “High accuracy data extraction is critical to the success of our Smart Underwrite solution, and Document AI provided better results than competitors. Using the latest W2 & Payslip pretrained parsers, we saw a 48% increase in performance on pay stubs and a 15% performance improvement in W2s. The ability to easily uptrain models as new document variations are introduced ensures we continue to deliver optimal outcomes for our customers.”

Additional pre-trained models available as release candidates include parsers for 1040, 1099R, 1120, and 1120S documents. Check for details here. To learn more, talk to a Google Cloud sales executive about how Document AI can help your business, and check out our Document AI breakout session from Google Cloud Next ’22.


Thanks to Wael Farhan and Carl Saroufim for their contributions to this post.
