As the lead for the OCR project, I guided the team in defining requirements, conducting data analysis, and creating golden copies. We also tested thoroughly to ensure that the extracted data was accurate. The primary goals of the project were to classify documents accurately and to extract the data needed to automate income and employment underwriting.
The project began with a heuristic rule engine that identified and defined each field or piece of information within the documents. In a second iteration, we implemented machine learning and deep learning models to enhance data extraction. We successfully performed OCR on 14 types of documents and extracted over 400 fields, including data from tax returns, W-2s, pay stubs, bank statements, and investment accounts. Our accuracy rate for W-2s and PSUs was over 90%.
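To illustrate the first-iteration idea, here is a minimal sketch of a heuristic rule engine for a W-2 and a field-level accuracy check against a golden copy. It is not the project's actual code: the field names, regular expressions, sample text, and the helper names `extract_fields` and `field_accuracy` are all hypothetical, chosen only to show the general approach.

```python
import re

# Hypothetical rules: each field name maps to a regex with one capture group.
# These patterns are illustrative assumptions, not the project's real rules.
RULES = {
    "employee_ssn": re.compile(
        r"social security number\D*(\d{3}-\d{2}-\d{4})", re.IGNORECASE
    ),
    "wages": re.compile(
        r"wages, tips, other compensation\D*([\d,]+\.\d{2})", re.IGNORECASE
    ),
    "federal_tax_withheld": re.compile(
        r"federal income tax withheld\D*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def extract_fields(ocr_text):
    """Apply every rule to the OCR text; fields with no match are None."""
    fields = {}
    for name, pattern in RULES.items():
        match = pattern.search(ocr_text)
        # Strip thousands separators so amounts compare cleanly.
        fields[name] = match.group(1).replace(",", "") if match else None
    return fields


def field_accuracy(extracted, golden):
    """Share of golden-copy fields whose extracted value matches exactly."""
    if not golden:
        return 0.0
    correct = sum(1 for key, value in golden.items() if extracted.get(key) == value)
    return correct / len(golden)


# Made-up OCR output and golden copy for a single document.
SAMPLE_TEXT = """a Employee's social security number 000-00-0000
1 Wages, tips, other compensation 52,340.00
2 Federal income tax withheld 6,210.55"""

GOLDEN = {
    "employee_ssn": "000-00-0000",
    "wages": "52340.00",
    "federal_tax_withheld": "6210.55",
}

if __name__ == "__main__":
    result = extract_fields(SAMPLE_TEXT)
    print(result)
    print(f"field accuracy: {field_accuracy(result, GOLDEN):.0%}")
```

Rules like these are easy to audit but brittle when layouts vary, which is one plausible reason to move to learned models in a later iteration. The same golden-copy comparison can score either approach.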
This data extraction enabled additional functionality within the underwriting process, helping to automate income and employment underwriting and improving overall efficiency and accuracy.
Copyright © 2024 Jason Arndt - All Rights Reserved.