Better when we are together: why Optical Character Recognition (OCR) is at its best when combined with AI

Optical Character Recognition (OCR) dates back to the 1930s, when it took the form of a machine that searched microfilm archives using optical code recognition. The device was called a ‘statistical machine’, and IBM later acquired this early iteration.

Early versions had limitations: they needed to be trained with images of each character and worked on just one font at a time. Fast forward to the early 1990s, and OCR technology became popular for digitising historical newspapers.

Since then, the technology has undergone several improvements. Today’s OCR converts scanned or printed text and images into digital, machine-readable text that can be edited, filed, altered, and shared.

OCR can now produce a reasonable degree of accuracy and cope with most fonts. Some systems can also reproduce formatted output that closely approximates the original page, incorporating features like images, columns, and other non-textual components.

Typical inputs include scanned documents, photos of documents or other images containing text, and, crucially, forms filled in by hand.

This progression means that OCR is now used to automate the complex, document-based workflows common in financial services. It is widely used for data entry from printed paper records, such as the passports, bank statements, and invoices used in onboarding. It is an invaluable tool for digitising printed texts so that they can be electronically edited, searched, stored more compactly, and displayed online. OCR output can also feed machine processes such as cognitive computing, machine translation, text-to-speech on the extracted text, key data extraction, and text mining. OCR is also a field of research in pattern recognition, artificial intelligence, and computer vision.

Clearly, this is useful in finance. But there are limitations!

Accuracy is an issue: OCR on its own is estimated to be only 85 to 90% accurate. This limitation matters across financial services, but particularly in wealth management, where KYC, suitability, onboarding, and ongoing customer service all need to be accurate for compliance purposes and service levels. Imagine getting the customer’s name or date of birth wrong when setting up an account, or adding an extra zero to a balance. As a result, human intervention and checking are needed, adding to an already onerous human admin burden.
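
To see why an 85 to 90% rate is a problem, here is a rough back-of-the-envelope sketch. It assumes the quoted rate applies per character and that errors are independent of one another; neither assumption comes from the estimate itself, and the function name is purely illustrative.

```python
def field_accuracy(char_accuracy: float, field_length: int) -> float:
    """Chance that every character in a field is read correctly.

    Assumes independent per-character errors, which is a simplification.
    """
    return char_accuracy ** field_length


# Compare short and long fields at both ends of the quoted range.
for rate in (0.85, 0.90):
    for length in (8, 20):
        chance = field_accuracy(rate, length)
        print(f"{rate:.0%} per character, {length}-character field: "
              f"{chance:.1%} chance of no errors")
```

Under these assumptions, an eight-character date of birth read at 90% per-character accuracy comes through error-free only about 43% of the time, which is why unaided OCR output still needs checking.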

But in conjunction with other digital technologies, in particular AI and ML, OCR can be supercharged. ML can be leveraged to automate scrutinising, clause-matching, and rules- and compliance-checking processes. The combination of technologies automatically detects all the necessary data fields on a document and extracts them, reducing the processing time per document by 90% or more. The more fields you need, the more time you save.

In combination with OCR, natural language processing (NLP) is used to achieve an enhanced client experience. NLP gives computers the ability to understand text and spoken words in much the same way humans can, and so to make meaningful decisions on the back of that understanding. For example, it can be used for onboarding, KYC, and suitability, as well as for running sentiment analysis to support stock selection, providing customer support, and creating a knowledge base for advisors.

Thus, once you add in other technologies, OCR becomes fit for purpose.

HyperScience
HyperScience, our solution, can extract information from virtually any document format, including emails, faxes, PDFs, images, invoices, and forms. Our software can identify and pull out the necessary information even if documents are handwritten or in poor condition, for instance where messy handwriting or crossed-out text would cause legacy OCR tech to fail. It reliably extracts the relevant pieces and structures them into a JSON format for faster, more accurate downstream processing. Thus, an organisation does not need to manually input handwritten documents from those who cannot or will not fill in forms online.
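
As a rough illustration of what structuring extracted data as JSON can look like, here is a minimal sketch. The field names, the confidence score, and the overall layout are hypothetical and chosen for this example; they are not HyperScience's actual output schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ExtractedField:
    name: str           # hypothetical field label
    value: str          # text read from the document
    confidence: float   # hypothetical score between 0.0 and 1.0
    handwritten: bool   # whether the source text was handwritten


def to_json(document_type: str, fields: list[ExtractedField]) -> str:
    """Serialise extracted fields into a JSON string for downstream systems."""
    payload = {
        "document_type": document_type,
        "fields": [asdict(field) for field in fields],
    }
    return json.dumps(payload, indent=2)


# Example values are invented for illustration only.
fields = [
    ExtractedField("full_name", "Jane Smith", 0.97, handwritten=True),
    ExtractedField("date_of_birth", "1980-04-12", 0.93, handwritten=True),
]
print(to_json("account_application", fields))
```

Once the data is in a predictable structure like this, downstream systems can validate, store, and search it without anyone re-keying the original form.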

HyperScience uses a single extraction model for both handwritten and machine-typed text. Most legacy solutions, by contrast, require an operator to select which model to use, which limits scale and automation rates.

Moreover, it can read white text on black backgrounds, cope with patterned or textured backgrounds, and handle scan or fax distortions, all of which would stump legacy tools. Being able to do this accurately and automatically has a significant impact on operational efficiency. HyperScience also applies powerful ML tools that ignore unnecessary fields and descriptions, so that only the required data is extracted.

The benefit is not just in data input but also in filing, storage, and sharing. If users can easily find any kind of document simply by typing the details into a search bar, the process becomes efficient, leaving people more time for value-added tasks, such as spending time with the client and raising service and satisfaction levels.

HyperScience can also be choosy! It can use your entire document library to pull metrics on any area of your business, allowing users to identify strengths, weaknesses, and opportunities for development.

Customisation and operational efficiency
All of this, of course, needs to be configurable to individual business needs, and the combination of technologies set to the requirements of the user makes for a powerful offering. Users can set limits and parameters and determine the amount of human oversight. In this way, users can customise the HyperScience processes to fit existing procedures, configure accuracy thresholds, automatically alert team members when action is needed, and manage quality assurance with confidence.
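
To make the idea of configurable accuracy thresholds and human oversight concrete, here is a minimal sketch of one way such routing could work. The threshold values, field names, and function name are assumptions made for this example and do not describe HyperScience's actual configuration.

```python
from dataclasses import dataclass


@dataclass
class FieldResult:
    name: str
    value: str
    confidence: float  # hypothetical score between 0.0 and 1.0


# Hypothetical settings: stricter thresholds for high-risk fields.
DEFAULT_THRESHOLD = 0.90
FIELD_THRESHOLDS = {"date_of_birth": 0.99, "balance": 0.99}


def split_for_review(results):
    """Separate fields that can be auto-accepted from those needing a human check."""
    accepted, needs_review = [], []
    for result in results:
        threshold = FIELD_THRESHOLDS.get(result.name, DEFAULT_THRESHOLD)
        if result.confidence >= threshold:
            accepted.append(result)
        else:
            needs_review.append(result)
    return accepted, needs_review


# Example values are invented for illustration only.
results = [
    FieldResult("full_name", "Jane Smith", 0.97),
    FieldResult("date_of_birth", "1980-04-12", 0.93),
    FieldResult("balance", "10500.00", 0.995),
]
accepted, needs_review = split_for_review(results)
print("Auto-accepted:", [r.name for r in accepted])
print("Needs human review:", [r.name for r in needs_review])
```

Raising a threshold sends more fields to a person and lowers the automation rate; lowering it does the opposite. That trade-off is what lets each business decide how much human oversight it wants.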
