Solutions that recognize documents using artificial intelligence (AI) are becoming increasingly popular because they save a significant amount of time. One popular choice among entrepreneurs is ABBYY FlexiCapture, which I tested together with other members of The Story. How good is this popular OCR (Optical Character Recognition) system, which companies adopt to improve their accounting systems?
Let’s start with the important info: the system doesn’t work straight "out of the box". It requires configuration and creation of an appropriate data infrastructure.
System configuration is complicated. This is due to the sheer number of available parameters and customization options, as well as the need to configure additional environments, such as input databases, for the OCR system to function properly.
The system must be given access to the existing databases of approved contractors, orders, purchase items, etc.: everything needed to ensure that the reading algorithms work correctly and to gain deeper insight into the company's purchases.
OCR (Optical Character Recognition) - a set of techniques or software for recognizing characters and entire texts in a raster image file (bitmap).
The main assumption of the OCR system is that a designated person takes part in the entire workflow. That person's task is to verify the system's results and to continuously improve the machine learning processes.
OCR invoices - processing. Proof of Concept for the ABBYY solution
We allocated two business days for the OCR invoice test. The test was conducted on a collection of 15 scanned and 20 electronically generated Polish invoices.
We have the opportunity to test the machine learning system with a larger data set, i.e. about five thousand invoices, and experience the majority of the work carried out by the person who teaches the algorithm using a special tool that supports machine learning.
In this OCR test, we used the ABBYY FlexiCapture solution.
FlexiCapture is available in a server version, but it can also be hosted in the ABBYY cloud. Hosting in the cloud eliminates maintenance costs and the need to update the software.
Due to the complexity of the application, and the large amount of work needed to configure the test environment, I chose exploratory tests as the main analytical method.
Enterprise - software that solves a business problem and is characterized by, among other things, security, scalability and modularity.
The analysis used the cloud version of FlexiCapture on the R2 engine: an enterprise system for OCR of semi-structured documents, with an add-on that facilitates the recognition of European invoices.
The planned OCR tests were divided into the following groups:
- Environment configuration;
- Machine learning mechanisms;
- Invoice recognition;
- Recognition of other documents;
- Exporting data (e.g. personal data on the invoice);
- API communication.
All OCR tests were performed using the following system:
- Windows 10 Home (Version 1903).
- 16 GB RAM.
- Intel Core i7 2.80 GHz.
Tasks were carried out in the following software:
- FC in the Cloud version.
- FC Administrator Station.
- Microsoft SQL Server Express 2017.
OCR invoices: machine learning mechanisms
ABBYY FlexiCapture uses machine learning mechanisms to extract data from images.
Like any AI of this type, FlexiCapture has to be trained on a data set with the help of an operator, who shows the AI where specific data is located in the image. This training always happens in two stages.
First, we submit the invoice to OCR, verify and correct the result, and forward it to the AI. Second, the system operator must map the data or improve recognition of the document in a dedicated learning tool.
The tool is simple and effective: data can be selected both on the image and in the extracted data. This allows you to quickly find errors and make corrections.
The last stage allows the operator to choose which of the corrected invoices should be transferred to AI training.
In my opinion, leaving the decision to the operator makes sense: not every correction stems from a missing AI pattern. Reading errors can be caused by many factors, such as image quality and ambient light during scanning.
FlexiCapture also includes extensive logic for recognizing patterns typical of invoices, such as the VAT ID.
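To give a feel for what such pattern logic can look like, here is a minimal Python sketch that checks a Polish VAT ID (NIP) against its publicly documented check-digit rule. It is not FlexiCapture's implementation; the function name `is_valid_nip` and the input clean-up are my own assumptions for illustration.

```python
import re

# Public check-digit weights for the Polish NIP (first nine digits).
NIP_WEIGHTS = (6, 5, 7, 2, 3, 4, 5, 6, 7)


def is_valid_nip(raw: str) -> bool:
    """Return True if `raw` looks like a NIP with a correct check digit.

    Hypothetical helper: accepts spaces, hyphens and an optional "PL" prefix.
    """
    digits = re.sub(r"[\s-]", "", raw)
    if digits.upper().startswith("PL"):
        digits = digits[2:]
    if not re.fullmatch(r"\d{10}", digits):
        return False
    # Weighted sum of the first nine digits, modulo 11, must equal digit ten.
    checksum = sum(w * int(d) for w, d in zip(NIP_WEIGHTS, digits)) % 11
    return checksum == int(digits[9])
```

A check like this helps separate a genuine reading error (one wrong digit breaks the checksum) from a number that was read correctly but is simply unknown to the system.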
Document: online recognition
The team used the following set of documents for invoice recognition tests:
- 15 scanned traditional Polish invoices;
- 20 generated traditional Polish invoices.
Recognition was tested for the following elements of the document:
- Buyer and seller details on invoices, including tax identification number.
- Document Date.
- Document Number.
- Total Purchase Amount.
- Items on the document.
- VAT rates.
- Bank account number.
Online data recognition of contractors by OCR
The personal data of contractors were found without major problems (100% efficiency), as long as they were in the database. FlexiCapture looks for contractor data from the company's database in the invoice image and returns the data from the database after recognition.
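The lookup described above can be sketched roughly as follows. This is a simplified illustration, not FlexiCapture's actual matching logic: the `Contractor` record, the `find_contractor` helper and the idea of keying the database by tax ID are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Contractor:
    """Hypothetical contractor record as stored in the company's database."""
    tax_id: str
    name: str
    address: str


def find_contractor(ocr_text: str, contractors: Dict[str, Contractor]) -> Optional[Contractor]:
    """Search the OCR text for any known tax ID and return the database record."""
    # Drop single spaces or hyphens between digits: "123-456-32-18" -> "1234563218".
    compact = re.sub(r"(?<=\d)[ -](?=\d)", "", ocr_text)
    for tax_id, contractor in contractors.items():
        # Require non-digit boundaries so the ID is not matched inside a longer number.
        if re.search(rf"(?<!\d){re.escape(tax_id)}(?!\d)", compact):
            return contractor  # data comes from the database, not from the OCR text
    return None


known = {"1234563218": Contractor("1234563218", "ACME Sp. z o.o.", "ul. Testowa 1, Warszawa")}
match = find_contractor("Sprzedawca: ACME\nNIP: 123-456-32-18", known)
```

Because the returned name and address come from the database rather than from the image, a match yields clean data even when the scan itself is noisy, which fits the high score for contractors that were already known.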
How the OCR system recognizes invoice issue dates
The dates of issue on documents were read correctly 90% of the time. Dates containing the name of the month, e.g. May 5, 2019, caused particular problems and made it necessary to manually indicate the area containing the date.
After the operator indicates the area, the data is read correctly every time.
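Why do month names trip up a reader that handles numeric dates well? A rough way to see it is that every date layout needs its own pattern. The sketch below is only an illustration in Python: the format list and the `parse_invoice_date` name are hypothetical, and it says nothing about how FlexiCapture handles dates internally.

```python
from datetime import date, datetime
from typing import Optional

# Assumed layouts for the example; a real invoice set may need more.
DATE_FORMATS = (
    "%d.%m.%Y",   # 05.05.2019
    "%d-%m-%Y",   # 05-05-2019
    "%Y-%m-%d",   # 2019-05-05
    "%B %d, %Y",  # May 5, 2019 (month name, depends on the process locale)
)


def parse_invoice_date(raw: str) -> Optional[date]:
    """Try each known layout in turn; return None if none of them fits."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None
```

Note that `%B` follows the locale of the running process, so Polish month names would need either a matching locale or an explicit name-to-number mapping. That extra layer is one plausible reason textual dates are harder than purely numeric ones.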
Invoice scanning: invoice numbers, i.e. integration with OCR
ABBYY FlexiCapture did not always detect invoice numbers the first time. But once we indicated where the invoice number was on the document, FlexiCapture coped without problems. Across the 35 documents, the program reached 90% accuracy on the second attempt.
Only in 10% of cases could the AI not be trained properly. This is probably related to the AI only having a small amount of training data to work with.
OCR invoices: goods items
The items on the document were detected with decent accuracy. In 80% of test cases the individual items were read almost perfectly.
However, the system was not always able to divide the data into columns correctly, and there were also typos caused by image quality or typeface. The AI training tools helped improve these results.
The test environment was not integrated with the ordering system or the purchase items database, which is why the system relied entirely on data from OCR. I believe that if these databases were connected, the results would improve significantly.
Detection of VAT rates in the OCR system
90% of VAT rates were detected correctly by the system. As with other elements, machine learning has been able to significantly improve the results on subsequent contractor documents - even with such a small set of documents.
Improvement of accounting systems: bank account number
The FlexiCapture program's bank account number recognition was already correct 70% of the time, before any additional learning.
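Bank account numbers are a good candidate for automatic checking, because a Polish account number (NRB) is 26 digits that form a valid IBAN once prefixed with PL. The sketch below shows the standard IBAN mod-97 check. It is an example of post-OCR validation one could add, not something we configured or tested in FlexiCapture, and the function names are hypothetical.

```python
def iban_is_valid(iban: str) -> bool:
    """Standard IBAN check: move the first 4 chars to the end, map letters
    to numbers (A=10 ... Z=35) and test that the result mod 97 equals 1."""
    compact = "".join(iban.split()).upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


def nrb_is_valid(nrb: str) -> bool:
    """Validate a 26-digit Polish account number by treating it as a PL IBAN."""
    digits = "".join(nrb.split())
    if len(digits) != 26 or not digits.isdigit():
        return False
    return iban_is_valid("PL" + digits)
```

With a check like this, an account number with a misread digit can be flagged for the operator automatically instead of slipping through to the accounting system.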
Scanning and processing. How ABBYY OCR detects information on invoices
As a quick reminder, my team and I performed the test on a collection of 15 scanned and 20 generated Polish invoices.
The results of this quick test showed that ABBYY OCR detects information on invoices as follows:
- Buyer and seller details on invoices: 90%
- Document date: 90% (after operator intervention: 100%)
- Document number: 90%
- Total purchase amount: 90%
- Items on the document: 80%
- VAT rates: 90%
- Bank account number: 70%
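For a single rough figure, the per-field results above can be averaged. The snippet below is simple arithmetic on the listed percentages (first-pass values, before operator corrections); the unweighted mean is my own summary, not a number measured in the test.

```python
from statistics import mean

# Per-field recognition rates from the list above, in percent.
FIELD_ACCURACY = {
    "Buyer and seller details": 90,
    "Document date": 90,
    "Document number": 90,
    "Total purchase amount": 90,
    "Items on the document": 80,
    "VAT rates": 90,
    "Bank account number": 70,
}


def average_accuracy(results: dict) -> float:
    """Unweighted mean across fields; every field counts equally."""
    return mean(results.values())


def weakest_field(results: dict) -> str:
    """Name of the field with the lowest recognition rate."""
    return min(results, key=results.get)


summary = (round(average_accuracy(FIELD_ACCURACY), 1), weakest_field(FIELD_ACCURACY))
```

This gives an average of roughly 85.7%, with the bank account number as the weakest field, so that is where extra training data would probably pay off first.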
I would like to emphasize, however, that in a few cases the results would have been better if the system had received more training data and the test had lasted longer. The rating would also have improved if the AI had had access to more data about the company.
REST (Representational State Transfer) - a style of software architecture that is based on a set of predefined rules. These rules describe how resources are defined and how they can be accessed.
FlexiCapture is not limited to reading invoices. It also has powerful data validation and configuration mechanisms. It can be used to read all kinds of documents, such as forms, surveys, ballot papers, etc. Anything with a specific structure can be processed by FlexiCapture.
API (Application Programming Interface) - a set of rules that define communication between computer programs
In addition, ABBYY allows integration with any IT solution using the REST API.
Main photo: PxHere.com