
Acquisition of unstructured data - OCR

Date: 2022-04-15     [Reprinted]

Related technologies for data acquisition - Optical character recognition

In practice, RPA often runs into a problem at the very first step, data acquisition: what should the robot do when it is fed a scan or an image? This is where optical character recognition (OCR) technology comes in.


Optical Character Recognition

Optical Character Recognition (OCR) is the process in which an electronic device (such as a scanner or digital camera) examines the characters printed on paper, determines their shapes by detecting patterns of dark and light, and then translates those shapes into computer text using character recognition methods. The whole process involves first scanning the paper material, then analyzing and processing the image file, and finally obtaining the text and layout information.
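
The "dark and light pattern" step can be illustrated with a minimal sketch. It assumes the scan is already available as a grid of grayscale values (0 is black, 255 is white) and uses a fixed threshold of 128; both the names and the threshold are hypothetical choices for illustration, and real OCR engines typically use more robust, adaptive methods.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map each grayscale pixel (0-255) to 1 (dark, ink) or 0 (light, paper)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def render(bitmap: List[List[int]]) -> str:
    """Draw the binary pattern as text so the character shape is visible."""
    return "\n".join("".join("#" if bit else "." for bit in row) for row in bitmap)


# A tiny, made-up 5x5 "scan" of the letter T (assumption for illustration only).
scan = [
    [20, 30, 25, 35, 20],
    [240, 235, 30, 240, 245],
    [250, 240, 25, 235, 250],
    [245, 250, 35, 240, 240],
    [240, 245, 30, 250, 235],
]

bitmap = binarize(scan)
print(render(bitmap))
```

The resulting dark/light pattern is what a character recognition method would then match against known character shapes.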


Enterprise employees still have to deal with the physical world when doing business: invoices, documents, bank cards and ID cards, and even advertisements and posters all need to be recognized. RPA cannot read this image information directly, so OCR technology is needed. OCR is also needed when the robot has to work against a remote desktop, or when the fields of a local desktop application cannot be accessed directly. For example, automation in the finance field often uses OCR for invoice recognition and processing.

Traditional OCR technology still relies on manual judgment and correction. The recognition rate is low especially for handwritten text, seals, overprints, embossing and the like. Although OCR has been developed for many years and is widely used in the bill centers, document centers and financial shared service centers of financial institutions, manual intervention remains unavoidable today. How to reduce the number of manual interventions, and how to make processing more convenient once a person does step in, are the questions that automation experts need to consider.



In the automation field, the OCR recognition rate problem is mainly addressed from two directions: the technology direction and the business direction.

Technology Direction

    The technology direction improves the recognition rate by combining artificial intelligence with OCR, especially for difficult characters such as handwriting and embossing. This is where the term Intelligent Character Recognition (ICR) comes from.


    Most ICR systems come with a self-learning mechanism that automatically updates the recognition library using machine learning (ML) and convolutional neural network (CNN) techniques, and progressively builds the required neural network model by pre-labeling and training on a large set of characters. In addition, ICR can run several recognition engines and check them against each other. Each engine is given a voting weight, and the votes determine how trustworthy a recognized character is. Each engine has its own strengths: some are good at recognizing numbers, some at English, some at Chinese, and so on. The system therefore needs to select recognition engines automatically, or users need to configure the voting weights of the engines, based on the type of content to be recognized.
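
The engine voting idea can be sketched as follows. This is a minimal illustration under stated assumptions, not how any particular ICR product works: the engine names, the weights and the `weighted_vote` function are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def weighted_vote(
    candidates: List[Tuple[str, str]], weights: Dict[str, float]
) -> Tuple[str, float]:
    """Pick the character with the highest total engine weight.

    candidates: (engine_name, recognized_character) pairs for one position.
    Returns the winning character and its share of the total weight,
    which can serve as a rough trust score.
    """
    scores: Dict[str, float] = defaultdict(float)
    for engine, char in candidates:
        scores[char] += weights.get(engine, 0.0)
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no engine with a positive weight produced a candidate")
    best = max(scores, key=lambda c: scores[c])
    return best, scores[best] / total


# Hypothetical weights for a numeric field: the digit engine is trusted most.
numeric_field_weights = {"digit_engine": 3.0, "english_engine": 1.0, "chinese_engine": 1.0}

# The engines disagree on whether the glyph is the digit zero or the letter O.
votes = [("digit_engine", "0"), ("english_engine", "O"), ("chinese_engine", "O")]

char, trust = weighted_vote(votes, numeric_field_weights)
print(char, round(trust, 2))
```

Although two of the three engines vote for the letter, the digit engine's higher weight decides the result for this numeric field. For a text field, a different weight table would be configured.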

Business Direction

    The business direction uses business management measures to help OCR achieve a higher recognition rate. One example is capturing images with a standardized overhead camera or scanner according to specifications, rather than with employees' own cell phones, where differences in devices, shooting angles and lighting lower the recognition rate. Another is adding a pre-calibration step that screens out scans likely to be recognized poorly and sends them straight to manual processing, instead of letting them enter a large batch and be handed to manual processing only afterwards. A third is attaching the already cropped image slices directly to the user interface of the system in which the comparison is made, so that the user does not have to switch back and forth between two screens to find the elements to compare. There are many similar business adjustments and management measures, all with the ultimate goal of reducing the workload of business staff and improving the quality and efficiency of their work.
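
The pre-calibration step can be sketched as a simple routing rule. This assumes that some earlier quality check has already produced a confidence score between 0 and 1 for each scan; the `ScanResult` type, the `route` function and the 0.85 threshold are hypothetical and would need to be tuned to the actual process.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScanResult:
    scan_id: str
    confidence: float  # assumed quality/recognition score in the range 0.0 to 1.0


def route(scans: List[ScanResult], threshold: float = 0.85) -> Tuple[List[str], List[str]]:
    """Split scans into an automatic queue and a manual queue before batching."""
    automatic: List[str] = []
    manual: List[str] = []
    for scan in scans:
        if scan.confidence >= threshold:
            automatic.append(scan.scan_id)
        else:
            manual.append(scan.scan_id)
    return automatic, manual


# Made-up example data.
incoming = [
    ScanResult("inv-001", 0.97),
    ScanResult("inv-002", 0.42),  # e.g. a blurry phone photo
    ScanResult("inv-003", 0.88),
]

automatic_queue, manual_queue = route(incoming)
print("automatic:", automatic_queue)
print("manual:", manual_queue)
```

Routing low-confidence scans to people up front keeps them out of the automated batch, so staff do not have to pick failures out of a large batch afterwards.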


Finally, if enterprises still find OCR technology difficult to implement and master, they can use the cloud services offered by some Internet companies. For example, Tencent Cloud's text recognition service recognizes ID cards, business cards, bank cards, license plates, driving licenses, business licenses, general handwriting and general printed text, and offers two billing models, postpaid and prepaid. Baidu Cloud's text recognition service also recognizes web images, train tickets and taxi receipts. The cost per recognition with cloud services is relatively low, so an enterprise that does not have a large volume of information to recognize can consider using cloud services together with RPA.


If you are interested in RPA-related courses, you can follow our WeChat official account, "ATA technology", to watch the training videos for our RPA courses and learn more about RPA. You can also follow our subscription account to keep up with the latest updates.


Scan with WeChat to follow the official account "ATA technology"


Scan with WeChat to follow the official account "RPA digital workforce"