Automation of PDF documents

    Channel 2: Fully automated processing of PDF documents

    DIG.easyconnect as a channel offers exactly what the name promises: incredibly easy access to full EDI convenience via the DIG portal, so that incoming PDF documents never have to be entered manually again!
    Depending on the customer's requirements, data from PDF documents is extracted using OCR or deep OCR (both technical solutions are available), then automatically checked and processed on the DIG portal and transferred to the target system in the desired format.

    What is OCR?

    OCR stands for Optical Character Recognition, commonly known as text recognition. Success rates of up to 99% are technically possible after appropriate training phases, meaning that manual and repetitive tasks relating to document entry and invoice recognition can be largely automated. In addition to "normal" OCR, DIG also offers deep OCR, in which artificial intelligence eliminates the need for training phases.

    Automated processing of PDF documents

    Orders, e.g. via our eProcurement systems, result in incoming PDF documents such as order confirmations, invoices, delivery notes and incoming customer orders, all of which need to be processed automatically. In the first step, these are converted into machine-readable text using OCR. The PDF documents are analyzed by special software and the results are checked for validity and read rate. The success rate depends on the class, quality, structure and formatting of the original document.

    The extracted data is then converted into a structured XML file, which makes all the advantages of an EDI document flow available for the following document types:

    • Order confirmations
    • Invoices
    • Delivery notes
    • Incoming customer orders
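
    The conversion of extracted data into XML can be sketched as follows. This is only a minimal illustration using Python's standard library; the element names (Document, Header, Items, Item) and the field names are assumptions made for the example and do not describe the XML format actually used on the DIG portal.

```python
import xml.etree.ElementTree as ET


def fields_to_xml(doc_type, header, items):
    """Build a simple XML string from OCR-extracted header and item fields.

    The element names used here are hypothetical, chosen for illustration.
    """
    root = ET.Element("Document", attrib={"type": doc_type})

    # Header fields such as document number or order number
    header_el = ET.SubElement(root, "Header")
    for name, value in header.items():
        ET.SubElement(header_el, name).text = str(value)

    # One <Item> element per recognized line item
    items_el = ET.SubElement(root, "Items")
    for item in items:
        item_el = ET.SubElement(items_el, "Item")
        for name, value in item.items():
            ET.SubElement(item_el, name).text = str(value)

    return ET.tostring(root, encoding="unicode")


# Example values are invented for the sketch
xml_text = fields_to_xml(
    "invoice",
    {"InvoiceNumber": "INV-1001", "OrderNumber": "PO-4711"},
    [{"ArticleNumber": "A-1", "Quantity": 5}],
)
print(xml_text)
```

    Once the data is in a structured form like this, a target system can read individual fields by name instead of parsing free text.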

    Delivery note recognition in practice

    Whether on construction sites or in different warehouses: delivery notes often go through a lot before they are scanned! At item level, the supplier's article numbers often do not match your own. DIG therefore enables automated assignment of the delivery note to the order in the ERP (using an OCR-recognized order number). This not only makes checking easier, but also enables a learning system: from the manual assignment of the individual items, it derives the ability to check these items automatically the next time, despite the different article numbers!
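
    The idea of learning from manual assignments can be pictured with a small sketch. It assumes, purely for illustration, that orders are looked up by the recognized order number and that each manual assignment is stored as a mapping from the supplier's article number to your own; the class and function names are hypothetical and not part of any DIG interface.

```python
class ArticleMapping:
    """Remembers manual assignments of supplier article numbers (hypothetical)."""

    def __init__(self):
        # (supplier_id, supplier_article) -> own article number
        self._learned = {}

    def learn(self, supplier_id, supplier_article, own_article):
        self._learned[(supplier_id, supplier_article)] = own_article

    def resolve(self, supplier_id, supplier_article):
        # Returns None when no assignment has been learned yet
        return self._learned.get((supplier_id, supplier_article))


def match_delivery_note(note, orders, mapping):
    """Split delivery note items into automatically matched and open items."""
    order = orders.get(note["order_number"])
    if order is None:
        return [], list(note["items"])

    own_articles = {line["article"] for line in order["lines"]}
    matched, unmatched = [], []
    for item in note["items"]:
        own = mapping.resolve(note["supplier_id"], item["supplier_article"])
        if own in own_articles:
            matched.append((item, own))
        else:
            unmatched.append(item)  # needs manual assignment
    return matched, unmatched


# Invented sample data
orders = {"PO-4711": {"lines": [{"article": "OWN-100", "quantity": 5}]}}
note = {
    "order_number": "PO-4711",
    "supplier_id": "S1",
    "items": [{"supplier_article": "SUP-9", "quantity": 5}],
}
mapping = ArticleMapping()

# First delivery: the article number is unknown, so the item stays open
first_matched, first_open = match_delivery_note(note, orders, mapping)

# An employee assigns the item manually; the mapping is remembered
mapping.learn("S1", "SUP-9", "OWN-100")

# Next delivery: the same item is matched automatically
second_matched, second_open = match_delivery_note(note, orders, mapping)
```

    In this sketch the first pass leaves the item open for manual assignment, and the second pass matches it without any human input.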

    How OCR learns to read PDF documents correctly

    With standard OCR, the reliability with which each field is recognized and its value is read is rated with a point value. This is necessary because documents from different senders are formatted differently, meaning that individual fields and items can contain completely different data. A threshold value can be defined below which manual validation should take place. The employee responsible checks the content in a web interface, marks any unrecognized fields and makes the necessary corrections. These corrections are saved, so the system learns continuously. As a cloud solution, the software learns from the input of all users and therefore offers everyone faster learning progress.
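
    The routing of fields to manual validation can be sketched in a few lines. The sketch assumes that the point value is a score between 0 and 1 where a higher score means a more reliable reading, so fields scoring under the threshold are sent to a person; the scale, the default threshold of 0.9 and all names are assumptions for illustration, not the values used by the actual software.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    value: str
    score: float  # hypothetical reliability score, 0.0 (unsure) to 1.0 (certain)


def fields_needing_review(fields, threshold=0.9):
    """Return the fields whose score is too low to be accepted automatically."""
    return [f for f in fields if f.score < threshold]


def apply_correction(field, corrected_value, corrections):
    """Record a manual correction so it can later serve as training input."""
    corrections.append(
        {"field": field.name, "ocr": field.value, "corrected": corrected_value}
    )
    field.value = corrected_value
    field.score = 1.0  # a human-validated value is treated as certain


# Invented sample result for one document
fields = [
    Field("InvoiceNumber", "INV-1001", 0.98),
    Field("OrderNumber", "P0-4711", 0.62),  # letter O misread as zero
]
corrections = []

to_review = fields_needing_review(fields)
for field in to_review:
    apply_correction(field, "PO-4711", corrections)
```

    Raising the threshold sends more fields to manual validation and lowers the risk of wrong values passing through; lowering it does the opposite.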

    DIG opens up an even faster and more successful path with the integration of deep OCR: this innovative type of document analysis uses AI (artificial intelligence) to learn how to capture data correctly from documents with different layouts. For example, item data from incoming invoices is transferred directly to the accounting software so that posting records can be created immediately.

    DIG training support for standard OCR

    DIG supports OCR training for recognizing data from PDF invoices and other documents by examining the individual PDF documents that do not produce satisfactory results. In the first step, possible optimizations through system settings are reviewed and discussed before they are implemented. If required, the recognition rate can also be increased through individual programming to such an extent that, as a rule, only individual cases need to be validated in the company.

    Test the magical reading of your PDF documents!

    See for yourself how OCR and deep OCR deliver fantastic results and full EDI convenience with your own documents, saving time and avoiding transmission errors!