SHAPING THE FUTURE OF DOCUMENT PROCESSING: ROSSUM PRESENTS THE WINNERS OF THE DOCILE COMPETITION
LONDON, Sept. 25, 2023 /PRNewswire/ -- Rossum, a leader in the Intelligent Document Processing industry, is thrilled to reveal the remarkable results of its groundbreaking DocILE (Document Information Localization and Extraction) competition. This global event, which kicked off in February 2022, has left an indelible mark on the field of document processing.
Rossum launched the DocILE initiative in 2022, granting access to a treasure trove of over 6,700 meticulously annotated business documents in addition to 100,000 synthetically-generated documents.
This unprecedented benchmark dataset served as the litmus test for participants worldwide, enabling them to measure their solutions against established methodologies. Over the course of a year, diverse teams harnessed this dataset to sharpen their prowess in pinpointing critical data, such as VAT numbers and company addresses, within semi-structured business documents.
The competition concluded on May 24, 2023, and attracted a wide range of submissions. Participants showcased their innovations by developing varied approaches to the complex challenges inherent in document information extraction.
A team from the University of Science and Technology of China and iFLYTEK Research Institute presented a method called 'GraphDoc' and took first place, achieving the top results in both the Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR) tasks and surpassing other participants by a significant margin.
Their success was driven by an innovative use of transformer architecture, which gave them a head start in the competition. They introduced a noteworthy technique that involved learning which words have to be combined to get the correct extracted value, and leveraged heuristics based on data trends to further enhance their results.
The competition saw a mix of different methods, with some relying on computer vision and others on transformer architectures, demonstrating the rising popularity of the latter in the field. More importantly, the competition demonstrated that it is necessary to understand a document simultaneously as an image and as the text it contains, since neither purely computer vision methods nor traditional transformers working only with the text can achieve the same performance.
By combining these two approaches, participants were able to achieve a deeper and more accurate understanding of complex business documents, where computer vision addressed specific challenges while transformers handled different aspects. This emphasized the need for a comprehensive strategy that takes into account both the text and visual structure of documents for precise interpretation.
Štepán Šimsa, Research Scientist at Rossum, expressed his enthusiasm for the competition's impact, stating, "The DocILE initiative has not only spurred groundbreaking research but has also facilitated industry collaboration and innovation. By bridging methodological gaps, we're empowering the Intelligent Document Processing community to develop solutions that revolutionize business operations."
As part of the competition, the participants had to open-source their code and publish a paper describing their method. The prize pool totaled $8,000, of which $6,000 went to the winning GraphDoc solution, which received both the first-place award and the 'Best Paper Award'.
This competition embodies Rossum's unwavering mission to accelerate the evolution of the Intelligent Document Processing field on a global scale, establishing a benchmark for document understanding. This initiative serves as a catalyst, igniting the creation of novel techniques that enhance the precision and efficiency of document information extraction—a testament to Rossum's core values of innovation and excellence.
About the DocILE Initiative
DocILE was launched in August in California at the International Conference on Document Analysis and Recognition (ICDAR), the largest document understanding conference, where it attracted significant interest from the research community and promised faster development of AI techniques that could revolutionize Intelligent Document Processing (IDP).