Introduction to Scanning Software
With a growing number of businesses making the environmentally friendly and financially rewarding decision to digitize their documents, a wide variety of software has become available for facilitating and automating this transition. These include:
Batch Scanning applications for tagging and organizing documents as you scan
Image Processing software for clarifying and enhancing scanned images
Optical Character Recognition (OCR) software for digitizing machine-printed text
Document Management software for viewing and modifying indexed documents
Forms Processing software with Intelligent Character Recognition (ICR) for automating data entry from forms and surveys with hand-printed data or from semi-structured forms like invoices
Traditionally, batch scanning software was designed for large companies with dedicated scanning departments. In the last few years, a number of software developers have created low-cost, easy-to-use scanning software designed for single-user desktop scanning. These applications have given small businesses, branch offices, and departments the ability to digitize their documents effectively, since they avoid both the high cost of enterprise scanning software and the manual labor required by basic low-cost scanning tools.
Comparing Batch Scanning Software
While desktop users now have a wide variety of solutions to choose from, it is often difficult to find software that both satisfies the project requirements and fits the budget. The difficulty is compounded by the fact that not all software is created equal: two applications designed to do the same job might differ greatly in processing speed, accuracy, versatility, and ease of use. Without trying each application individually (a time-consuming and expensive proposition), how are you supposed to decide which package has the fastest workflow or the best accuracy?
With that in mind, ScanStore has compiled a comprehensive Scanning Software Comparison Chart and conducted an in-depth comparative evaluation of the top-selling Desktop Batch Scanning applications. By running a standardized sample of documents through each program, configured for the same end result in the shortest possible processing time, we have come up with a reliable appraisal of their strengths and weaknesses. We approached the setup from the perspective of an informed but non-expert user attempting to use each software package for the first time.
The following desktop batch scanning applications were evaluated in this test. Here they are with a brief description taken from marketing materials.
SimpleSoftware - SimpleIndex
How many clicks does it take to scan your documents? With SimpleIndex, the answer is one! By bundling a powerful OCR engine with a versatile configuration editor, you can automate most common jobs to be scanned, processed, and indexed in as little as one click from the end user.
Digitech Systems - PaperVision Capture Desktop
Capture and control critical business information stored on paper right from your workstation, an enterprise-grade document capture solution. Basic scanning and indexing tasks are easy with the Microsoft® ribbon interface, and more advanced processing and export can be achieved through the built in script editor.
I.R.I.S. - Powerscan
Scan, structure, sort, index and convert volumes of documents with a powerful, easy-to-use production scanning and OCR solution that supports the most popular high-speed scanners.
Kodak - Capture Pro
Efficiently capture critical index data, then automatically deliver it all to databases and applications. Benefit from new advances in quality control to identify and adjust challenging images without rescanning, enhanced integration with Microsoft SharePoint, and more practical innovations.
Kofax - Express
Scan, organize, and store documents at speeds that make short work of batches big and small. Easy enough for beginners, powerful enough for pros. Let Kofax show you the new Express route to document scanning.
Office Gemini - Diamond Vision
Get powerful production level image and data capture with no per click charges or limitations on scanning and OCR functions with Diamond Vision, an all inclusive scanning solution with modular, multi-user functionality.
Benchmark Configuration Settings
Each application was configured to perform the following tasks:
Scan ten one-sided sample invoices with one company format
Separate the pages into documents using a barcode
Extract a barcode to an index field
Extract two OCR data fields from a fixed location (Zone OCR)
Validate the OCR data with a database lookup
Fill in two more index fields with data from the database
Export a full-text OCR in PDF form
Name the output file according to two of the index fields
The goal was to make the actual indexing process as simple as possible for the end user, automating as many of these steps as possible through OCR and database lookup.
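To make the benchmark steps more concrete, the validation, auto-fill, and file-naming steps can be sketched in a few lines of Python. This only illustrates the general technique and is not how any of the tested products implement it. The table name, column names, field names, and sample values are all hypothetical, and an in-memory SQLite database stands in for whatever data source a real job would use.

```python
import sqlite3


def build_sample_db():
    """Create a hypothetical lookup table; a real job would use an existing database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices (invoice_no TEXT PRIMARY KEY, po_no TEXT, "
        "vendor TEXT, amount TEXT)"
    )
    conn.execute(
        "INSERT INTO invoices VALUES ('INV-1001', 'PO-77', 'Acme Supply', '150.00')"
    )
    return conn


def validate_and_fill(conn, ocr_fields):
    """Validate two OCR fields against the database and auto-fill two more.

    Returns the completed index record, or None when the OCR values do not
    match any row (the case where a user would have to correct a field).
    """
    row = conn.execute(
        "SELECT vendor, amount FROM invoices WHERE invoice_no = ? AND po_no = ?",
        (ocr_fields["invoice_no"], ocr_fields["po_no"]),
    ).fetchone()
    if row is None:
        return None
    return {**ocr_fields, "vendor": row[0], "amount": row[1]}


def output_name(record):
    """Name the output PDF after two of the index fields (hypothetical choice)."""
    return f"{record['vendor']}_{record['invoice_no']}.pdf".replace(" ", "_")
```

A misread character in either OCR field (for example, the letter O in place of a zero) makes the lookup return no row, which is why OCR accuracy has such a large effect on how much manual correction each application required.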
Comparing the Job Setup Process
SimpleIndex has two ways of setting up a scanning job: through a wizard and through a multi-tabbed options window. There are many setup options, arranged in seven tabs full of checkboxes, dropdown lists, and text fields. These may appear daunting to the inexperienced user, even though they do have short but cogent explanations of their respective functions when you hover over them. The wizard somewhat simplifies the setup process. The most useful and unique feature of this software is the Dynamic OCR, which allows you to recognize a large section of text and pull out a shorter string that fits a given template, negating the need to draw perfectly placed OCR zones.
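The general idea of pulling a templated string out of a larger block of recognized text can be illustrated with a regular expression. The sketch below is not SimpleIndex's implementation. It only assumes that the OCR engine returns all the text from a loosely drawn zone and that the wanted value follows a known pattern. The invoice-number pattern and the sample text are hypothetical.

```python
import re

# Hypothetical template: "INV-" followed by exactly four digits.
INVOICE_TEMPLATE = re.compile(r"\bINV-\d{4}\b")


def extract_by_template(ocr_text, template=INVOICE_TEMPLATE):
    """Return the first substring of the OCR text that fits the template,
    or None if nothing matches."""
    match = template.search(ocr_text)
    return match.group(0) if match else None


# Text as it might come back from a generously sized OCR zone.
sample = "Acme Supply Co.\nInvoice No: INV-1001  Date: 03/14\nTerms: Net 30"
print(extract_by_template(sample))
```

Because the match depends on the shape of the value rather than its position, the zone can be drawn generously and the pages do not have to line up exactly.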
PaperVision Capture Desktop has a much steeper learning curve. Basic tasks like setting up scanning options and index fields are as simple as in any of the other software. When it comes to database lookup and exporting, however, the software requires a more expert approach. Database lookup can only be done with a SQL server, not with simpler Access files or ODBC data sources as in the other software. Exporting requires custom coding. While these features give more power to the experienced user, they are a significant stumbling block for a neophyte.
I.R.I.S. Powerscan is fairly straightforward to set up for anyone who has used batch scanning software before. It is organized into projects, each of which can contain a number of batches that share options such as OCR zones, index fields, and data sources. The database link is set up through an ODBC data source, although the auto-fill setup is a bit counterintuitive, as it points to the validation data rather than the actual data to be filled. Additional unique options include automatic line removal and identification of file types based on page fingerprints.
Kodak Capture Pro is another middle-of-the-road application that is not difficult to set up for a somewhat experienced user. Once a job is set up, multiple batches can be scanned with that setup. Although it lacks some of the bells and whistles of the other software packages, it also does not require as much work to set up simple automated indexing jobs such as the one being tested.
Kofax Express has an interesting layout in that almost everything is done through the Microsoft ribbon toolbar. The toolbar is divided into numerous tabs: half hold settings shared across the chosen job, and the other half apply only to the batch currently being processed. Although the toolbar is not the most elegant or intuitive way to set up a job that might require many different options, it is nevertheless simple to use. The software is fairly limited in its indexing abilities, particularly in the number of available index fields, and database validation allows for only a single field.
Diamond Vision is designed as a modular multi-user system, allowing significant customization of user authorization and workflow but lacking elegance and ease of use for basic indexing tasks. Each batch is created under a profile, which includes an assortment of modules, each with its own options to set up. The modular approach takes some getting used to, as does the OCR setup, but it allows for a more professional, enterprise-level workflow for a fraction of the cost.
Comparing the Scanning & Indexing Workflow
The goal of the indexing process was to input five index fields with data from the documents as quickly as possible, with as little user input as possible.
SimpleIndex was by far the fastest competitor. With the proper setup, it was able to scan, accurately OCR, index, and export the batch of documents with almost no user input other than loading the paper in the scanner and double-clicking the configuration file. Dynamic OCR, which produces accurate OCR results even when pages are not perfectly aligned, ensured index accuracy and successful database lookups, preventing time-wasting corrections.
PaperVision Capture Desktop was one of the slowest competitors, as it required numerous corrections for the OCR data, and the user had to manually click the 'Manage & Merge' button to auto-fill the index fields from the database on each page.
I.R.I.S. Powerscan was slower than average due to the number of OCR mistakes that had to be corrected. This was compounded by the fact that the database lookup did not populate fields automatically after the OCR field was corrected; instead, the values had to be selected manually by the user.
Kodak Capture Pro was similarly slowed by OCR errors. However, in addition to offering dropdown lists, it filled in the auto-fill fields as soon as the validated field on which they were based had a valid value.
Kofax Express had better-than-average OCR accuracy, probably due to the image enhancement features of the built-in Virtual ReScan. However, the validation field had to be manually selected and then deselected to initiate the database lookup, which required more user interaction and slowed down the process.
Diamond Vision was moderately accurate in its OCR, but it was the most cumbersome to use when fixing the mistakes that it did make. Most notably, when an index field is marked as required and is not filled in correctly, the software locks its focus on that index field until it is filled in, preventing the correction of other fields that could populate the required one.
What is the Best Scanning Software?
Our conclusion is two-fold. When it came to configuration, Kodak Capture Pro was the easiest and most intuitive of the software we tested, followed by Kofax Express with its ribbon toolbar. I.R.I.S. Powerscan was fairly straightforward to set up as well, with the exception of the unintuitive auto-fill setup. SimpleIndex had a somewhat steeper learning curve, but it has by far the largest feature set and the most versatility. PaperVision Capture Desktop was the least intuitive to set up and is clearly aimed at more advanced users at the cost of simplicity. Diamond Vision was not very difficult to set up, but not very intuitive either, because of its unique modular approach.
When it came to indexing tasks, SimpleIndex was the clear winner in terms of accuracy, speed, and minimal user input. It was followed by Kofax Express, Kodak Capture Pro, I.R.I.S. Powerscan, Diamond Vision, and PaperVision Capture Desktop, in that order.
What is the Best Value in Scanning Software?
Although some software is inherently more powerful and versatile than others, the packages also differ in price and pricing structure. PaperVision Capture Desktop has a single license that includes all features and scanners. Diamond Vision comes in two editions: the single-user Desktop Edition and the networked Enterprise Edition. Kofax Express, Kodak Capture Pro, and I.R.I.S. Powerscan, on the other hand, are priced according to scanner speed, with Powerscan also offering a number of optional features. SimpleIndex has one price regardless of scanner speed or volume, but features like barcode recognition and OCR come as optional modules that can be excluded to reduce the total cost.
When we considered the pricing structure alongside the actual price of the software, we found that Kofax Express, Kodak Capture Pro, and I.R.I.S. Powerscan are the better deal for low-speed document scanning, since their lowest-priced, low-speed editions still include the full feature set. However, they are soon surpassed by SimpleIndex, which provides all of its features for a high-speed scanner at a lower cost than the high-speed versions of Kofax and Kodak, and saves additional money through the time it saves users during processing. It becomes even more economical if you do not need the full feature set but still want high-speed scanning. PaperVision Capture Desktop and Diamond Vision stand apart: they are neither the cheapest nor the most expensive software, they come in a limited number of versions, and they are preferable only for someone looking for a particular feature set (scripting customization and modular multi-user processing, respectively).
Evaluated and written by: Anton Smirnov