Basic Scanning Versus Advanced Capture For SharePoint

March 11, 2013

It’s no secret that Microsoft SharePoint is a comprehensive platform for content management, team collaboration, process automation and a whole lot more. There is a lot of media focus on all the things you can do with SharePoint, and how to go about accomplishing those things. But there are still a couple of areas where SharePoint lacks certain capabilities. Of course, another factor in the success of the product is the rich ‘ecosystem’ that has grown up around SharePoint. Microsoft partners have stepped up to fill the void in those key areas by adding functionality that integrates with and extends SharePoint’s native features.

One such area is the conversion of paper documents to a digital format. This is generally referred to as ‘document scanning,’ though there are several terms that mean (almost) the same thing. While it’s true that Microsoft Office has a rudimentary ability to scan documents and even perform Optical Character Recognition (OCR) on images to convert them to text, it isn’t well-suited for much more than a single document at a time. There are also a large number of ‘scan to PDF’ programs available in the marketplace; most are inexpensive, or even included with scanners and multi-function devices with scanning capabilities. Most of them do a good job of taking the output of a scanning device and saving it in a digital format that can be archived in an electronic file repository like SharePoint. These are often referred to as ‘basic scanning solutions.’

But what if you have to process more than a few documents at a time? What if you have a large collection of paper archives that you want to migrate into your powerful new SharePoint document libraries? Scanning documents one at a time, saving them to SharePoint manually and indexing (or tagging) them individually can become a time-consuming (and therefore expensive) proposition.

Enter ‘Advanced Capture with Automatic Data Extraction’ solutions. Products that fall into this category offer far more than basic scanning capabilities. Such solutions can process the basic scanned image and apply OCR to turn the image into searchable text, but that’s only the beginning. From there, the best products of this type can interpret the image, apply rules-based search-and-replace and data lookup functions, and actually extract relevant data from the image. This information can be used to intelligently determine where each individual document starts and ends when a stack of paper is scanned. Such solutions can figure out what the document is and where to put it in the repository, based on the extracted data. They can even assign index attributes (aka metadata) to each document as it is loaded into the appropriate SharePoint site/library/folder, automatically. Add to that the ability to read bar codes, marks such as checked boxes and ‘scan-tron’ bubbles, and even handwriting, and you have a powerful way to process large numbers of pages with little to no human intervention required.
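To make the idea a little more concrete, here is a minimal Python sketch of the rules-based part of that pipeline: splitting a scanned stack into documents, classifying each one, extracting index values and picking a destination. It is an illustration only, not how any particular product works. It assumes OCR has already produced one text string per page, and every name in it (the `RULES` table, the field names, the library paths such as `Finance/Invoices`) is hypothetical. It does not call SharePoint at all; it only decides what would be uploaded where.

```python
import re
from dataclasses import dataclass, field

# Hypothetical rule table: each document type has a pattern that marks its
# first page, patterns for the index fields to extract, and a target library.
RULES = {
    "Invoice": {
        "first_page": re.compile(r"\s*INVOICE\b", re.IGNORECASE),
        "fields": {
            "InvoiceNumber": re.compile(
                r"Invoice\s*(?:No\.?|Number)\s*[:#]?\s*(\w+)", re.IGNORECASE
            ),
            "Vendor": re.compile(r"Vendor\s*:\s*(.+)", re.IGNORECASE),
        },
        "library": "Finance/Invoices",
    },
    "PurchaseOrder": {
        "first_page": re.compile(r"\s*PURCHASE\s+ORDER\b", re.IGNORECASE),
        "fields": {
            "PONumber": re.compile(r"PO\s*Number\s*[:#]?\s*(\w+)", re.IGNORECASE),
        },
        "library": "Procurement/PurchaseOrders",
    },
}


@dataclass
class Document:
    doc_type: str
    pages: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
    target: str = ""


def classify_first_page(text):
    """Return the document type whose first-page pattern opens this page, or None."""
    for doc_type, rule in RULES.items():
        if rule["first_page"].match(text):
            return doc_type
    return None


def split_and_index(pages):
    """Group OCR'd pages into documents, then extract metadata and pick a target."""
    docs = []
    for text in pages:
        doc_type = classify_first_page(text)
        # A recognized first page starts a new document; any other page is
        # treated as a continuation of the document currently being built.
        if doc_type is not None or not docs:
            docs.append(Document(doc_type or "Unknown"))
        docs[-1].pages.append(text)

    for doc in docs:
        rule = RULES.get(doc.doc_type)
        if rule is None:
            # Nothing matched: route to a (hypothetical) manual review location.
            doc.target = "Review/Unclassified"
            continue
        full_text = "\n".join(doc.pages)
        for name, pattern in rule["fields"].items():
            match = pattern.search(full_text)
            if match:
                doc.metadata[name] = match.group(1).strip()
        doc.target = rule["library"]
    return docs


if __name__ == "__main__":
    stack = [
        "INVOICE\nInvoice Number: 1001\nVendor: Acme Corp",
        "Page 2 of 2\nTotal due: $50",
        "PURCHASE ORDER\nPO Number: 77",
    ]
    for doc in split_and_index(stack):
        print(doc.doc_type, len(doc.pages), doc.metadata, doc.target)
```

A commercial capture product layers much more on top of this (image cleanup, bar code and mark recognition, database lookups, validation screens), but the basic flow of separating, classifying, extracting and routing is the part that removes the manual indexing work described above.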

SWC has had the opportunity to work with ‘advanced capture solutions’ from several vendors. Some can be quite expensive, but they also add far more than capture capabilities to SharePoint’s base functionality. We have found that the PSI:Capture suite of products from PSIGEN Software offers industry-leading advanced capture capabilities at a price that is very attractive to SWC’s typical mid-market customers. PSI:Capture is focused exclusively on providing a ‘Swiss Army knife’ of document scanning and capture features and functionality, for a fraction of the price of its competitors.

