I looked into a couple of commercial SDKs for some limited OCR of screenshot bitmaps, for instance to verify information in a window or to check image files for certain text content. They were quite expensive and required royalty fees. Then I recalled that Microsoft Office has built-in OCR: Microsoft Office Document Imaging, found under Office Tools in version 2003 onward (START > ALL PROGRAMS > MICROSOFT OFFICE > MICROSOFT OFFICE TOOLS). The component is called MODI and lives in MDIVWCTL.DLL. It is scriptable!
MODI understands TIFF, multi-page TIFF, and BMP. Search MSDN for MODI to find its objects and methods.
Here's some example code to open a TIF or BMP file, OCR the first page, and then put the text result into a single string in a message box.
VBStart
Dim miDoc
Dim miLayout
Dim stringOut
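Set miDoc = CreateObject("MODI.Document")
' Load an existing TIFF or BMP file.
miDoc.Create "C:\pathname.tif"
' Perform OCR on the first page (the Images collection is zero-based).
' You can change the mouse pointer to an hourglass or something here.
miDoc.Images(0).OCR
' Change the mouse pointer back to the normal default here.
Set miLayout = miDoc.Images(0).Layout
' Layout.Text holds the recognized text of the page as a single string.
stringOut = miLayout.Text
MsgBox stringOut
' Close the document without saving changes, then release the objects.
miDoc.Close False
Set miLayout = Nothing
Set miDoc = Nothing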
VBEND
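Since MODI also reads multi-page TIFFs, the same calls can be looped over every page. Here is a rough sketch of that, plus a check for whether an image contains a given phrase (the original reason I wanted OCR). The function names OcrAllPages and ImageContainsText and the search string "Invoice" are made up for illustration. I am also assuming that OCR raises a script error on a page where it finds no text, so failed pages are simply skipped. Wrap it in the same VBStart/VBEND markers as above if your host needs them.

```vbscript
' Returns the OCR text of every page in a TIFF or BMP file,
' one page per line group. Pages that fail to OCR are skipped.
Function OcrAllPages(imagePath)
    Dim doc, pageIndex, result
    result = ""
    Set doc = CreateObject("MODI.Document")
    doc.Create imagePath
    For pageIndex = 0 To doc.Images.Count - 1
        ' Assumption: OCR raises an error when a page has no
        ' recognizable text, so trap errors for this page only.
        On Error Resume Next
        doc.Images(pageIndex).OCR
        If Err.Number = 0 Then
            result = result & doc.Images(pageIndex).Layout.Text & vbCrLf
        End If
        Err.Clear
        On Error GoTo 0
    Next
    ' Close without saving the OCR results back into the file.
    doc.Close False
    Set doc = Nothing
    OcrAllPages = result
End Function

' Case-insensitive check for a phrase anywhere in the OCR text.
Function ImageContainsText(imagePath, expected)
    ImageContainsText = (InStr(1, OcrAllPages(imagePath), expected, vbTextCompare) > 0)
End Function

' "Invoice" is a hypothetical phrase to look for.
If ImageContainsText("C:\pathname.tif", "Invoice") Then
    MsgBox "Text found"
Else
    MsgBox "Text not found"
End If
```

Keep the search phrase short. OCR output can break a line in the middle of a phrase, so a long string may fail to match even when the words are all on the page.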
MODI also has its own search methods, document viewer, etc. I think Office 2007 and Vista do not install it by default the way Office 2003 does. (I don't own Vista.)
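Because MODI may not be installed on every machine, a script can test for it before trying to OCR anything. This is a minimal sketch that just tries to create the object and traps the failure; ModiIsAvailable is a name I made up.

```vbscript
' Returns True if the MODI.Document COM object can be created.
Function ModiIsAvailable()
    Dim probe
    On Error Resume Next
    Set probe = CreateObject("MODI.Document")
    ' CreateObject sets Err when the component is not registered.
    ModiIsAvailable = (Err.Number = 0)
    Err.Clear
    On Error GoTo 0
    Set probe = Nothing
End Function

If Not ModiIsAvailable() Then
    MsgBox "MODI is not installed. Add Microsoft Office Document Imaging from Office setup."
End If
```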