Friday, May 12, 2006

Searching the text of 300 digitized titles

We're running PrimeRecognition's PrimeOCR for text conversion. I've spent several months working out the kinks and learning how OCR really works (or doesn't work) with our historic literature. We now have a cache of 83,000 text pages generated by PrimeOCR, and I wondered how easy it would be to drop in existing services on our network to start interacting with this text. Turns out it was REALLY easy. We're a Windows/.NET shop with several machines running IIS 6.0, so I built an out-of-the-box Indexing Service implementation and incorporated it into our beta site at:

http://www.botanicus.org/search.asp

Give it a try - results are interesting!
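
For anyone curious what a full-text index over a pile of OCR text pages does in principle, here is a minimal Python sketch of an inverted index with AND-style search. To be clear, this is not how Indexing Service works internally and it is not the code behind the beta site. The function names, the simple tokenizer, and the sample page IDs and text are all made up for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(pages):
    """Map each word to the set of page IDs whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index

def search(index, query):
    """Return sorted page IDs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)

# Hypothetical OCR output keyed by made-up page IDs (assumption, not real data).
pages = {
    "title1_p001": "Flora of the southern states, with notes on Quercus alba",
    "title1_p002": "Quercus rubra is common in the northern forests",
    "title2_p010": "Notes on the ferns of the southern forests",
}
index = build_index(pages)
print(search(index, "southern forests"))
```

A lookup like this only matches exact tokens, so a word that the OCR misread on a page will presumably not be found on that page. That is one reason results over historic literature can be surprising.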
