Tuesday, December 19, 2006

14 hours in the life of Curtis's Botanical Magazine v.27-28

We are now caught up with the backlog of volumes waiting to be published, so we're getting accurate statistics on timing, scanning rate, publishing rate, etc. Here's a representative timeline for Curtis v.27-28, originally published in 1808 and scanned on 12/18/06:

December 18
  • 1:20PM - Imaging Technician scans first page of volume on Indus 5002
  • 4:27PM - Imaging Technician scans last page (#255)
  • 8:00PM - PageConvert runs on Indus 5002 & begins creating JP2s
  • 8:21PM - PageConvert finishes creating JP2s & begins copying TIFs & JP2s to SAN
  • 8:45PM - PageConvert finishes copying files and sets Item's Status to 30 ("on server, but not published") in database

December 19
  • 2:00AM - PagePublish runs, adds page-level metadata to Botanicus, sets Item's Status to 40 ("on server & publish ready") & outputs OCR job file
  • 2:03AM - PrimeOCR picks up job file & begins text conversion
  • 3:03AM - PrimeOCR finishes text conversion

So, we went from pulling a (relatively small) book off the shelf to having it online with OCR in just under 14 hours, roughly 8.5 to 9 hours of which was downtime between scheduled processes.
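
For anyone who wants to check the arithmetic, here is a rough Python sketch that recomputes the elapsed time and the idle gaps from the timestamps above. The stage names and the `stages` list are just illustrative labels, not anything from our actual tooling. It also assumes PagePublish runs right up until PrimeOCR picks up the job file at 2:03AM (the timeline doesn't give a separate finish time), and that "#255" means 255 pages were scanned.

```python
from datetime import datetime

# (stage, start, end) taken from the timeline above, in 24-hour time.
# Assumption: PagePublish ends when PrimeOCR picks up the job file (2:03AM).
stages = [
    ("scan",        datetime(2006, 12, 18, 13, 20), datetime(2006, 12, 18, 16, 27)),
    ("PageConvert", datetime(2006, 12, 18, 20, 0),  datetime(2006, 12, 18, 20, 45)),
    ("PagePublish", datetime(2006, 12, 19, 2, 0),   datetime(2006, 12, 19, 2, 3)),
    ("PrimeOCR",    datetime(2006, 12, 19, 2, 3),   datetime(2006, 12, 19, 3, 3)),
]

def minutes(delta):
    """Convert a timedelta to whole minutes."""
    return int(delta.total_seconds() // 60)

# Total wall-clock time from first scan to end of OCR.
total_minutes = minutes(stages[-1][2] - stages[0][1])

# Idle time is the gap between one stage ending and the next one starting.
idle_minutes = sum(
    minutes(nxt[1] - cur[2]) for cur, nxt in zip(stages, stages[1:])
)

# Scanning rate, assuming 255 pages were scanned during the "scan" stage.
scan_minutes = minutes(stages[0][2] - stages[0][1])
pages_per_hour = 255 / scan_minutes * 60

print(f"total:   {total_minutes // 60}h {total_minutes % 60}m")
print(f"idle:    {idle_minutes // 60}h {idle_minutes % 60}m")
print(f"active:  {total_minutes - idle_minutes} minutes")
print(f"scanning rate: {pages_per_hour:.0f} pages/hour")
```

By this count the whole run took 13 hours 43 minutes, with 8 hours 48 minutes of that spent waiting for the next scheduled job (the 8:00PM PageConvert run and the 2:00AM PagePublish run), which is consistent with the downtime figure quoted above.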