Geosphere; February 2008; v. 4; no. 1; p. 159-169; DOI: 10.1130/GES00140.1
© Geological Society of America

Figure 11


Figure 11. Output of scanning page 173 of the Davidson (1886–1888) monograph using optical character recognition (OCR) software. Compared with the original, there are six mistakes, which are highlighted using yellow shading. None of the mistakes interferes with extensible markup language (XML) parsing, and all but one are easily corrected using the learn function in the OCR dictionary. The large highlighted area marks a passage in which the order of words in the original text has been very slightly altered. This reordering is due to the presence of separate text boxes on the page, in particular the interaction of the text box containing the figure description text with the main taxonomic description. It was not significant for XML parsing of the text of the taxonomic description.
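
The comparison of OCR output with the original text described in the caption could be approximated programmatically. The following is only a minimal sketch, not the procedure used for the figure: it assumes both texts are available as plain strings and uses Python's standard difflib module to list the word-level spans that differ. The function name word_diffs and the two sample strings are hypothetical and are not taken from the monograph.

```python
import difflib


def word_diffs(original: str, ocr: str):
    """Return (tag, original_words, ocr_words) for each span where the texts differ."""
    a, b = original.split(), ocr.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replaced, inserted, or deleted spans
    ]


# Hypothetical strings for illustration only; not text from the monograph.
original = "shell transversely oval with a fine median fold"
ocr = "shell transverscly oval with a fine median fold"

diffs = word_diffs(original, ocr)
for tag, expected, found in diffs:
    print(tag, expected, found)
```

A word-level comparison of this kind reports a single misread character as one replaced word, whereas a slightly reordered passage, such as the one produced by interacting text boxes, would typically appear as paired deletions and insertions.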





Copyright © 2008 by Geological Society of America