Text matching
To identify pages by using text matching, you must first run a full page recognition. You can then search the recognition results for a string that is unique to each page type.
In the TravelDocs application, the first function attempts a full
page recognition and searches for the string Pickup on the current
page. If the function finds Pickup, it assigns the page type
Rental_Agreement. If the function does not find Pickup, it fails,
and the second function searches for the string Flight. If the
second function finds Flight, it assigns the page type Air_Ticket.
If it does not find Flight, the second function fails, and the third
function searches for the string Room. If the third function finds
Room, it assigns the page type Room_Receipt. If it does not find
Room, the page remains with the page type Other.
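The cascade described above can be sketched in a few lines of Python. This is only an illustration of the first-match-wins logic, not the product's own rule syntax or API: the names RULES and identify_page are hypothetical, the recognized text is assumed to be available as a plain string, and a case-sensitive substring match is assumed.

```python
# Hypothetical sketch of the TravelDocs text-matching cascade.
# RULES and identify_page are illustrative names, not product APIs.

# Ordered (search string, page type) pairs; the first match wins.
RULES = [
    ("Pickup", "Rental_Agreement"),
    ("Flight", "Air_Ticket"),
    ("Room", "Room_Receipt"),
]


def identify_page(recognized_text, rules=RULES, default="Other"):
    """Return the page type for the first search string found in the text.

    Assumes a case-sensitive substring match on the full page
    recognition results. If no string is found, the page keeps the
    default page type.
    """
    for search_string, page_type in rules:
        if search_string in recognized_text:
            return page_type
    return default
```

Because the rules are tried in order, a page that contains both Pickup and Room is typed as Rental_Agreement, which mirrors the way each function runs only after the previous one fails.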
As with the structure-based techniques, when you identify a page
by using text matching, the page is not matched to a fingerprint.
Therefore, even though recognition zones are available for your application
to locate data during recognition, the zones are not aligned to the
scanned image. After you identify a page with text-matching methods, you
can customize the application to call CreateFields.
This call places the recognition zones where they were defined on
the original fingerprint image for that page type. Unlike fingerprint
matching, it does not adjust the zone locations for shifting of the
scanned image. However, you can work around this limitation in either
of two ways: crop and de-skew the image during an image-processing
step, or use pattern-match anchors to align the zones.
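The idea behind anchor-based alignment can be illustrated with a small Python sketch. It is not the product's implementation: the Zone class and align_zones function are hypothetical, coordinates are assumed to be pixel positions, and only a simple translation is corrected. Skew and scaling are not handled here and would still need the image-processing step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Hypothetical recognition zone in pixel coordinates."""
    name: str
    left: int
    top: int
    right: int
    bottom: int


def align_zones(zones, anchor_expected, anchor_found):
    """Shift zones by the offset between two anchor positions.

    anchor_expected is the (x, y) position of the anchor on the
    original fingerprint image; anchor_found is where the same anchor
    was located on the scanned image. Both are assumptions made for
    this sketch.
    """
    dx = anchor_found[0] - anchor_expected[0]
    dy = anchor_found[1] - anchor_expected[1]
    return [
        Zone(z.name, z.left + dx, z.top + dy, z.right + dx, z.bottom + dy)
        for z in zones
    ]
```

For example, if the anchor was defined at (50, 60) and is found at (62, 55) on the scan, every zone moves 12 pixels to the right and 5 pixels up.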