Publications:A base-line character recognition for syriac-aramaic

Title	A base-line character recognition for Syriac-Aramaic
Author	Elizabeth Tse and Josef Bigun
Year	2007
PublicationType	Conference Paper
Journal
HostPublication	IEEE International Conference on Systems, Man and Cybernetics Conference Proceedings
Conference	IEEE International Conference on Systems, Man and Cybernetics, 7-10 Oct. 2007, Montreal, Que.
DOI	http://dx.doi.org/10.1109/ICSMC.2007.4414012
Diva url	http://hh.diva-portal.org/smash/record.jsf?searchId=1&pid=diva2:408392
Abstract	Serto is the cursive alphabet of Syriac-Aramaic, which is used by the largest corpus of Aramaic documents in libraries. A lingua franca, and often a source language, Aramaic has influenced major Judaic, Christian and Islamic thought as well as the development of science. The script is cursive, like Arabic, and consequently has a handwritten appearance compared to Latin. Serto, and Aramaic in practice, has no automatic character recognition (OCR) system. Most library documents are reproductions using printed characters. Readers would benefit strongly from an OCR system, as these reproductions are predominantly books printed in the pre-computer era. We propose a segmentation-free OCR using linear symmetry features with an individual threshold for the tensors of the characters, and an ordered search sequence. It yields approximately 90% correctly identified characters on average. As a first recognition scheme for Serto, it represents a base-line OCR for Syriac-Aramaic.
