Arabic optical character recognition software: A review

Faisal Alkhateeb, Iyad Abu Doush, Abdelraoaf Albsoul

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

This paper provides a thorough evaluation of six important Arabic OCR systems available on the market, namely Abbyy FineReader, Leadtools, Readiris, Sakhr, Tesseract, and NovoVerus. We test the OCR systems using randomly selected images from the well-known Arabic Printed Text Image (APTI) database (250 images) and a set of 8 images from an Arabic book. The APTI database contains 45,313,600 word images, both decomposable and non-decomposable. In the evaluation, we conduct two tests. The first test is based on the usual metrics used in the literature. In the second test, we provide a novel measure for the Arabic language, which can also be used for other non-Latin languages.

Original language: English
Pages (from-to): 763-776
Number of pages: 14
Journal: Pattern Recognition and Image Analysis
Volume: 27
Issue number: 4
State: Published - 1 Oct 2017

Keywords

  • APTI database
  • Arabic OCR systems
  • error rate
  • evaluation metrics
