Template:Tamil Optical Character Recognition Support Project

From Noolaham Foundation
Revision as of 05:42, 31 January 2015 by Gopi

The ‘Tamil Optical Character Recognition Support Project’ aimed to provide raw scanned images of rare Tamil documents to assist the Tamil OCR development project. Noolaham Foundation collaborated with the Department of Computer Science, University of Jaffna, to implement this project. The Tamil digitization project is a joint venture of Theekshana (School of Computing, University of Colombo) and the Department of Computer Science, University of Jaffna, and is funded by ICTA. Its main goal is to develop the tools needed to automatically recognize the most common printed Tamil fonts in scanned images of books and documents, so that such content can be digitized.

Through this project, Noolaham Foundation provided raw scanned images of 51 rare documents to the Department of Computer Science, University of Jaffna, for training and testing the Tamil OCR system. Of these documents, 15 books (6450 raw images) were already available in the Noolaham Digital Archive. The other 36 documents were digitized specifically for this project, for use in training and testing the Tamil OCR system under development. All 51 documents are available online through Noolaham Foundation’s Digital Library, www.noolaham.org. The project was a successful initiative and received special appreciation from the research community.