Paper Title
Spell Detection and Correction in Independent System (Punjabi and Hindi)
Abstract
Several spell-checking systems are available for the Punjabi language, but systems that offer Hindi
spell correction are scarce. This paper presents a system that checks spelling and corrects errors using various
techniques for both Punjabi text and Hindi text. In the proposed system, the input is a paragraph that may contain
incorrect words, and the system generates corrected text after eliminating the errors. The system uses a hybrid approach to
implement misspelling detection and correction. This hybrid approach is a combination of “database approach”,
“modified rule based approach”, “Statistical Machine Approach up to n grams”, and “Edit Distance approach”, and it uses linguistic
features of both the Punjabi and Hindi languages. The system detects and corrects both typographic and cognitive
types of errors. Corpora are essential for this. A corpus is developed for various Punjabi word entities, which include the
names of males, females, countries, locations, states, rivers, and places, and grammatical words from a Punjabi dictionary; a
corpus is also created for Hindi word entities, which include the names of males, females, countries, locations, states, rivers, and
places, and grammatical words from a Hindi dictionary. The corpora are created using an algorithm. A paragraph is given,
and the system offers two options: insert the data into the corpora, or proceed directly to spell
checking. The corpus tables for Punjabi and Hindi are linked. Punjabi text is inserted into
tbpbidict, and a Hindi paragraph is inserted into tbhindict. The proposed system also works as a language
detector.
Keywords - Typographic Error, Cognitive Error, Statistical Machine Approach
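
Illustrative sketch (not the authors' implementation). The flow described in the abstract, language detection followed by dictionary lookup and edit-distance correction, could look roughly like the Python below. Only the table names tbpbidict and tbhindict come from the abstract. Everything else is an assumption made for illustration: the function names, the in-memory dictionaries standing in for database tables, the distance threshold of 2, and the use of plain word frequency (a unigram stand-in for the n-gram component). The rule-based component is omitted.

```python
import unicodedata

# Unicode blocks of the two scripts.
GURMUKHI = range(0x0A00, 0x0A80)    # script used for Punjabi
DEVANAGARI = range(0x0900, 0x0980)  # script used for Hindi


def detect_table(text):
    """Pick a dictionary table by counting characters of each script.

    Returns None when the text contains neither script.
    Ties go to Punjabi, an arbitrary choice for this sketch.
    """
    pa = sum(ord(ch) in GURMUKHI for ch in text)
    hi = sum(ord(ch) in DEVANAGARI for ch in text)
    if pa == 0 and hi == 0:
        return None
    return "tbpbidict" if pa >= hi else "tbhindict"


def edit_distance(a, b):
    """Levenshtein distance over Unicode code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def correct_word(word, freq, max_dist=2):
    """Return word if it is in the dictionary, else the closest entry.

    freq maps dictionary words to counts; ties in distance are broken
    by higher count. Words with no entry within max_dist are unchanged.
    """
    word = unicodedata.normalize("NFC", word)
    if not freq or word in freq:
        return word
    dist, _, best = min((edit_distance(word, w), -n, w)
                        for w, n in freq.items())
    return best if dist <= max_dist else word


def correct_paragraph(text, tables):
    """Detect the language, then correct each whitespace-separated word.

    tables maps a table name to its word-frequency dictionary.
    Punctuation handling is left out of this sketch.
    """
    table = detect_table(text)
    if table is None:
        return text
    freq = tables.get(table, {})
    return " ".join(correct_word(w, freq) for w in text.split())
```

With a toy table such as `{"tbhindict": {"नदी": 5, "भारत": 3}}`, calling `correct_paragraph("भारत नदि", tables)` routes the text to the Hindi table and replaces the misspelled second word with the closest dictionary entry. A production system would query the database tables rather than scan every entry, and would rank candidates using n-gram context.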