DEVICE AND METHOD FOR CORRECTING BOTH MIS-SPACING WORDS AND MIS-SPELLED WORDS USING N-GRAM
Abstract
A device and a method are provided for correcting word-spacing errors and spelling errors at the same time by using syllable n-grams. A syllable n-gram language model is formed from an error-free corpus, grapheme unit conversion probabilities and syllable conversion patterns are extracted, grapheme-unit and syllable-unit candidates are generated for the text to be corrected, and the optimal path through those candidates is found with the language model. A syllable n-gram database builder (10) builds a syllable n-gram database (S2) by extracting syllable n-grams from a refined corpus database (S1). A grapheme unit/syllable conversion database builder (20) builds a grapheme unit conversion probability database (S5) and a syllable conversion pattern database (S6) by extracting grapheme unit conversion probabilities and syllable conversion patterns from an error-included corpus (S3) and a corrected corpus (S4). A grapheme dividing/candidate generating part (30) generates candidates by separating an input sentence into graphemes and looking the graphemes up in the grapheme unit conversion probability database and the syllable conversion pattern database. An optimal path estimator (40) estimates the optimal path through the generated candidates by using the output of the syllable n-gram database.
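The flow described in the abstract (n-gram model from a clean corpus, candidate generation from a conversion table, optimal path search) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: Latin letters stand in for syllables, the model is a bigram with add-one smoothing, the conversion table and its probabilities are invented, the 0.9 prior for keeping a symbol is arbitrary, and the function names `build_bigram_model`, `candidates_for`, and `correct` are hypothetical. A candidate may contain a space, so spacing and spelling are scored in the same lattice.

```python
import math
from collections import Counter


def build_bigram_model(clean_sentences):
    """Count symbol unigrams and bigrams from an error-free corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in clean_sentences:
        symbols = ["<s>"] + list(sentence) + ["</s>"]
        unigrams.update(symbols)
        bigrams.update(zip(symbols, symbols[1:]))
    return unigrams, bigrams


def bigram_logprob(prev, cur, unigrams, bigrams):
    """Add-one smoothed log P(cur | prev); the smoothing choice is an assumption."""
    vocab = len(unigrams)
    return math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab))


def candidates_for(symbol, conversions):
    """Return (replacement, log-prob) pairs: the symbol itself plus table entries.

    `conversions` is a hypothetical stand-in for the conversion databases:
    it maps an input symbol to {replacement_string: probability}.
    """
    options = {symbol: math.log(0.9)}  # assumed prior for leaving the symbol as is
    for replacement, prob in conversions.get(symbol, {}).items():
        options[replacement] = math.log(prob)
    return list(options.items())


def correct(sentence, unigrams, bigrams, conversions):
    """Viterbi search over the candidate lattice, keyed by the last emitted symbol."""
    best = {"<s>": (0.0, "")}  # state -> (log score, output so far)
    for symbol in sentence:
        nxt = {}
        for prev, (score, out) in best.items():
            for replacement, conv_lp in candidates_for(symbol, conversions):
                s, last = score + conv_lp, prev
                for emitted in replacement:  # a candidate may emit several symbols
                    s += bigram_logprob(last, emitted, unigrams, bigrams)
                    last = emitted
                if last not in nxt or s > nxt[last][0]:
                    nxt[last] = (s, out + replacement)
        best = nxt
    # Add the end-of-sentence transition and pick the highest-scoring path.
    return max(
        (score + bigram_logprob(last, "</s>", unigrams, bigrams), out)
        for last, (score, out) in best.items()
    )[1]


# Toy data: the clean corpus only ever contains "ab cd".
unigrams, bigrams = build_bigram_model(["ab cd"] * 50)
# Invented table: "b" may be followed by a missing space, "x" may be a typo for "c".
table = {"b": {"b ": 0.1}, "x": {"c": 0.1}}
print(correct("abxd", unigrams, bigrams, table))  # prints "ab cd"
```

In this toy run the missing space and the misspelled symbol are repaired in a single pass, because both kinds of candidates compete on the same path score. A real system for Korean would decompose syllables into graphemes before the lookup, which this sketch leaves out.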