Journal of Intelligent Systems

Log Posterior Approach in Learning Rules Generated using N-Gram based Edit distance for Keyword Search


Abstract

Sophisticated search mechanisms are required to meet the needs of search engine users probing the voluminous web database. Matching a query to keywords with a probabilistic approach is attractive in most application areas, viz. spell checking and data cleaning, because it allows approximate search. A probabilistic approach with maximum likelihood estimation is used to handle real-world problems; however, it suffers from overfitting. In this paper, a rule-based approach to keyword searching is presented. The process consists of two phases: the rule generation phase and the learning phase. The rule generation phase uses a new technique called N-Gram based Edit distance (NGE) to generate the rule dictionary. A Turing machine model is used to describe rule generation with the NGE technique. In the learning phase, a log model with maximum a posteriori estimation is used to select the best rule. When evaluated in real time, our system produces the best result in terms of efficiency and accuracy.
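The abstract does not give the details of NGE or of the log model, so the following is only a minimal sketch of the two general ideas it names, not the authors' method. It assumes a plain character-bigram filter followed by standard Levenshtein distance for approximate matching, and add-alpha smoothing (the MAP estimate under a symmetric Dirichlet prior with parameter alpha + 1) for scoring rules in log space. All function names, parameters, and sample words are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(word, n=2):
    """Return the set of character n-grams of `word`, padded with '#'."""
    padded = "#" * (n - 1) + word + "#" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def edit_distance(a, b):
    """Standard Levenshtein distance; insert, delete, substitute cost 1 each."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def approximate_matches(query, dictionary, n=2, min_shared=1, max_dist=2):
    """Keep words sharing at least `min_shared` n-grams with the query,
    then rank the survivors by edit distance (assumed pipeline)."""
    query_grams = char_ngrams(query, n)
    hits = []
    for word in dictionary:
        if len(query_grams & char_ngrams(word, n)) >= min_shared:
            dist = edit_distance(query, word)
            if dist <= max_dist:
                hits.append((dist, word))
    return [word for _, word in sorted(hits)]


def log_map_scores(counts, rules, alpha=1.0):
    """Log of (count + alpha) / (total + alpha * K) for each rule.

    With alpha = 0 this reduces to the maximum likelihood estimate, which
    assigns probability zero to any rule unseen in training; alpha > 0
    keeps every score finite.
    """
    total = sum(counts.get(rule, 0) for rule in rules)
    denom = total + alpha * len(rules)
    return {rule: math.log((counts.get(rule, 0) + alpha) / denom)
            for rule in rules}


if __name__ == "__main__":
    print(approximate_matches("serch", ["search", "starch", "banana"]))
    observed = Counter({"rule_a": 3, "rule_b": 1})  # hypothetical rule counts
    scores = log_map_scores(observed, ["rule_a", "rule_b", "rule_c"])
    print(max(scores, key=scores.get), scores)
```

The n-gram filter avoids computing edit distance against every dictionary entry, and working with log scores turns products of small probabilities into sums. The unseen `rule_c` still receives a finite score, which is the kind of overfitting relief a prior provides over pure maximum likelihood.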
