Venue: International Workshop on Finite State Methods and Natural Language Processing

Bounded-Depth High-Coverage Search Space for Noncrossing Parses

Abstract

A recently proposed encoding for non-crossing digraphs can be used to implement generic inference over families of these digraphs and to carry out first-order factored dependency parsing. It is now shown that the recent proposal can be substantially streamlined without information loss. The improved encoding is less dependent on hierarchical processing, and it gives rise to a high-coverage bounded-depth approximation of the space of non-crossing digraphs. This subset is represented by a finite-state machine that recognizes an infinite set of encoded graphs. The set includes more than 99.99% of the 0.6 million noncrossing graphs obtained from the UDv2 treebanks through planarisation. Rather than treating the low probability of the residual (the graphs outside this set) as a flat rate, it can be modelled with a joint probability distribution that is factorised into two underlying stochastic processes: the sentence length distribution and the related conditional distribution for deep nesting. This model indicates that deep nesting in the streamlined code requires extreme sentence lengths. High depth is effectively ruled out at common sentence lengths but emerges slowly at infrequent lengths, which prompts further inquiry.
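
The factorisation described in the abstract can be written out as a short sketch. The symbols are assumptions introduced here for illustration and do not come from the paper: N is the sentence length, D is the nesting depth of the encoded graph, and k is the depth bound of the finite-state approximation.

```latex
% Joint probability of sentence length n and depth beyond the bound k,
% factorised into a length distribution and a conditional depth distribution
P(N = n,\; D > k) \;=\; P(N = n)\, P(D > k \mid N = n)

% Residual mass outside the bounded-depth set, obtained by summing over lengths
P(D > k) \;=\; \sum_{n} P(N = n)\, P(D > k \mid N = n)
```

Under this reading, the residual is small because P(D > k | N = n) is close to zero at the lengths where P(N = n) is large, and becomes noticeable only at lengths where P(N = n) is itself very small.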
