Bounded-Depth High-Coverage Search Space for Noncrossing Parses

Abstract

A recently proposed encoding for non-crossing digraphs can be used to implement generic inference over families of these digraphs and to carry out first-order factored dependency parsing. It is now shown that the recent proposal can be substantially streamlined without information loss. The improved encoding is less dependent on hierarchical processing and it gives rise to a high-coverage bounded-depth approximation of the space of non-crossing digraphs. This subset is presented elegantly by a finite-state machine that recognizes an infinite set of encoded graphs. The set includes more than 99.99% of the 0.6 million noncrossing graphs obtained from the UDv2 treebanks through planarisation. Rather than taking the low probability of the residual as a flat rate, it can be modelled with a joint probability distribution that is factorised into two underlying stochastic processes - the sentence length distribution and the related conditional distribution for deep nesting. This model points out that deep nesting in the streamlined code requires extreme sentence lengths. High depth is categorically out in common sentence lengths but emerges slowly at infrequent lengths that prompt further inquiry.
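
The idea that a depth bound turns a nested bracket language into something a finite-state machine can recognise, while the accepted set stays infinite, can be illustrated with a minimal sketch. It assumes a drastically simplified code with a single bracket pair `[` and `]`; the paper's actual alphabet and encoding are not given in the abstract and are not reproduced here, and the names `make_bounded_depth_dfa` and `accepts` are hypothetical.

```python
def make_bounded_depth_dfa(max_depth):
    """Build DFA transitions for balanced bracket strings of depth <= max_depth.

    States are the integers 0..max_depth (the current nesting depth).
    Assumption: one bracket pair stands in for the real encoding alphabet.
    """
    transitions = {}
    for depth in range(max_depth + 1):
        if depth < max_depth:
            # Opening a bracket is allowed only below the depth bound.
            transitions[(depth, "[")] = depth + 1
        if depth > 0:
            # Closing a bracket is allowed only if one is open.
            transitions[(depth, "]")] = depth - 1
    return transitions


def accepts(transitions, encoded):
    """Run the DFA; accept if every step is defined and we end at depth 0."""
    state = 0
    for symbol in encoded:
        state = transitions.get((state, symbol))
        if state is None:  # undefined transition: too deep, unbalanced, or unknown symbol
            return False
    return state == 0


dfa = make_bounded_depth_dfa(2)
print(accepts(dfa, "[][[]]"))      # True: depth never exceeds 2
print(accepts(dfa, "[[[]]]"))      # False: depth 3 exceeds the bound
print(accepts(dfa, "[]" * 1000))   # True: arbitrarily long strings are accepted
```

With a bound of `max_depth`, the machine needs only `max_depth + 1` states, yet it accepts strings of any length, which is the sense in which a bounded-depth subset can be both finite-state and infinite.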

Bibliographic details

Source
International Workshop on Finite State Methods and Natural Language Processing | 2017 | pp. 30-40 | 11 pages
Author
Anssi Yli-Jyrae
Original format: PDF