Self-Supervised Synonym Extraction from the Web

Hu Fanghuai; Shao Zhiqing; Ruan Tong

Abstract

Current synonym extraction methods work in a "closed" way. Given a problem word and a set of target words, researchers have to choose the words synonymous with the problem word using features such as lexical patterns and distributional similarities. This paper tries to discover synonyms in an "open" way and presents a synonym extraction framework based on self-supervised learning. We first analyze the nature of the open method and argue that a trained pattern-independent model for synonym extraction is feasible. We then model the extraction of synonyms from sentences as a sequential labeling problem and automatically generate labeled training samples by using structured knowledge from online encyclopedias and some generic heuristic rules. Finally, we train several Conditional Random Field (CRF) models and use them to extract synonyms from the web. We successfully extract more than 20 million facts, which contain 826,219 distinct pairs of synonyms.
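
The automatic generation of labeled training samples described in the abstract might look roughly like the following minimal Python sketch. It assumes a synonym pair is already known from structured encyclopedia data, then tags a tokenized sentence that mentions both phrases. The label names (`ENT`, `SYN`, `O`), the function names `find_span` and `label_sentence`, and the example sentence are hypothetical illustrations, not the label scheme, heuristics, or data used in the paper.

```python
from typing import List, Optional, Tuple


def find_span(tokens: List[str], phrase: List[str]) -> Optional[int]:
    """Return the start index of the first exact match of phrase in tokens."""
    n = len(phrase)
    if n == 0:
        return None
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            return i
    return None


def label_sentence(
    tokens: List[str], entity: List[str], synonym: List[str]
) -> Optional[List[Tuple[str, str]]]:
    """Tag a sentence with BIO-style labels for a known synonym pair.

    Returns None when the sentence does not contain both phrases, or when
    the two matches overlap, so that only usable samples are kept.
    """
    e_start = find_span(tokens, entity)
    s_start = find_span(tokens, synonym)
    if e_start is None or s_start is None:
        return None

    e_idx = range(e_start, e_start + len(entity))
    s_idx = range(s_start, s_start + len(synonym))
    if set(e_idx) & set(s_idx):
        return None

    labels = ["O"] * len(tokens)  # hypothetical label set: ENT, SYN, O
    for k, i in enumerate(e_idx):
        labels[i] = "B-ENT" if k == 0 else "I-ENT"
    for k, i in enumerate(s_idx):
        labels[i] = "B-SYN" if k == 0 else "I-SYN"
    return list(zip(tokens, labels))


if __name__ == "__main__":
    sentence = "New York City , also known as the Big Apple , is in the United States"
    sample = label_sentence(sentence.split(), ["New", "York", "City"], ["Big", "Apple"])
    for token, label in sample:
        print(f"{token}\t{label}")
```

Sequences labeled this way could then serve as training input for a sequence model such as a CRF. The feature design and the heuristic rules for filtering noisy samples are left out because the abstract does not specify them.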

Bibliographic details

Source: Journal of Information Recording, 2015, Issue 3, pp. 1133-1148 (16 pages)
Authors: Hu Fanghuai; Shao Zhiqing; Ruan Tong
Author affiliation (all three authors): E China Univ Sci & Technol, Dept Comp Sci & Engn, Shanghai 200237, Peoples R China
Original format: PDF
Language: English
Keywords: synonym extraction; self-supervised learning; sequential labeling; pattern; encyclopedia
