Clustering search engine suggests by integrating a topic model and word embeddings

Abstract

This paper addresses the problem of how to obtain an overview of the knowledge associated with a given query keyword. In particular, we focus on the concerns of people who search for Web pages with that keyword. We collect the Web search information needs for a query keyword through search engine suggests. For a given query keyword we collect up to around 1,000 suggests, many of which are redundant. We cluster these redundant search engine suggests based on a topic model. One limitation of topic-model-based clustering, however, is that the granularity of the topics, i.e., the clusters of search engine suggests, is too coarse. To overcome this, the paper additionally applies a word embedding technique, trained on the Web pages used to train the topic model together with the text of the entire Japanese version of Wikipedia. We then measure the word-embedding-based similarity between search engine suggests and further classify the suggests within a single topic into finer-grained subtopics based on that similarity. Evaluation results show that the proposed approach performs well on the task of subtopic clustering of search engine suggests.
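The second stage described in the abstract, splitting the suggests of one topic into subtopics by embedding similarity, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the topic assignments are given as a plain dictionary standing in for the topic model output, the embeddings are hand-made two-dimensional toy vectors, a suggest is represented by the mean of its word vectors, and suggests are grouped by single-link merging above a hypothetical cosine threshold of 0.8. The abstract does not state which grouping rule or threshold the authors use.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding; None if none do."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return None
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(len(vecs[0]))]


def split_into_subtopics(suggests, embeddings, threshold=0.8):
    """Single-link grouping: two suggests share a subtopic if a chain of
    pairwise cosine similarities >= threshold connects them (assumed rule)."""
    vectors = {s: mean_vector(s.split(), embeddings) for s in suggests}
    parent = {s: s for s in suggests}  # union-find forest

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    for i, a in enumerate(suggests):
        for b in suggests[i + 1:]:
            va, vb = vectors[a], vectors[b]
            if va is not None and vb is not None and cosine(va, vb) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for s in suggests:
        groups[find(s)].append(s)
    return list(groups.values())


def refine_topics(topic_of, embeddings, threshold=0.8):
    """Group suggests by their (precomputed) topic, then split each topic."""
    by_topic = defaultdict(list)
    for suggest, topic in topic_of.items():
        by_topic[topic].append(suggest)
    return {t: split_into_subtopics(members, embeddings, threshold)
            for t, members in by_topic.items()}


# Hypothetical toy data: topic ids stand in for topic model output,
# and the 2-D vectors stand in for trained word embeddings.
toy_topic_of = {"price": 0, "cost": 0, "recipe": 0, "cooking": 0, "hotel": 1}
toy_embeddings = {
    "price": [1.0, 0.1], "cost": [0.9, 0.2],
    "recipe": [0.1, 1.0], "cooking": [0.2, 0.9],
    "hotel": [0.5, 0.5],
}

refined = refine_topics(toy_topic_of, toy_embeddings, threshold=0.8)
for topic, subtopics in refined.items():
    print(topic, subtopics)
```

With these toy vectors, topic 0 is split into one subtopic holding "price" and "cost" and another holding "recipe" and "cooking", while topic 1 keeps its single suggest. In practice the threshold controls how fine-grained the subtopics become and would need tuning against labeled data.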

Bibliographic details

Source
IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, 2017, 1 v., 6 pages
Authors
Tian Nie; Yi Ding; Chen Zhao; Youchao Lin; Takehito Utsuro; Yasuhide Kawada
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords
Search engines; Web pages; Encyclopedias; Electronic publishing; Internet; Training

Similar literature

1. A Method of Subtopic Classification of Search Engine Suggests by Integrating a Topic Model and Word Embeddings [J]. Tian Nie, Yi Ding, Chen Zhao. International Journal of Software Innovation, 2018, No. 3
2. Relational Biterm Topic Model: Short-Text Topic Modeling using Word Embeddings [J]. Li Ximing, Zhang Ang, Li Changchun. The Computer Journal, 2019, No. 3
3. Adaptive cross-contextual word embedding for word polysemy with unsupervised topic modeling [J]. Li Shuangyin, Pan Rong, Luo Haoyu. Knowledge-Based Systems, 2021, No. Apra22
4. Clustering search engine suggests by integrating a topic model and word embeddings [C]. Tian Nie, Yi Ding, Chen Zhao. IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, 2017
5. Things and Strings and More: Improving Place Name Disambiguation from Short Texts by Combining Entity Co-Occurrence, Topic Modeling, and Word Embedding [D]. Ju, Yiting. 2017
6. Nonparametric Spherical Topic Modeling with Word Embeddings [O]. Kayhan Batmanghelich, Ardavan Saeedi, Karthik Narasimhan
7. A word embedding topic model for topic detection and summary in social networks [O]. Lei Shi, Gang Cheng, Shang-ru Xie. 2019
