Event-Dataset: Temporal information retrieval and text classification dataset

Abstract

Temporal Information Retrieval (TIR) has recently attracted considerable attention in the information retrieval community. TIR exploits temporal dynamics in the retrieval process and combines textual relevance with temporal relevance to satisfy a user's temporal information needs (Ur Rehman Khan et al., 2018). The focus time of a document is an important temporal aspect, defined as the time to which the content of the document refers (Jatowt et al., 2015; Jatowt et al., 2013; Morbidoni et al., 2018; Khan et al., 2018). To the best of our knowledge, no publicly available standard benchmark dataset exists that can comprehensively evaluate the performance of focus time assessment strategies. We have therefore produced the Event-dataset, which comprises 35 queries and a set of news articles for each query, such that

$C = \{Q_s, D_s\}$,

where $C$ represents the dataset and $Q_s$ is the query set,

$Q_s = \{q_1, q_2, q_3, \dots, q_{35}\}$,

and for each $q_i$ there is a set of news articles

$q_i = \{d_r, d_{nr}\}$,

where $d_r$ and $d_{nr}$

are the sets of relevant and non-relevant documents, respectively. Each query in the dataset represents a popular event. To label the articles as relevant or non-relevant, we used a user-study-based evaluation in which a group of postgraduate students manually annotated each article. We believe this dataset gives information retrieval researchers a benchmark for evaluating focus time assessment methods in particular and information retrieval methods in general.
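As a rough illustration of the structure described above, the following Python sketch models one query with its relevant and non-relevant article sets and scores a retrieved list against the annotations. The class name `EventQuery`, the function `precision`, the article identifiers, and the toy data are all assumptions made for this example; the abstract does not specify a file format, identifiers, or an evaluation metric.

```python
from dataclasses import dataclass, field


@dataclass
class EventQuery:
    """One query q_i with its annotated news articles (hypothetical layout)."""
    text: str
    relevant: set[str] = field(default_factory=set)      # d_r: ids judged relevant
    non_relevant: set[str] = field(default_factory=set)  # d_nr: ids judged non-relevant

    def articles(self) -> set[str]:
        """All annotated article ids for this query."""
        return self.relevant | self.non_relevant


def precision(query: EventQuery, retrieved: list[str]) -> float:
    """Fraction of retrieved article ids that annotators marked relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc_id in retrieved if doc_id in query.relevant)
    return hits / len(retrieved)


# Toy data invented for illustration only; not taken from the Event-dataset.
q = EventQuery(
    text="example event",
    relevant={"a1", "a2"},
    non_relevant={"a3"},
)
dataset = [q]  # the real query set Q_s would hold 35 such entries

print(precision(q, ["a1", "a3"]))  # 0.5
```

Keeping the two annotated sets on the query object mirrors the $q_i = \{d_r, d_{nr}\}$ notation, so any retrieval or focus time method can be scored per query and then averaged over the query set.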

Bibliographic details

Journal: Data in Brief
Authors: Shafiq Ur Rehman Khan; Muhammad Arshad Islam
Year (volume): 2019 (25)
Article number: 104048
Total pages: 8
Original format: PDF
Keywords: Information retrieval; Temporal; Text classification; Focus time assessment
