Venue: Workshop on Challenges in the Management of Large Corpora; Language Resources and Evaluation Conference

The Corpus Query Middleware of Tomorrow - A Proposal for a Hybrid Corpus Query Architecture


Abstract

The development of dozens of specialized corpus query systems and languages over the past decades has led to a diverse but also fragmented landscape. Today we are faced with a plethora of query tools that each provide unique features, but which are not interoperable and often rely on very specific database back-ends or storage formats. This severely hampers usability, both for end users who want to query different corpora and for corpus designers who wish to provide users with an interface for querying and exploration. We propose a hybrid corpus query architecture as a first step toward overcoming this issue. It takes the form of a middleware system that sits between user front-ends and optional database or text indexing solutions serving as back-ends. At its core is a custom query evaluation engine for index-less processing of corpus queries. With a flexible JSON-LD query protocol, the approach allows communication with back-end systems to partially solve queries and so offset some of the performance penalties imposed by the custom evaluation engine. This paper outlines the details of our first draft of this architecture.
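The hybrid idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name in it (`ToyBackend`, `evaluate_indexless`, `evaluate_hybrid`, the toy corpus and the example query) is a hypothetical stand-in, and the JSON-LD query protocol is not modeled at all. The sketch only shows the general pattern of an index-less evaluator that can optionally let a back-end pre-filter candidates so that fewer units need to be scanned.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical illustration only: none of these names come from the paper.
Sentence = List[str]
Predicate = Callable[[Sentence], bool]


def evaluate_indexless(corpus: List[Sentence], predicate: Predicate) -> List[int]:
    """Scan every sentence and test the full query predicate (no index used)."""
    return [i for i, sentence in enumerate(corpus) if predicate(sentence)]


class ToyBackend:
    """Stand-in for an optional indexing back-end: maps token -> sentence ids."""

    def __init__(self, corpus: List[Sentence]) -> None:
        self.index: Dict[str, Set[int]] = {}
        for i, sentence in enumerate(corpus):
            for token in sentence:
                self.index.setdefault(token, set()).add(i)

    def candidates(self, token: str) -> Set[int]:
        """Partially solve a query: return ids of sentences containing the token."""
        return self.index.get(token, set())


def evaluate_hybrid(
    corpus: List[Sentence],
    predicate: Predicate,
    required_token: str,
    backend: Optional[ToyBackend] = None,
) -> List[int]:
    """Use the back-end to narrow candidates if present, else scan everything."""
    if backend is None:
        return evaluate_indexless(corpus, predicate)
    candidate_ids = sorted(backend.candidates(required_token))
    # The full predicate is still checked by the custom engine on each candidate.
    return [i for i in candidate_ids if predicate(corpus[i])]


corpus = [
    ["the", "cat", "sleeps"],
    ["a", "dog", "barks"],
    ["the", "dog", "sleeps"],
]


def dog_sleeps(sentence: Sentence) -> bool:
    """Example query: the token 'dog' immediately followed by 'sleeps'."""
    return any(a == "dog" and b == "sleeps" for a, b in zip(sentence, sentence[1:]))


without_backend = evaluate_hybrid(corpus, dog_sleeps, "dog")
with_backend = evaluate_hybrid(corpus, dog_sleeps, "dog", ToyBackend(corpus))
```

Both calls return the same matches. The back-end only reduces how many sentences the evaluator has to inspect, which is the kind of performance offset the abstract describes.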
