Computer Speech and Language

Voice biometrics security: Extrapolating false alarm rate via hierarchical Bayesian modeling of speaker verification scores

Abstract

How secure is automatic speaker verification (ASV) technology? More concretely, given a specific target speaker, how likely is it that another person will be falsely accepted as that target? This question may be addressed empirically by studying naturally confusable pairs of speakers within a large enough corpus. To this end, one might expect to find at least some speaker pairs that are indistinguishable from each other in terms of ASV. To a certain extent, this aim is mirrored in standardized ASV evaluation benchmarks, for instance the speaker recognition evaluation (SRE) series organized by the National Institute of Standards and Technology (NIST). Nonetheless, the number of speakers in such evaluation benchmarks arguably represents only a small fraction of all possible human voices, making it challenging to extrapolate performance beyond a given corpus. Furthermore, the impostors used in performance evaluation are usually selected at random. A potentially more meaningful definition of an impostor, at least in the context of security-driven ASV applications, is the closest (most confusable) other speaker to a given target. We put forward a novel performance assessment framework that addresses both the inadequacy of the random-impostor evaluation model and the size limitation of evaluation corpora by assessing ASV security against the closest impostors on arbitrarily large datasets. The framework allows one to predict the safety of a given ASV technology, in its current state, for an arbitrarily large speaker database consisting of virtual (sampled) speakers. As a proof of concept, we analyze the performance of two state-of-the-art ASV systems, based on i-vector and x-vector speaker embeddings (as implemented in the popular Kaldi toolkit), on the recent VoxCeleb 1 and 2 corpora, containing a total of 7365 speakers. We fix the number of target speakers to 1000 and generate up to N = 100,000 virtual impostors sampled from the generative model. The model-based false alarm rates are in reasonable agreement with the empirical false alarm rates and, as predicted, increase substantially (to values of up to 98%) with N = 100,000 impostors. Neither the i-vector nor the x-vector system is immune to the increased false alarm rate at increased impostor database size, as predicted by the model.
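The extrapolation rests on a simple observation: if the non-target (impostor) scores for a given target follow a distribution with CDF F, then the probability that the best of N independent virtual impostors exceeds the decision threshold t is 1 - F(t)^N, which climbs toward 1 as N grows. The sketch below illustrates this mechanism under a deliberately simplified hierarchical Gaussian model of per-target impostor scores; the distributional choices, parameter values, and function names are illustrative assumptions, not the model fitted in the paper, and the printed numbers are not expected to reproduce the paper's results.

    # Minimal sketch (not the authors' implementation): a toy hierarchical model of
    # per-target impostor scores, used to show how the false alarm rate against the
    # *closest* of N virtual impostors grows with N.
    import math
    import numpy as np

    rng = np.random.default_rng(0)

    def norm_cdf(x):
        """Standard normal CDF."""
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    # Assumed hierarchy: per-target impostor-score mean mu_t ~ Normal(MU_0, TAU_0),
    # per-target std sigma_t ~ Gamma(A, B); scores | target ~ Normal(mu_t, sigma_t).
    MU_0, TAU_0 = -5.0, 1.0   # illustrative values
    A, B = 2.0, 0.5           # illustrative values

    def far_closest_impostor(n_targets=1000, n_impostors=100_000, threshold=0.0):
        """Average over targets of P(max of N impostor scores > threshold).

        For i.i.d. scores with CDF F, P(max > t) = 1 - F(t)**N, so the false
        alarm rate against the closest impostor rises toward 1 as N grows.
        """
        mu_t = rng.normal(MU_0, TAU_0, size=n_targets)
        sigma_t = rng.gamma(A, B, size=n_targets)
        p_fa = [1.0 - norm_cdf((threshold - m) / s) ** n_impostors
                for m, s in zip(mu_t, sigma_t)]
        return float(np.mean(p_fa))

    if __name__ == "__main__":
        for n in (1_000, 10_000, 100_000):
            print(f"N = {n:>7}: closest-impostor FAR ≈ {far_closest_impostor(n_impostors=n):.3f}")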