Forensic speaker recognition: A new method based on extracting accent and language information from short utterances

Saleem Sajid; Subhan Fazli; Naseer Noman; Bais Abdul; Imtiaz Ammara

Abstract

This paper presents a new method for Forensic Speaker Recognition (FSR) based on extracting accent and language information from short utterances. Accent Classification (AC) and Language Identification (LI) play an important role in identifying people of different groups, communities and origins, because speaking styles and native languages differ among them. In a multilingual society, forensic experts use AC and LI to narrow the search space for suspect recognition to regional and ethnic groups. In this paper, we use baseline and deep learning methods to automate this process. The baseline methods are Gaussian Mixture Model-Universal Background Model (GMM-UBM), i-vector and Gaussian Mixture Model-Support Vector Machine (GMM-SVM), all of which use Mel-Frequency Cepstral Coefficients (MFCC) as speech features. The deep learning methods are based on Convolutional Neural Networks (CNN) and Deep Neural Networks (DNN). For CNN, the recently proposed VGGVox and GMM-CNN methods are used, both of which operate on speech spectrograms. For DNN, the x-vectors method is used, which is based on DNN embeddings. The experimental results show that GMM-SVM achieves better FSR performance than the GMM-UBM and i-vector methods. The x-vectors method in turn outperforms GMM-CNN and VGGVox, and also outperforms GMM-SVM. The x-vectors method achieves 80.4% FSR accuracy on its own, 85.4% with AC, 90.2% with LI, and 95.1% when AC and LI are combined. This shows that the proposed method based on AC and LI gives promising results. © 2020 Elsevier Ltd. All rights reserved.
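
The idea of using AC and LI to narrow the search space can be illustrated with a small sketch. The abstract does not say how the authors combine the AC and LI outputs with the speaker model, so everything here is an assumption made for illustration: the `identify` function, the toy three-dimensional embeddings, the accent and language labels, and the choice of cosine scoring are all hypothetical. The sketch only shows that filtering enrolled speakers by predicted accent and language before scoring can change which candidate is returned.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(query, enrolled, accent=None, language=None):
    """Return (speaker_id, score) for the best cosine match.

    Hypothetical helper: when an accent and/or language label is given
    (for example, predicted by separate AC and LI classifiers), only
    enrolled speakers with matching labels are scored.
    """
    candidates = [
        s for s in enrolled
        if (accent is None or s["accent"] == accent)
        and (language is None or s["language"] == language)
    ]
    if not candidates:
        # Assumption: fall back to the full set if the filter removes everyone.
        candidates = enrolled
    best = max(candidates, key=lambda s: cosine(query, s["embedding"]))
    return best["id"], cosine(query, best["embedding"])

# Toy enrolment data (made up; real embeddings would come from a trained model).
enrolled = [
    {"id": "spk1", "accent": "A", "language": "X", "embedding": [1.0, 0.0, 0.0]},
    {"id": "spk2", "accent": "B", "language": "X", "embedding": [0.9, 0.1, 0.0]},
    {"id": "spk3", "accent": "B", "language": "Y", "embedding": [0.0, 1.0, 0.0]},
]
query = [1.0, 0.05, 0.0]

unfiltered_id, _ = identify(query, enrolled)
filtered_id, _ = identify(query, enrolled, accent="B", language="X")
print(unfiltered_id, filtered_id)
```

In this toy data the unfiltered search picks the closest embedding overall, while the filtered search only considers speakers whose accent and language match the predicted labels. Whether such filtering helps in practice depends on how accurate the AC and LI predictions are.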

Bibliographic details

Source
Digital Investigation | 2020, Issue 9 | 300982.1-300982.8 | 8 pages
Authors
Saleem Sajid; Subhan Fazli; Naseer Noman; Bais Abdul; Imtiaz Ammara;
Author affiliations

Natl Univ Modern Languages Fac Engn & Comp Sci H-9 Islamabad Pakistan;

Natl Univ Modern Languages Fac Engn & Comp Sci H-9 Islamabad Pakistan;

Air Univ Dept Mechatron Engn Islamabad Pakistan;

Univ Regina Fac Engn & Appl Sci Regina SK Canada;

Natl Univ Modern Languages Fac Engn & Comp Sci H-9 Islamabad Pakistan;

Original format: PDF
Language: English
Keywords
Accent; Language; Forensic speaker recognition; Deep learning; Speech features;
