Optimized Active Learning Strategy for Audiovisual Speaker Recognition

Abstract

This work investigates how much recognition accuracy improves when optimization stages are used to tune the parameters of an Active Learning (AL) classifier. Because plenty of data can be available in Speaker Recognition (SR) tasks, the AL concept, which brings a human annotator into its learning kernel to uncover hidden insights in unlabeled data, seems extremely suitable and does not demand much expertise from that annotator. Six datasets containing utterances from 8 and 16 speakers under different recording setups are described by audiovisual features and evaluated with the time-efficient Uncertainty Sampling (UncS) query strategy. Both Support Vector Machines (SVMs) and Random Forest (RF) algorithms were tuned over a small subset of the initial training data and then applied iteratively to mine the most suitable instances from a corresponding pool of unlabeled instances. Useful conclusions are drawn about the values of the selected parameters, allowing future optimization attempts to be confined to more restricted regions, while remarkable improvement rates were obtained using an ideal annotator.
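
The workflow summarized in the abstract (tune a learner on a small labeled subset, then repeatedly query the most uncertain pool instances) can be sketched roughly as follows. This is only an illustrative sketch under several assumptions, not the authors' implementation: the data are synthetic stand-ins for audiovisual features, scikit-learn's GridSearchCV replaces the Hyperopt tool named in the keywords, the least-confident criterion is one common variant of Uncertainty Sampling, and the function names `least_confident` and `run_active_learning` are hypothetical. The ideal annotator is simulated by revealing the true pool labels.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split


def least_confident(probabilities, batch_size):
    """Return indices of the pool rows whose highest class probability is lowest."""
    uncertainty = 1.0 - probabilities.max(axis=1)
    return np.argsort(uncertainty)[-batch_size:]


def run_active_learning(n_iterations=5, batch_size=10, seed=0):
    # Assumption: synthetic 8-class data stands in for 8 speakers' feature vectors.
    X, y = make_classification(n_samples=800, n_features=20, n_informative=10,
                               n_classes=8, n_clusters_per_class=1,
                               random_state=seed)
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    # Small initial labeled set; everything else forms the unlabeled pool.
    X_lab, X_pool, y_lab, y_pool = train_test_split(
        X_rest, y_rest, train_size=80, stratify=y_rest, random_state=seed)

    # Optimization stage on the small labeled subset only.
    # Assumption: a plain grid search stands in for Hyperopt here.
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        {"n_estimators": [50, 100], "max_depth": [None, 10]},
        cv=3)
    search.fit(X_lab, y_lab)
    best_params = search.best_params_

    accuracies = []
    for _ in range(n_iterations):
        # Retrain the tuned learner on the current labeled set.
        model = RandomForestClassifier(random_state=seed, **best_params)
        model.fit(X_lab, y_lab)
        accuracies.append(model.score(X_test, y_test))

        # Query the pool instances the model is least sure about.
        query = least_confident(model.predict_proba(X_pool), batch_size)

        # Simulated ideal annotator: the true labels are revealed without error.
        X_lab = np.vstack([X_lab, X_pool[query]])
        y_lab = np.concatenate([y_lab, y_pool[query]])
        X_pool = np.delete(X_pool, query, axis=0)
        y_pool = np.delete(y_pool, query)

    return accuracies, len(y_lab)


if __name__ == "__main__":
    accs, n_labeled = run_active_learning()
    print("test accuracy per iteration:", [round(a, 3) for a in accs])
    print("labeled instances at the end:", n_labeled)
```

Tuning only once, before the query loop, keeps the per-iteration cost low; whether the paper re-tunes during the loop is not stated in the abstract, so that choice is also an assumption of this sketch.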

Bibliographic record

Source
International Conference on Speech and Computer | 2018 | pp. 281-290 | 10 pages
Authors
Stamatis Karlos; Konstantinos Kaleris; Nikos Fazakis; Vasileios G. Kanas; Sotiris Kotsiantis
Original format: PDF
Keywords
Active Learning; Optimized learner; Speaker Recognition; Audiovisual features; Support Vector Machines; Hyperopt package tool

Similar literature

1. Hybridized estimations of support vector machine free parameters C and gamma using a fuzzy learning strategy for microphone array-based speaker recognition in a Kinect sensor-deployed environment [J]. Ding Ing-Jr, Shi Jia-Yi. Multimedia Tools and Applications, 2017, No. 23.
2. A Low-Complexity Parabolic Lip Contour Model With Speaker Normalization for High-Level Feature Extraction in Noise-Robust Audiovisual Speech Recognition [J]. Borgstrom B.J., Alwan A. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 2008, No. 6.
3. Phonetically optimized speaker modeling for robust speaker recognition [J]. Bong-Jin Lee, Jeung-Yoon Choi, Hong-Goo Kang. The Journal of the Acoustical Society of America, 2009, No. 3.
4. Optimized Active Learning Strategy for Audiovisual Speaker Recognition [C]. Stamatis Karlos, Konstantinos Kaleris, Nikos Fazakis, et al. International Conference on Speech and Computer, 2018.
5. Language learning strategy use and proficiency: The relationship between patterns of reported language learning strategy (LLS) use by speakers of other languages (SOL) and proficiency with implications for the teaching/learning situation [D]. Griffiths, Carol. 2003.
6. Audiovisual perceptual learning with multiple speakers [O]. Aaron D. Mitchel, Chip Gerfen, Daniel J. Weiss.
7. Integration strategies for audiovisual speech processing: Applied to text-dependent speaker recognition [O]. Simon Lucey, Tsuhan Chen, Sridha Sridharan. 2005.
