Human Audition as Statistical Recognition System for Audio Signals

V. E. Antsiperov; Yu. V. Gulyaev; S. A. Nikitov

Abstract

This paper discusses the results of a study on modeling human perception and recognition of audio signals. Based on an analysis of reliable psychophysical data, a model (conception) is synthesized within the framework of the Neyman-Pearson approach, which is well known in statistical decision theory. The model offers a unified view of the numerous facts and dependences established to date for the auditory system. A sequential recognition procedure is developed from solid data on the structure and functions of the auditory system. Optimizing this procedure reveals that the perceptually meaningful feature set representing an audio signal consists of sampled values of component envelopes taken at certain time instants. These instants depend on the pattern against which the similarity hypothesis is tested. In general, the time instants are not equidistant, in contrast to existing speech recognition techniques. We show how this peculiarity is related to the well-known psychophysical Weber-Fechner law. The theoretical study of the recognition procedure is supplemented by a discussion of possible implementation issues. It is shown that implementing the procedure, especially recursively, yields fast and efficient numerical algorithms, which can be realized naturally on structures similar to neural networks. The relation between the recursive implementation and well-established recognition techniques, such as LPC (PLP) and MFCC, is also considered.
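As a rough illustration of the Neyman-Pearson approach mentioned in the abstract, the sketch below fixes a false-alarm probability and then thresholds a likelihood-ratio statistic. It is not the authors' model: the Gaussian mean-shift setting, the known noise level `sigma`, and the function names `np_threshold`, `np_decide`, and `detection_probability` are all assumptions chosen for brevity. In this setting the likelihood ratio is monotone in the sample mean, so thresholding the sample mean is equivalent to thresholding the likelihood ratio.

```python
import math
from statistics import NormalDist


def np_threshold(n: int, sigma: float, alpha: float) -> float:
    """Threshold on the sample mean for H0: mean = 0 versus H1: mean > 0.

    Assumes n independent Gaussian samples with known standard deviation
    sigma (an illustrative assumption, not taken from the paper).
    The threshold is chosen so that P(decide H1 | H0) equals alpha.
    """
    return NormalDist().inv_cdf(1.0 - alpha) * sigma / math.sqrt(n)


def np_decide(samples: list[float], sigma: float, alpha: float) -> bool:
    """Return True if H1 is accepted at false-alarm level alpha."""
    n = len(samples)
    sample_mean = sum(samples) / n
    return sample_mean > np_threshold(n, sigma, alpha)


def detection_probability(n: int, sigma: float, alpha: float, mu1: float) -> float:
    """Probability of deciding H1 when the true mean is mu1."""
    threshold = np_threshold(n, sigma, alpha)
    # Under a true mean of mu1 the sample mean is N(mu1, sigma / sqrt(n)).
    return 1.0 - NormalDist(mu1, sigma / math.sqrt(n)).cdf(threshold)


if __name__ == "__main__":
    print(np_threshold(n=100, sigma=1.0, alpha=0.05))
    print(detection_probability(n=100, sigma=1.0, alpha=0.05, mu1=0.5))
```

The design point is that the false-alarm rate is fixed first and the detection probability follows from it, which is what distinguishes the Neyman-Pearson criterion from a Bayesian one that needs prior probabilities for the hypotheses.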


Bibliographic record

Source: Pattern recognition and image analysis: advances in mathematical theory and applications in the USSR, 2000, No. 3, 9 pages
Authors: V. E. Antsiperov; Yu. V. Gulyaev; S. A. Nikitov
Format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
