Speech recognition using distance between feature vector of one sequence and line segment connecting feature-variation-end-point vectors in another sequence

Abstract

A speech recognition apparatus has an analysis section that outputs features of input speech as a time sequence of feature vectors defined for discrete time points corresponding to a processed speech frame. Reference paradigm utterances are converted into a time sequence of standard (reference) feature vectors. The possible continuous variation of standard feature vectors at each point in time is expressed by a line segment, or set of line segments, connecting the feature vectors for the two end points of the "movable" range within which the feature can change, rather than using a larger set of reference vectors as in a conventional multitemplate approach to speech recognition. For example, the continuous range of possible background noise levels in input speech defines a line segment connecting the two feature vectors at the two SNR value limits. A matching apparatus calculates the distance between the input speech feature vector at each time point and the reference line segment endpoints and the perpendicular distance to the reference line segment (where meaningful), for each reference line segment corresponding to that particular time. The distance between each input feature and each standard (reference) feature sequence, represented by its line segment at a given time, is defined as the smallest of these three (or two) computed distance values.
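The per-frame distance described in the abstract can be illustrated with a minimal sketch. It assumes a Euclidean metric, which the abstract does not specify, and it reads "where meaningful" as "the foot of the perpendicular falls within the segment". The names `euclidean` and `segment_distance` are hypothetical and do not come from the patent.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def segment_distance(x: Vector, s1: Vector, s2: Vector) -> float:
    """Distance from input feature vector x to the reference line segment s1-s2.

    s1 and s2 stand for the reference feature vectors at the two ends of the
    variation range (for example, the two SNR limits). The result is the
    smallest of the two endpoint distances and, when the foot of the
    perpendicular lies on the segment, the perpendicular distance.
    """
    candidates = [euclidean(x, s1), euclidean(x, s2)]

    # Direction of the segment and its squared length.
    seg = [b - a for a, b in zip(s1, s2)]
    seg_len_sq = sum(c * c for c in seg)

    if seg_len_sq > 0.0:
        # Parameter t locates the foot of the perpendicular: s1 + t * seg.
        t = sum((xi - a) * c for xi, a, c in zip(x, s1, seg)) / seg_len_sq
        if 0.0 <= t <= 1.0:
            foot = [a + t * c for a, c in zip(s1, seg)]
            candidates.append(euclidean(x, foot))

    return min(candidates)


if __name__ == "__main__":
    # Perpendicular foot inside the segment: three candidate distances.
    print(segment_distance([1.0, 1.0], [0.0, 0.0], [2.0, 0.0]))
    # Perpendicular foot outside the segment: only the two endpoint distances.
    print(segment_distance([-3.0, 4.0], [0.0, 0.0], [2.0, 0.0]))
```

In a full recognizer this frame-level value would be accumulated along a time alignment of the input and reference sequences. The abstract does not describe that step, so it is left out here.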

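Bibliographic details

Publication number: US5953699A
Publication date: 1999-09-14
Original format: PDF
Applicant/patentee: NEC CORPORATION
Application number: US19970959465
Inventor: KEIZABURO TAKAGI
Filing date: 1997-10-28
Classification: G01L3/02
Country: US
Indexed: 2022-08-22 02:07:16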
