Application of Multilayer Recurrent Neural Networks in Action Recognition

Abstract

Human action recognition is an active research topic in computer vision. Building on the conventional two-stream approach, this paper introduces an object detection network and proposes a human action recognition algorithm based on a multilayer recurrent neural network. The algorithm processes consecutive video frames with a three-dimensional dilated convolution pyramid and combines it with a long short-term memory (LSTM) network, yielding a pyramid convolutional LSTM network that can analyze human actions in real time. Experiments use the NTU RGB+D human action recognition dataset to recognize five actions: brushing hair, sitting down, standing up, hand waving, and falling down. The results show that dilated convolution markedly reduces the number of parameters and that the algorithm achieves good accuracy and real-time performance on surveillance video.
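The link between dilated convolution and a smaller parameter count can be illustrated with a short calculation. The sketch below is not the paper's network: the channel counts, kernel size, and dilation rate are illustrative assumptions, and the helper names are hypothetical. It compares a dilated 3x3x3 kernel with a dense kernel that spans the same extent along each axis.

```python
def effective_extent(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def conv3d_params(c_in: int, c_out: int, kernel_size: int) -> int:
    """Weights plus biases of a 3D convolution with a cubic kernel.

    Dilation does not appear here: it spreads the taps apart
    without adding any weights.
    """
    return c_in * c_out * kernel_size ** 3 + c_out


# Assumed values for illustration only (not taken from the paper).
c_in = c_out = 64
kernel_size, dilation = 3, 2

extent = effective_extent(kernel_size, dilation)
dilated = conv3d_params(c_in, c_out, kernel_size)
dense = conv3d_params(c_in, c_out, extent)

print(f"extent per axis: {extent}")
print(f"dilated 3x3x3 parameters: {dilated}")
print(f"dense {extent}x{extent}x{extent} parameters: {dense}")
```

Under these assumptions the dilated layer covers the same 5-voxel extent per axis as the dense layer while holding roughly a fifth of the weights, which is the kind of saving the abstract attributes to dilated convolution.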
