Conference proceedings: International Conference on Intelligent Science and Big Data Engineering

Real-Time Visual Object Tracking Based on Reinforcement Learning with Twin Delayed Deep Deterministic Algorithm



Abstract

Object tracking, as a low-level vision task, has always been a hot topic in computer vision. It is well known that challenges such as background clutter, fast object motion, and occlusion significantly degrade the robustness and accuracy of existing object tracking methods. This paper proposes a reinforcement learning model based on the Twin Delayed Deep Deterministic algorithm (TD3) for single object tracking. The model builds on the deep reinforcement learning model Actor-Critic (AC), in which the Actor network predicts a continuous action that moves the target bounding box from the previous frame to the object position in the current frame and adapts it to the object size. The Critic network evaluates the confidence of the new bounding box online to determine whether the Critic model needs to be updated or re-initialized. Furthermore, our model uses the TD3 algorithm to optimize the AC model: two Critic networks jointly predict the bounding box confidence, and the smaller of the two predicted values is used as the label for updating the network parameters. This helps the Critic network avoid excessive estimation bias, accelerates the convergence of the loss function, and yields more accurate prediction values. In addition, a small amount of random noise with upper and lower bounds is added to the action in the Actor model, and the search area is reasonably expanded in offline learning to improve the robustness of the tracking method under strong background interference and fast object motion. The Critic model can also guide the Actor model to select the best action and continuously update the state of the tracked object. Comprehensive experimental results on the OTB-2013 and OTB-2015 benchmarks demonstrate that our tracker performs best in precision, robustness, and efficiency when compared with state-of-the-art methods.
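The two TD3 ingredients mentioned in the abstract (taking the smaller of two Critic predictions as the training label, and adding bounded random noise to the Actor's action) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the reward and discount formulation, the noise scale, the clipping bounds, and the 4-dimensional action layout are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def clipped_double_q_target(reward, q1_next, q2_next, done, gamma=0.99):
    """Build a TD3-style label from the smaller of two Critic estimates.

    gamma and the reward/done formulation are assumed, not taken from the paper.
    """
    q_min = np.minimum(q1_next, q2_next)  # pessimistic estimate limits bias
    return reward + gamma * (1.0 - done) * q_min

def add_bounded_noise(action, rng, sigma=0.1, noise_clip=0.2, low=-1.0, high=1.0):
    """Add Gaussian noise clipped to [-noise_clip, noise_clip] to an action.

    sigma, noise_clip, low, and high are hypothetical values.
    """
    noise = rng.normal(0.0, sigma, size=action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)  # upper and lower bounds
    return np.clip(action + noise, low, high)        # keep action in valid range

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical action layout: (dx, dy, dw, dh) offsets for the bounding box.
    action = np.array([0.05, -0.02, 0.0, 0.01])
    print(add_bounded_noise(action, rng))
    print(clipped_double_q_target(
        reward=np.array([1.0]),
        q1_next=np.array([0.8]),
        q2_next=np.array([0.6]),
        done=np.array([0.0]),
    ))
```

Taking the minimum makes the label deliberately conservative, which is how TD3 counteracts the tendency of a single Critic to overestimate values; the bounded noise keeps the perturbed action close to the Actor's prediction while still widening the region that is explored.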
