
A METHOD FOR TRAINING A CONVOLUTIONAL NEURAL NETWORK FOR IMAGE RECOGNITION USING IMAGE-CONDITIONED MASKED LANGUAGE MODELING


Abstract

A method and system pre-train a convolutional neural network for image recognition using masked language modeling through the following steps: inputting an image into the convolutional neural network; outputting, from the convolutional neural network, a visual embedding tensor of visual embedding vectors; tokenizing a caption to generate a list of tokens, wherein at least one token has a visual correspondence to the image received by the convolutional neural network; randomly selecting one token in the list to be masked, the selected token being taken as the ground truth; computing hidden representations of the tokens using a language model neural network; attentively pooling the visual embedding vectors in the visual embedding tensor, using the hidden representation of the masked token as the query vector; predicting the masked token by mapping the pooled visual embedding vector to the tokens; determining a prediction loss associated with the masked token; and backpropagating the prediction loss to the convolutional neural network to adjust its parameters.
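The forward pass of the steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: random arrays stand in for the convolutional network's output and for the language model's hidden state, the scaled dot-product form of the attention is assumed, and the names `masked_token_loss` and `W_vocab` are hypothetical. Backpropagation is omitted; in practice an automatic-differentiation framework would propagate the loss back into the convolutional network.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def masked_token_loss(visual, query, W_vocab, target_id):
    """Attention-pool visual vectors with the masked token's hidden state.

    visual:    (N, D) visual embedding vectors (e.g. a flattened H*W grid)
    query:     (D,)   hidden representation of the masked token
    W_vocab:   (V, D) hypothetical linear map from pooled vector to token logits
    target_id: index of the masked (ground-truth) token in the vocabulary
    """
    # Assumption: scaled dot-product scores; the abstract does not fix the form.
    scores = visual @ query / np.sqrt(visual.shape[1])
    attn = softmax(scores)            # (N,) attention weights over locations
    pooled = attn @ visual            # (D,) pooled visual embedding vector
    probs = softmax(W_vocab @ pooled) # (V,) predicted token distribution
    loss = -np.log(probs[target_id])  # cross-entropy prediction loss
    return loss, attn, probs

rng = np.random.default_rng(0)
tokens = ["a", "dog", "on", "grass"]       # toy tokenized caption
mask_pos = int(rng.integers(len(tokens)))  # randomly select the token to mask
target_id = mask_pos                       # toy vocabulary = the caption tokens
visual = rng.normal(size=(49, 8))          # stand-in for a 7x7 CNN feature map
query = rng.normal(size=8)                 # stand-in for the language model output
W_vocab = rng.normal(size=(len(tokens), 8))

loss, attn, probs = masked_token_loss(visual, query, W_vocab, target_id)
print(f"masked token: {tokens[mask_pos]!r}, loss: {loss:.4f}")
```

Because the query comes from the masked token's textual context, the attention weights indicate which image locations the model consults to recover that token, which is what ties the prediction loss to the visual features.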
