A METHOD FOR TRAINING A CONVOLUTIONAL NEURAL NETWORK FOR IMAGE RECOGNITION USING IMAGE-CONDITIONED MASKED LANGUAGE MODELING
Abstract
A method and system pre-train a convolutional neural network for image recognition based on masked language modeling through the following steps: inputting an image into the convolutional neural network; outputting, from the convolutional neural network, a visual embedding tensor of visual embedding vectors; tokenizing a caption associated with the image to generate a list of tokens, wherein at least one token has a visual correspondence to the image received by the convolutional neural network; randomly selecting one of the tokens in the list to be masked, the selected token being taken as the ground truth; computing hidden representations of the tokens using a language model neural network; attentively pooling the visual embedding vectors in the visual embedding tensor using the hidden representation of the masked token as a query vector; predicting the masked token by mapping the pooled visual embedding vector to the tokens; determining a prediction loss associated with the masked token; and backpropagating the prediction loss to the convolutional neural network to adjust its parameters.
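The pooling, prediction, and loss steps can be illustrated with a minimal NumPy sketch of the forward pass only. Everything specific here is an assumption made for illustration and is not taken from the patent: the scaled dot-product scoring, the softmax over spatial positions, the cross-entropy loss, the tensor sizes, and the names `attention_pool`, `masked_token_loss`, and `token_embeddings`. Random arrays stand in for the outputs of the convolutional neural network and the language model, and the query is assumed to have the same dimension as the visual embedding vectors.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def attention_pool(visual, query):
    """Pool an (H, W, D) visual embedding tensor into one D-vector.

    Assumption: scaled dot-product scores between the query and each
    spatial embedding, normalized with a softmax.
    """
    h, w, d = visual.shape
    flat = visual.reshape(h * w, d)        # one embedding per spatial cell
    scores = flat @ query / np.sqrt(d)     # similarity of each cell to the query
    weights = softmax(scores)              # attention weights, sum to 1
    pooled = weights @ flat                # weighted average of the embeddings
    return pooled, weights


def masked_token_loss(pooled, token_embeddings, target_id):
    """Map the pooled vector to vocabulary logits and score the true token.

    Assumption: a linear map given by a (V, D) token embedding matrix,
    followed by cross-entropy against the masked (ground-truth) token.
    """
    logits = token_embeddings @ pooled
    probs = softmax(logits)
    return -np.log(probs[target_id]), probs


rng = np.random.default_rng(0)
visual = rng.normal(size=(7, 7, 16))           # stand-in for the CNN output
query = rng.normal(size=16)                    # stand-in for the masked token's hidden representation
token_embeddings = rng.normal(size=(100, 16))  # hypothetical vocabulary of 100 tokens
target_id = 42                                 # hypothetical index of the masked token

pooled, weights = attention_pool(visual, query)
loss, probs = masked_token_loss(pooled, token_embeddings, target_id)
print(pooled.shape, float(loss))
```

In an actual training loop, the loss would be computed in an automatic differentiation framework so that its gradient can be backpropagated into the convolutional neural network; this sketch stops at the loss value.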