Aligning symbols and objects using co-attention for understanding visual content
Abstract
A method, apparatus, and system for understanding visual content include: determining at least one region proposal for an image; attending to at least one symbol of the proposed image region; attending to a portion of the proposed image region using information regarding the attended symbol; extracting appearance features of the attended portion of the proposed image region; fusing the appearance features of the attended image region with features of the attended symbol; projecting the fused features into a semantic embedding space that has been trained using fused attended appearance features and attended symbol features of images having known descriptive messages; computing a similarity measure between the projected, fused features and fused attended appearance features and attended symbol features embedded in the semantic embedding space having at least one associated descriptive message; and predicting a descriptive message for an image associated with the projected, fused features.
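
The steps listed in the abstract can be illustrated with a minimal sketch. Everything in it beyond the abstract is an assumption made for illustration: symbol and region features are taken to be plain vectors of equal length, attention is a softmax over dot products, the mean region feature serves as the query for symbol attention, fusion is concatenation, the projection is a single linear map, and the similarity measure is cosine similarity. All function names (`attend`, `co_attend_and_fuse`, `project`, `predict_message`) are hypothetical, and the patent may specify different choices for each step.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, items):
    """Softmax-weighted average of `items`, scored by dot product with `query`."""
    weights = softmax([dot(query, item) for item in items])
    dim = len(items[0])
    return [sum(w * item[d] for w, item in zip(weights, items)) for d in range(dim)]


def mean(items):
    n = len(items)
    return [sum(item[d] for item in items) / n for d in range(len(items[0]))]


def co_attend_and_fuse(symbol_feats, region_feats):
    # Attend to the symbols (assumption: the mean region feature is the query).
    attended_symbol = attend(mean(region_feats), symbol_feats)
    # Attend to portions of the region using the attended symbol.
    attended_region = attend(attended_symbol, region_feats)
    # Fuse appearance and symbol features (assumption: concatenation).
    return attended_region + attended_symbol


def project(weights, x):
    """Linear projection into the embedding space (assumed to be a learned matrix)."""
    return [dot(row, x) for row in weights]


def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot(u, v) / (nu * nv)


def predict_message(query_vec, embedded):
    """Return the message of the most similar (vector, message) pair."""
    return max(embedded, key=lambda pair: cosine(query_vec, pair[0]))[1]
```

In use, `co_attend_and_fuse` would be applied to the features of one region proposal, the result passed through `project`, and `predict_message` called with the stored embeddings of training images and their known descriptive messages. Training the projection itself is outside the scope of this sketch.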