IEEE Transactions on Pattern Analysis and Machine Intelligence

Pixel Objectness: Learning to Segment Generic Objects Automatically in Images and Videos

Abstract

We propose an end-to-end learning framework for segmenting generic objects in both images and videos. Given a novel image or video, our approach produces a pixel-level mask for all "object-like" regions, even for object categories never seen during training. We formulate the task as a structured prediction problem of assigning an object/background label to each pixel, implemented using a deep fully convolutional network. When applied to a video, our model further incorporates a motion stream, and the network learns to combine both appearance and motion and attempts to extract all prominent objects whether they are moving or not. Beyond the core model, a second contribution of our approach is how it leverages varying strengths of training annotations. Pixel-level annotations are quite difficult to obtain, yet crucial for training a deep network approach for segmentation. Thus we propose ways to exploit weakly labeled data for learning dense foreground segmentation. For images, we show the value in mixing object category examples with image-level labels together with relatively few images with boundary-level annotations. For video, we show how to bootstrap weakly annotated videos together with the network trained for image segmentation. Through experiments on multiple challenging image and video segmentation benchmarks, our method offers consistently strong results and improves the state of the art for fully automatic segmentation of generic (unseen) objects. In addition, we demonstrate how our approach benefits image retrieval and image retargeting, both of which flourish when given our high-quality foreground maps. Code, models, and videos are at: http://vision.cs.utexas.edu/projects/pixelobjectness/
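To make the core formulation concrete, here is a minimal PyTorch sketch of the two ideas the abstract names: a fully convolutional network that assigns an object/background label to every pixel, plus a second motion stream (fed, e.g., with optical flow) whose features are fused with the appearance stream for video. This is not the authors' released model; the tiny backbones, channel widths, and fusion-by-concatenation are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Stream(nn.Module):
    """Tiny fully convolutional encoder; stands in for the deep
    appearance/motion backbones used in the paper."""
    def __init__(self, in_channels):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.features(x)

class TwoStreamPixelObjectness(nn.Module):
    def __init__(self):
        super().__init__()
        self.appearance = Stream(3)   # RGB frame
        self.motion = Stream(2)       # assumed 2-channel optical flow field
        # 1x1 conv maps fused features to object/background logits per pixel
        self.classifier = nn.Conv2d(64 + 64, 2, kernel_size=1)

    def forward(self, rgb, flow):
        # fuse appearance and motion features by channel concatenation
        fused = torch.cat([self.appearance(rgb), self.motion(flow)], dim=1)
        logits = self.classifier(fused)
        # upsample back to input resolution for a dense pixel-level mask
        return F.interpolate(logits, size=rgb.shape[-2:],
                             mode="bilinear", align_corners=False)

model = TwoStreamPixelObjectness()
rgb = torch.randn(1, 3, 128, 128)    # one video frame
flow = torch.randn(1, 2, 128, 128)   # its flow (assumed precomputed)
logits = model(rgb, flow)            # shape (1, 2, 128, 128)
mask = logits.argmax(dim=1)          # binary foreground/background mask

# training reduces to per-pixel cross-entropy against a ground-truth mask
target = torch.randint(0, 2, (1, 128, 128))
loss = F.cross_entropy(logits, target)
```

Concatenation followed by a 1x1 classifier is only one simple fusion choice; the point of the sketch is the structure of the prediction problem: two convolutional feature streams, a per-pixel two-way classification, and a loss computed densely over the image, which is what lets the network flag prominent objects whether or not they are moving.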
