Paper Title
PiCO+: Contrastive Label Disambiguation for Robust Partial Label Learning
Paper Authors
Abstract
Partial label learning (PLL) is an important problem in which each training example is labeled with a coarse candidate set, a setting that suits many real-world data annotation scenarios with label ambiguity. Despite this promise, the performance of PLL often lags behind that of its supervised counterpart. In this work, we bridge the gap by addressing two key research challenges in PLL, representation learning and label disambiguation, in one coherent framework. Specifically, our proposed framework PiCO consists of a contrastive learning module along with a novel class prototype-based label disambiguation algorithm. PiCO produces closely aligned representations for examples from the same class and facilitates label disambiguation. Theoretically, we show that these two components are mutually beneficial and can be rigorously justified from an expectation-maximization (EM) algorithm perspective. Moreover, we study a challenging yet practical noisy partial label learning setup, where the ground-truth label may not be included in the candidate set. To remedy this problem, we present an extension, PiCO+, that performs distance-based clean sample selection and learns robust classifiers with a semi-supervised contrastive learning algorithm. Extensive experiments demonstrate that our proposed methods significantly outperform the current state-of-the-art approaches in standard and noisy PLL tasks and even achieve results comparable to fully supervised learning.
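
The abstract names a class prototype-based label disambiguation algorithm but gives no update rules. The sketch below shows one plausible form of such a step for a single example, for illustration only. Everything in it is an assumption rather than the authors' implementation: the function name disambiguate_step, the moving-average updates, the momentum values 0.9 and 0.99, and the choice to update the prototype of the nearest candidate class.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def disambiguate_step(embedding, candidates, pseudo_target, prototypes,
                      target_momentum=0.9, proto_momentum=0.99):
    """One hypothetical prototype-based disambiguation update for one example.

    embedding:     unit-norm feature vector of the example
    candidates:    set of candidate class indices
    pseudo_target: current soft label over all classes (list of floats)
    prototypes:    one unit-norm vector per class, updated in place
    """
    # Pick the candidate class whose prototype is most similar to the embedding.
    best = max(candidates, key=lambda c: dot(embedding, prototypes[c]))

    # Move the soft label toward the one-hot vector of that class (assumed rule).
    new_target = [
        target_momentum * p + (1.0 - target_momentum) * (1.0 if c == best else 0.0)
        for c, p in enumerate(pseudo_target)
    ]

    # Pull the chosen prototype toward the embedding, then re-normalize.
    mixed = [proto_momentum * m + (1.0 - proto_momentum) * e
             for m, e in zip(prototypes[best], embedding)]
    prototypes[best] = normalize(mixed)
    return best, new_target


# Toy setup: three classes in a 2-D embedding space, candidate set {0, 1}.
prototypes = [normalize([1.0, 0.0]), normalize([0.0, 1.0]), normalize([-1.0, 0.0])]
candidates = {0, 1}
pseudo_target = [0.5, 0.5, 0.0]  # uniform over the candidate set
embedding = normalize([0.2, 0.9])

chosen, pseudo_target = disambiguate_step(
    embedding, candidates, pseudo_target, prototypes
)
```

In this toy run the embedding lies closest to the prototype of class 1, so the soft label shifts a little toward class 1 while the non-candidate class 2 keeps zero mass. Repeating the step over many examples is what would gradually sharpen the candidate sets into single labels under these assumptions.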
