In this paper we compare the performance of a number of multiple-instance learning (MIL) and group-based (GB) classification algorithms on both a synthetic dataset and a real-world Pap smear dataset. We utilise the synthetic dataset to demonstrate that performance improves as both bag size and the percentage of positives increase, and that MIL algorithms outperform GB algorithms when the percentage of positives is less than 50%. However, as the positive bags become increasingly homogeneous, as is apparent in the real-world dataset, the two approaches become comparable. This result highlights that the performance of a MIL or GB algorithm is maximised when the algorithm's MIL assumption matches the reality of the dataset. Accordingly, on the Pap smear dataset, algorithms with a more generalised MIL assumption demonstrate the strongest performance.