Bayesian Formulations of Multiple Instance Learning with Applications to General Object Recognition

By Hendrik Kueck

We attack the problem of general object recognition by learning probabilistic, nonlinear object classifiers for a large number of object classes. The individual classifiers allow for detection and localization of objects belonging to a certain class in an image by classifying image regions into those that likely show such an object and those that do not. Instead of relying on expensive supervised training data, we propose an approach for learning such classifiers from annotated images. One major problem to overcome in this scenario is the ambiguity due to the unknown associations between annotation words and image regions. We propose a fully Bayesian technique for learning probabilistic classifiers, which deals with this ambiguity in a principled way by integrating out the uncertainty in both the unknown associations and the parameters of our probabilistic model. Our approach uses an extremely flexible kernel classifier trained with an efficient Markov chain Monte Carlo technique.
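The association ambiguity and its Bayesian treatment can be illustrated with a toy sketch. This is not the thesis's kernel model: it assumes 1-D region features, a simple two-Gaussian likelihood, and Gibbs sampling over the unknown region labels, under the multiple-instance constraint that each annotated image contains at least one region showing the object.

```python
import math
import random

random.seed(0)

# Toy multiple-instance setup (illustrative only): each "image" is a
# bag of three 1-D region features; every bag carries the annotation
# word, but which region actually shows the object is unknown.
mu_pos, mu_neg, sigma = 2.0, 0.0, 0.5

def loglik(x, positive):
    """Log-likelihood of feature x under the positive/negative class."""
    mu = mu_pos if positive else mu_neg
    return -((x - mu) ** 2) / (2 * sigma ** 2)

bags, true_idx = [], []  # true_idx is kept only to check the result
for _ in range(20):
    bag = [random.gauss(mu_neg, sigma) for _ in range(3)]
    j = random.randrange(3)
    bag[j] = random.gauss(mu_pos, sigma)
    bags.append(bag)
    true_idx.append(j)

# Gibbs sampling over the unknown region labels: instead of committing
# to one association, we average over all labelings consistent with
# the constraint "at least one positive region per annotated image".
labels = [[1, 0, 0] for _ in bags]   # arbitrary valid starting state
counts = [[0, 0, 0] for _ in bags]
burn_in, n_sweeps = 100, 200
for sweep in range(n_sweeps):
    for b, bag in enumerate(bags):
        for i, x in enumerate(bag):
            if labels[b][i] == 1 and sum(labels[b]) == 1:
                continue  # flipping would violate the bag constraint
            p1 = math.exp(loglik(x, True))
            p0 = math.exp(loglik(x, False))
            labels[b][i] = 1 if random.random() < p1 / (p1 + p0) else 0
        if sweep >= burn_in:
            for i in range(3):
                counts[b][i] += labels[b][i]

# Monte Carlo estimate of the posterior probability that each region
# shows the object.
post = [[c / (n_sweeps - burn_in) for c in row] for row in counts]
```

In the thesis setting the per-region likelihood comes from the kernel classifier and its parameters are sampled as well; the sketch fixes the likelihood so that only the association uncertainty remains visible.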

A further problem in this setting is class imbalance, which we address in two different ways. First, we propose a new problem formulation in which we provide our training algorithm with additional information, namely estimates of the number of image regions showing the object class in the given training images. Second, we experiment with altering the distribution of the training data itself. Using only an automatically and appropriately chosen subset of the available training images that do not contain the object class of interest leads to a more balanced class distribution as well as a remarkable speedup in training and testing. The value of the new techniques is demonstrated on synthetic and real image data sets.
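The subsampling idea can be sketched as follows. The random-selection criterion here is only a placeholder for the automatic choice described above, and the dataset sizes and the negatives-per-positive ratio are assumptions for illustration:

```python
import random

random.seed(1)

# Toy dataset: a few images annotated with the class of interest,
# and many more images without it (the usual imbalanced situation).
positives = [f"pos_{i}" for i in range(10)]
negatives = [f"neg_{i}" for i in range(200)]

# Placeholder selection: randomly subsample the negative images down
# to a small multiple of the positives. The result is a far more
# balanced training set, and much less data to train and test on.
ratio = 2  # assumed target of negatives per positive
subset = random.sample(negatives, ratio * len(positives))
train = positives + subset
```

With 10 positive and 200 negative images this shrinks the training set from 210 to 30 images while moving the class ratio from 1:20 to 1:2.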
