Multimedia Robotics

The collection of data is far from a trivial task in multimedia research. Even though a vast array of media is immediately available on the World Wide Web, it is difficult to find large collections of consistently labeled and properly organized data. Public and private collections exist, but they suffer from biases in how they were gathered, and their labeling is often quite poor. In the end, we need more general sets of images to properly validate our models. To aid in collecting such data, we are currently working on a control system for a self-guided robot photographer that requests intermittent feedback in order to semantically label the image data (the labeling could conceivably be done through speech recognition).

Our work on multimedia robots is not only about collecting data; it is also of more general interest to the robotics community. We want to be able to interact with robots using sound, touch, speech and visual cues, and simultaneously we want the robot to learn from its interactions with humans.

This project is in collaboration with the Robot Partners Group at the Laboratory for Computational Intelligence.

Jose, the robot developed by the Robot Partners Group at UBC and our future photographer.

Segmentation for robots
Image segmentation has been an active area of research for a long time, and it continues to receive a great deal of attention today (see research on BlobWorld, Normalized Cuts and Mean Shift). All image segmentation algorithms are inherently limited by the feature sets used to split the image into distinct regions; the features are usually colour, texture and position. Establishing a real-world interactive system for the acquisition of images is a boon to segmentation, since it allows us to incorporate additional non-traditional information such as depth, sound and semantic cues.
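To make the feature idea concrete, here is a minimal sketch (not the system described above) of how per-pixel colour, position and depth can be combined into one feature vector and clustered into regions. It uses a small hand-rolled k-means in numpy; the toy image, feature weighting and cluster count are all illustrative assumptions.

```python
import numpy as np

def kmeans_segment(features, k=4, iters=10, seed=0):
    """Cluster per-pixel feature vectors with a few rounds of k-means."""
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    # initialize centres from randomly chosen pixels
    centres = features[rng.choice(n, size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each pixel to its nearest centre (Euclidean distance)
        d = np.linalg.norm(features[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each non-empty centre to the mean of its assigned pixels
        for j in range(k):
            if np.any(labels == j):
                centres[j] = features[labels == j].mean(axis=0)
    return labels

# Toy 8x8 "scene": colour splits it left/right, depth splits it near/far,
# so colour alone cannot recover all four regions but colour + depth can.
h, w = 8, 8
ys, xs = np.mgrid[0:h, 0:w]
colour = np.where(xs < 4, 0.2, 0.8)
depth = np.where(ys < 4, 0.1, 0.9)
feats = np.stack([colour.ravel(), depth.ravel(),
                  ys.ravel() / h, xs.ravel() / w], axis=1)
labels = kmeans_segment(feats, k=4).reshape(h, w)
```

The same scheme extends to any extra channel the robot can sense: each new cue (sound direction, a semantic tag) simply becomes another column of the feature matrix.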

Another important constraint on segmentation for robots is speed: the algorithm must keep up with the robot's interaction with the world.
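One common way to meet such a time budget (an illustrative technique, not necessarily the one used here) is to segment a subsampled copy of the image and project the labels back up, since segmentation cost grows with pixel count:

```python
import numpy as np

# A stand-in for a 640x480 colour frame from the stereo camera.
img = np.random.rand(480, 640, 3)

# Stride-4 subsampling: 16x fewer pixels to segment.
small = img[::4, ::4]

# After segmenting `small`, labels can be upsampled back to full size
# by repeating each label over its 4x4 block.
small_labels = np.zeros(small.shape[:2], dtype=int)  # placeholder result
full_labels = np.repeat(np.repeat(small_labels, 4, axis=0), 4, axis=1)
```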

Image information from Jose's stereo input. Clockwise from the top-left corner: the colour image; the segmentation using Normalized Cuts; the combined segmentation and colour image for viewing the results; and the depth information (darker means farther away).