V3dr is the first benchmark for the quantitative evaluation of mocap retrieval. V3dr uses 2000 files from the CMU mocap dataset as its database and provides a set of video queries. For a more detailed explanation of the benchmark, see our paper below.
Video queries taken from YouTube with their top retrieved mocap sequences.
To evaluate retrieval, we provide mocap sequences annotated with action labels. We choose seven day-to-day action classes (pick-up, sit-down, get-up, walk, punch, kick, throw) and annotate 4.5 hours of mocap data per frame with these labels.
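With per-frame labels, a natural way to score a retrieved mocap sequence against a query's action class is the fraction of its frames carrying that label. The sketch below illustrates this idea; the function name and label strings are our own illustration, not the benchmark's actual file format or evaluation protocol.

```python
# Hypothetical sketch: score one retrieved mocap sequence against a query's
# action class using per-frame annotations. The label names and the helper
# below are assumptions for illustration, not the benchmark's schema.

def frame_match_score(frame_labels, query_action):
    """Fraction of frames in the retrieved sequence annotated with the
    query's action class."""
    if not frame_labels:
        return 0.0
    hits = sum(1 for lbl in frame_labels if lbl == query_action)
    return hits / len(frame_labels)

# Example: a short sequence mostly labelled "walk"
labels = ["walk"] * 80 + ["pick-up"] * 20
print(frame_match_score(labels, "walk"))  # 0.8
```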
We also provide a total of 320 short video sequences as queries that feature the same actions used for annotating mocap. The queries are taken from two sources: the INRIA XMAS dataset and YouTube.
We randomly pick 160 sequences from the IXMAS dataset. These videos have plain backgrounds and uniform clothing, but the viewpoints vary considerably.
We add another 160 query videos from YouTube. These queries have little variation in viewpoint, but the clothing and backgrounds are realistic.
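Given labelled queries from both sources, retrieval quality can be summarized with a standard metric such as mean precision@k over all queries. The following is a minimal sketch of that computation under our own assumptions about the data layout; it is a common retrieval metric, not necessarily the exact protocol used in the paper.

```python
# Hypothetical sketch: mean precision@k across a set of video queries.
# Each query contributes its action label plus the labels of its top
# retrieved mocap sequences, ranked by the retrieval system.

def precision_at_k(retrieved_labels, query_label, k):
    """Fraction of the top-k retrieved sequences matching the query label."""
    top = retrieved_labels[:k]
    return sum(1 for lbl in top if lbl == query_label) / k

def mean_precision_at_k(queries, k):
    """queries: list of (query_label, ranked list of retrieved labels)."""
    return sum(precision_at_k(r, q, k) for q, r in queries) / len(queries)

# Toy example with two queries and their top-5 retrievals
queries = [
    ("walk", ["walk", "walk", "kick", "walk", "punch"]),
    ("kick", ["kick", "punch", "kick", "walk", "kick"]),
]
print(mean_precision_at_k(queries, 5))  # 0.6
```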
The CMU mocap dataset can be downloaded from GitHub or from Google Sites.
The videos from the IXMAS dataset can be downloaded from the INRIA page. The subset of queries that we use as part of the benchmark is available here.
The set of video queries downloaded from YouTube is available here.
If you have questions or comments, please feel free to email us.