Raghav Goyal

PhD Candidate (Fall, 2019 - Present)
| in Computer Vision and Machine Learning
| at the University of British Columbia and Vector Institute
| advised by Prof. Leonid Sigal

Email: rgoyal14 (at) cs.ubc.ca
Other: Google Scholar | GitHub | Blog | LinkedIn

My research centers on efficient learning approaches (few-shot, weakly supervised) for structured output tasks (detection, segmentation) across images and videos.
During my PhD, I've spent time as an intern at Google Research (2023) and Meta AI (2022) working on video understanding.
Prior to this, I spent three years at Twenty Billion Neurons GmbH (20bn) in Berlin, Germany, working on video understanding under the supervision of Roland Memisevic, PhD.
I obtained an Integrated M.Tech. (5-year programme) from IIT Delhi, majoring in Mathematics and Computing.


Publications


Extending Video Masked Autoencoders to 128 frames
Nitesh B. Gundavarapu*, Luke Friedman*, Raghav Goyal*, Chaitra Hegde*, Eirikur Agustsson, Sagar M. Waghmare,
Mikhail Sirotenko, Ming-Hsuan Yang, Tobias Weyand, Boqing Gong, Leonid Sigal (* equal contribution)
In NeurIPS. Vancouver, Canada. 2024. [pdf]


TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking
Raghav Goyal*, Wan-Cyuan Fan*, Mennatullah Siam, Leonid Sigal (* equal contribution)
In WACV. Tucson, USA. 2025. [pdf]


MINOTAUR: Multi-task Video Grounding From Multimodal Queries
Raghav Goyal, Effrosyni Mavroudi, Xitong Yang, Sainbayar Sukhbaatar, Leonid Sigal, Matt Feiszli, Lorenzo Torresani, Du Tran
In arXiv. 2302.08063. [pdf]

A Simple Baseline for Weakly-Supervised Human-centric Relation Detection
Raghav Goyal, Leonid Sigal
In BMVC. Virtual. 2021. [pdf]

UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation
Siddhesh Khandelwal*, Raghav Goyal*, Leonid Sigal (* equal contribution)
In CVPR. Virtual. 2021. [pdf]

Improved Few-Shot Visual Classification
Peyman Bateni, Raghav Goyal, Vaden Masrani, Frank Wood, Leonid Sigal
In CVPR. Seattle, USA. 2020. [pdf]

Evaluating visual "common sense" using fine-grained classification and captioning tasks
Raghav Goyal, Farzaneh Mahdisoltani, Guillaume Berger, Waseem Gharbieh, Ingo Bax, Roland Memisevic
In ICLR Workshop. Vancouver, Canada. 2018. [pdf]


The "something something" video database for learning and evaluating visual common sense
Raghav Goyal, Samira Ebrahimi Kahou, Vincent Michalski, *, Ingo Bax, Roland Memisevic (* see paper for additional authors)
In ICCV. Venice, Italy. 2017. [pdf] [supp] [code] [data]



Natural Language Generation through Character-based RNNs with Finite-state Prior Knowledge
Raghav Goyal, Marc Dymetman, Eric Gaussier
In COLING. Osaka, Japan. 2016. [pdf]

ML Challenges


Miscellaneous