Raghav Goyal

PhD Candidate (Fall, 2019 - Present)
| in Computer Vision and Machine Learning
| at the University of British Columbia and Vector Institute
| advised by Prof. Leonid Sigal

Email: rgoyal14 (at) cs.ubc.ca
Other: Google Scholar | GitHub | Blog | LinkedIn

My research centers on data-efficient learning approaches (few-shot, weakly supervised) for structured output tasks (detection, segmentation) across images and videos.
During my PhD, I've spent time as an intern at Google Research (2023) and Meta AI (2022) working on video understanding.
Prior to this, I spent three years at Twenty Billion Neurons GmbH (20bn) in Berlin, Germany, working on video understanding under the supervision of Roland Memisevic, PhD.
I obtained an Integrated M.Tech. (5-year programme) from IIT Delhi, majoring in Mathematics and Computing.

* I'm looking for full-time industry roles. If you see a fit, please reach out!

Publications

	TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking Raghav Goyal, Wan-Cyuan Fan, Mennatullah Siam, Leonid Sigal (* equal contribution) In arXiv. 2312.08514. [pdf]

	MINOTAUR: Multi-task Video Grounding From Multimodal Queries Raghav Goyal, Effrosyni Mavroudi, Xitong Yang, Sainbayar Sukhbaatar, Leonid Sigal, Matt Feiszli, Lorenzo Torresani, Du Tran In arXiv. 2302.08063. [pdf]

	A Simple Baseline for Weakly-Supervised Human-centric Relation Detection Raghav Goyal, Leonid Sigal In BMVC. Virtual. 2021. [pdf]

	UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation Siddhesh Khandelwal, Raghav Goyal, Leonid Sigal (* equal contribution) In CVPR. Virtual. 2021. [pdf]

	Improved Few-Shot Visual Classification Peyman Bateni, Raghav Goyal, Vaden Masrani, Frank Wood, Leonid Sigal In CVPR. Seattle, USA. 2020. [pdf]

	Evaluating visual "common sense" using fine-grained classification and captioning tasks Raghav Goyal, Farzaneh Mahdisoltani, Guillaume Berger, Waseem Gharbieh, Ingo Bax, Roland Memisevic In ICLR Workshop. Vancouver, Canada. 2018. [pdf]

	The "something something" video database for learning and evaluating visual common sense Raghav Goyal, Samira Ebrahimi Kahou, Vincent Michalski, ..., Ingo Bax, Roland Memisevic (see paper for additional authors) In ICCV. Venice, Italy. 2017. [pdf] [supp] [code] [data]

	Natural Language Generation through Character-based RNNs with Finite-state Prior Knowledge Raghav Goyal, Marc Dymetman, Eric Gaussier In COLING. Osaka, Japan. 2016. [pdf]

ML Challenges

(Sep, 2018) Placed 3rd in the Visual Dialog challenge, hosted as part of the SIVL workshop at ECCV'18. Rankings can be found here.
(Jul, 2017) Placed 3rd in the Kinetics video recognition challenge, hosted by DeepMind at the ActivityNet workshop at CVPR'17, with our approach detailed in this blog post.

Miscellaneous

Reviewer: ECCV'20, ICLR'21, CVPR'21