How Computer Vision and ML are advancing Neuroscience: Nature Methods journal showcases UBC researcher’s work

As neuroscience continues to unravel the mysteries of the brain and the behaviours it produces, one discipline has greatly accelerated the field's progress: computer science.

In recent decades, advances in machine learning and computer vision have been key contributors, thanks to researchers like Dr. Helge Rhodin of the University of British Columbia’s (UBC) Computer Science Department.

Dr. Rhodin and his interdisciplinary collaborators have achieved an impressive milestone in this area with 3D Pose Estimation research, and have thereby gained international exposure.

Nature Methods, one of the world’s most prestigious and most-read academic journals, has published two papers by Rhodin and his colleagues within the past six months, both on 3D Pose Estimation for laboratory animals. The papers are the result of years of interdisciplinary work among experts in Computer Vision, Machine Learning and Neurology on using machine learning techniques to estimate the 3D poses of humans and animals.

Digitizing the mouse

Historically, capturing 3D poses of laboratory animals has been cumbersome, requiring multiple cameras and labour-intensive effort in a controlled environment, and the results were only rough estimates. The machine learning methodology developed by Dr. Rhodin and his co-authors, however, produces 3D captures that are exceptionally accurate and overcomes these earlier barriers. The technique can also be readily adopted by a wide variety of laboratory studies, across numerous types of environments.

"Computer Vision is finding a prominent place within neuroscience." ~ Dr. Helge Rhodin, UBC Computer Science

The first of the two papers to appear in Nature Methods (April 2021), entitled A three-dimensional virtual mouse generates synthetic training data for behavioral analysis, details research conducted by Rhodin and his collaborators from UBC Psychiatry and the UBC Department of Oral Biological and Medical Sciences.

What is the difference between 2D and 3D Pose Estimation?

2D Pose Estimation is predicting the location of body joints in an image, in pixel values. 3D Pose Estimation is predicting a three-dimensional spatial arrangement of all the body joints that is suitable for driving a computer graphics model, e.g., animating the developed virtual mouse model.
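
To make the distinction concrete, here is a minimal sketch of how the two representations relate. It is an illustration only, not code from either paper: the joint names, coordinates and camera parameters are invented, and the camera is assumed to be an ideal pinhole camera with no lens distortion.

```python
from typing import Dict, Tuple

Point3D = Tuple[float, float, float]  # (x, y, z) in camera coordinates, e.g. millimetres
Point2D = Tuple[float, float]         # (u, v) in pixels


def project_pinhole(p: Point3D, focal_px: float, cx: float, cy: float) -> Point2D:
    """Project a 3D point onto the image plane of an ideal pinhole camera."""
    x, y, z = p
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_px * x / z + cx, focal_px * y / z + cy)


# A hypothetical 3D pose: one 3D position per named joint (values are invented).
pose_3d: Dict[str, Point3D] = {
    "nose": (0.0, 0.0, 200.0),
    "left_forepaw": (-10.0, 20.0, 200.0),
    "tail_base": (40.0, 10.0, 250.0),
}

# The matching 2D pose: one pixel location per joint, as seen by one camera.
pose_2d: Dict[str, Point2D] = {
    name: project_pinhole(p, focal_px=500.0, cx=320.0, cy=240.0)
    for name, p in pose_3d.items()
}

print(pose_2d["left_forepaw"])  # (295.0, 290.0)
```

Going from 3D to 2D is a simple projection, as above. The reverse direction is the hard part, because the depth of each joint is lost in the image.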

The researchers created a synthetic three-dimensional animated mouse based on computed cross-sectional scans, and used it to generate synthetic behavioural data with ‘ground-truth’ label locations. From this data, they created synthetic videos of realistic-acting mice, then used the videos to train 2D and 3D pose estimation models. The results were as accurate as those obtained from typical manually labelled training datasets, and the 3D pose estimations separated behavioural clusters much more clearly than 2D videos did.

“Using a machine learning technique called Unpaired Image Translation, we can adapt a domain and easily move the synthetic data to an entirely different synthetic environment,” Helge explained. “The old method of capturing data was much more restrictive and non-adaptable.”

The paper concludes: The 3D model space could be thought of as a mouse common behavioral framework, representing movements in a space less subject to variation due to camera placement, lighting and other scene-related factors than typical laboratory recording conditions.

Figure 1: Demonstration in a multi-camera synthetic setup with shared labels suitable for machine learning training.

Digitizing fruit flies and other animals

The second paper co-authored by Dr. Rhodin in Nature Methods, published in August 2021, is entitled LiftPose3D, a deep learning-based approach for transforming two-dimensional to three-dimensional poses in laboratory animals.

In kinematic studies of lab animals, marker-less 3D Pose Estimation has become the gold standard. But most methods require multiple synchronized cameras and elaborate protocols that hinder their widespread adoption.
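
As a rough illustration of why multi-camera methods need synchronization and calibration, the sketch below triangulates one joint from two views. It assumes the simplest possible case, which is not taken from the paper: two identical pinhole cameras facing the same direction, with the right camera shifted sideways by a known baseline, and frames captured at the same instant. All names and numbers are hypothetical.

```python
from typing import Tuple


def triangulate_rectified(
    u_left: float,
    u_right: float,
    v: float,
    focal_px: float,
    cx: float,
    cy: float,
    baseline: float,
) -> Tuple[float, float, float]:
    """Recover (x, y, z) in the left camera's frame from a matched joint.

    Assumes two identical, parallel pinhole cameras; the right camera sits
    `baseline` units along the left camera's x axis.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    z = focal_px * baseline / disparity  # depth from horizontal shift between views
    x = (u_left - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


# The same joint detected at column 345 in the left image and 220 in the right.
joint = triangulate_rectified(
    u_left=345.0, u_right=220.0, v=290.0,
    focal_px=500.0, cx=320.0, cy=240.0, baseline=50.0,
)
print(joint)  # (10.0, 20.0, 200.0)
```

Every quantity other than the pixel coordinates (focal length, image centre, baseline) has to come from calibration, and the two detections must come from the same moment in time, which is where the elaborate protocols come in.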

“LiftPose3D overcomes these hindrances by reconstructing 3D poses from a single 2D camera and mirror,” said Helge. “It works where 3D triangulation cannot, and has been successful with a variety of experimental animal types like fruit flies, mice, rats and macaques.” Helge explains that LiftPose3D enables high-quality 3D pose estimation without complex camera arrays or tedious calibration of different camera angles. The simpler capture process makes analysis and manipulation both easier and faster.

Figure 2: 3D Pose Estimation camera setup

Helge credits his collaborators from École polytechnique fédérale de Lausanne, where he worked before joining UBC. “Grad student Semih Günel and postdoc Adam Gosztolai are the ones who realized our vision.” Helge’s previous work in this domain helped to inform and advance this area of research, and he continues to collaborate with researchers in Neuroscience, saying, “I am still doing follow-up work with my collaborators, giving the synthetic mouse more fine-grained details for enabling a dense reconstruction.”

Helge also co-organized a workshop at the Conference on Computer Vision and Pattern Recognition (CVPR) 2021, with his Canadian, US and European colleagues. “We called the workshop Computer Vision for Animals, and it was a mini-conference in itself with 150 participants,” Helge explained. “It was a great success, with the leading computer vision and neuroscience experts all in one place. We want to establish the workshop as an annual event.”

The vision is clear

Helge explains how their research furthers neuroscience. “The big application with 3D Pose Estimation research is the enablement of better monitoring of the animal’s behaviour. Neuroscience researchers know how to capture the neural activity with appropriate devices, but they lack an easy setup for behaviour analysis. And the easier it is, the more it can be scaled up to different animal types.” Helge says that with machine learning, they can aim for a setup that is as simple as possible and requires the least amount of data. Classical techniques, by contrast, do not adapt well to new conditions. “If you were to change the light source or the breed of mouse, the results would be inaccurate. You would have to retrain the simulation entirely.”

By applying machine learning to 3D Pose Estimation, their approach makes the synthetic images look more realistic, and the simulator generates training datasets that can be adapted to new conditions.

“Computer Vision is finding a prominent place within neuroscience,” Helge said. “I am very happy about that.”