Sohrab Shah - UBC Computer Science

Please visit my new website at http://compbio.bccrc.ca

Introduction and array CGH

Welcome to Sohrab's website. I've moved. I'm now a Postdoctoral Research Fellow at the BC Cancer Agency/Dept of Pathology at UBC working on analysis of next-generation sequencing data of ovarian and breast cancer genomes. I'm a former PhD student in the Department of Computer Science at the University of British Columbia, where I worked on genomic profiling using array comparative genomic hybridization (aCGH). Specifically, I developed statistical models to analyse aCGH data. This work was done with Dr Raymond Ng and Dr Kevin Murphy, in collaboration with the BC Cancer Agency. It falls under the broad umbrella of bioinformatics, an area I have been involved in for the last 10 years.

Selected Publications and Presentations

Sohrab P. Shah, Ryan D. Morin, Jaswinder Khattra, Leah Prentice, Trevor Pugh, Angela Burleigh, Allen Delaney, Karen Gelmon, Ryan Giuliany, Janine Senz, Christian Steidl, Robert A. Holt, Steven Jones, Mark Sun, Gillian Leung, Richard Moore, Tesa Severson, Greg A. Taylor, Andrew E. Teschendorff, Kane Tse, Gulisa Turashvili, Richard Varhol, Rene L. Warren, Peter Watson, Yongjun Zhao, Carlos Caldas, David Huntsman, Martin Hirst, Marco A. Marra and Samuel Aparicio. Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution Nature. vol. 461, 809-813. (2009) [PDF]

Sohrab P. Shah, Ph.D., Martin Kobel, M.D., Janine Senz, B.Sc., Ryan D. Morin, M.Sc., Blaise A. Clarke, M.B., B.Ch., Kimberly C. Wiegand, B.Sc., Gillian Leung, B.Sc., Abdalnasser Zayed, B.Sc., Erika Mehl, B.M.L.Sc., Steve E. Kalloger, B.Sc., Mark Sun, B.Sc., Ryan Giuliany, Erika Yorida, B.M.L.Sc., Steven Jones, Ph.D., Richard Varhol, M.Sc., Kenneth D. Swenerton, M.D., Dianne Miller, M.D., Philip B. Clement, M.D., Colleen Crane, B.Tech., Jason Madore, M.Sc., Diane Provencher, M.D., Peter Leung, Ph.D., Anna DeFazio, Ph.D., Jaswinder Khattra, M.Sc., Gulisa Turashvili, M.D., Ph.D., Yongjun Zhao, M.Sc., D.V.M., Thomas Zeng, M.Sc., J.N. Mark Glover, Ph.D., Barbara Vanderhyden, Ph.D., Chengquan Zhao, M.D., Christine A. Parkinson, Ph.D., M.R.C.P., Mercedes Jimenez-Linan, Ph.D., David D.L. Bowtell, Ph.D., Anne-Marie Mes-Masson, Ph.D., James D. Brenton, M.D., F.R.C.P., Samuel A. Aparicio, B.M., B.Ch., Niki Boyd, Ph.D., Martin Hirst, Ph.D., C. Blake Gilks, M.D., Marco Marra, Ph.D., and David G. Huntsman, M.D. Mutation of FOXL2 in Granulosa-Cell Tumors of the Ovary New England Journal of Medicine. June 10, 2009 [FULLTEXT]

News:

Telegraph, UK

Forbes.com

Vancouver Sun

Editorial:

Cancer Genomes on a Shoestring Budget. Shendure and Stewart

Sohrab P. Shah, K-John Cheung, Jr, Nathalie A. Johnson, Guillaume Alain, Randy D. Gascoyne, Douglas E. Horsman, Raymond T. Ng and Kevin P. Murphy. Model-based clustering of array CGH data Bioinformatics 2009 25(12):i30-i38 [PDF] [SOFTWARE]

Shah SP. Computational methods for identification of recurrent copy number alteration patterns by array CGH. Cytogenetic and Genome Research In press.

Cheung KJ*, Shah SP*, Steidl C, Johnson N, Relander T, Telenius A, Lai B, Murphy KP, Lam W, Al-Tourah AJ, Connors JM, Ng RT, Gascoyne RD, Horsman DE. Genome-wide profiling of follicular lymphoma by array comparative genomic hybridization reveals prognostically significant DNA copy number imbalances. Blood. 2008 Aug 14.
(* - Contributed equally)

S P Shah, W L Lam, R T Ng, K P Murphy. Modeling recurrent DNA copy number alterations in array CGH data Bioinformatics 2007 Jul 1;23(13):i450-8 [Software] [PDF] [Talk slides from ISMB 2007]

S Shah, X Xuang, R DeLeeuw, M Khojasteh, W Lam, R Ng, K Murphy Integrating copy number polymorphisms into array CGH analysis using a robust HMM Bioinformatics 2006 Jul 15;22(14):e431-9 [Software] [PDF]

Mehrnoush Khojasteh, Bradley P. Coe, Sohrab Shah, Rabab K. Ward, Wan L. Lam, Calum MacAulay A Novel Algorithm for the Analysis of Array CGH Data ICASSP 2006

Kemmer D, Huang Y, Shah SP, Lim J, Brumm J, Yuen MM, Ling J, Xu T, Wasserman WW, Ouellette BF Ulysses - an application for the projection of molecular interactions across species. Genome Biol. 2005; 6(12): R106 [PDF][Web server]

Shah SP. Detecting common secondary structure elements in RNA sequences. MSc Thesis. May 2005 [PDF]

Shah SP, Huang Y, Xu T, Yuen MM, Ling J, Ouellette BF. Atlas - a data warehouse for integrative bioinformatics. BMC Bioinformatics. 2005 Feb 21;6(1):34 View Abstract [Software]

Shah SP, He DY, Sawkins JN, Druce JC, Quon G, Lett D, Zheng GX, Xu T, Ouellette BF. Pegasys: software for executing and integrating analyses of biological sequences. BMC Bioinformatics 5(1):40. (2004). View Abstract [Software]

Shah SP, McVicker GP, Mackworth AK, Rogic S, Ouellette BF. GeneComber: combining outputs of gene prediction programs for improved results. Bioinformatics 19(10):1296-7. (2003). View Abstract

Conference Presentations

Shah, SP et al. Integrating copy number polymorphisms into array CGH analysis using a robust HMM. ISMB 2006, Fortaleza, Brazil. (invited presentation).

Shah, SP. The Pegasys workflow management system for high-throughput sequence analysis. NETTAB 2005, Naples, Italy. (invited presentation).

Shah, SP et al. Pegasys: a Parallel Genome Annotation System. Genome Informatics 2003, Cold Spring Harbor Laboratory, USA. (platform presentation)

Software

CNA-HMMer Matlab toolbox for detecting copy number alterations in array CGH data

This is a comprehensive software package written in Matlab. It detects copy number alterations in a single sample of array CGH data using a robust HMM, and detects recurrent alterations across a set of array CGH samples using a hierarchical HMM. For more information and to download the software, click here.

Atlas Integrated Database System

The Atlas Integrated Database Project is an effort to integrate multiple forms of biological, publication, and ontological data under one query space for data mining. At its core sit relational databases whose tables model the structure of the data they store. When the project is complete there will be databases for Genbank, Gene Ontology, Taxonomy, Medline, Molecular Interaction (BIND/MINT/DIP/HPRD) and Gene expression that all sit under one query space. These databases, along with the accompanying data mining tools, provide a platform to support research activities at the UBC Bioinformatics Centre.

The Atlas software and data are freely available under the GNU General Public License. You can access them from the Atlas webpage: http://bioinformatics.ubc.ca/atlas

Pegasys: Workflow Management for Biological Sequence Analysis

The Pegasys system is designed to allow biologists to dynamically create workflows for sequence analysis, in particular for genome annotation. This system is currently being used to annotate the genome of a serotype of Cryptococcus neoformans, a fungal pathogen that causes meningitis in humans. For a detailed description of the Pegasys project, or to download the software, please visit: http://bioinformatics.ubc.ca/pegasys.

Teaching

IHHS 302

I'm the instructor of a new multi-faculty, multi-department course in Health Informatics. The course is offered at UBC Okanagan Aug 21-25, 2006 (not for credit) and at UBC Vancouver Aug 28-Sept 1, 2006 (3 credits). This initiative is funded by the UBC Teaching and Learning Enhancement Fund (TLEF). The course outline is available here in PDF.

Canadian Bioinformatics Workshops Series

I'm an instructor for the Canadian Bioinformatics Workshops where I teach methods for sequence analysis and programming in Perl and Java. I've been involved in the CBW since 2001.

Previous work

Prior to my PhD work, I was involved in several research projects. Most recently, I completed my MSc degree with Dr. Anne Condon, during which I developed an algorithm for finding conserved secondary structures in a set of unaligned RNA sequences. I used probabilistic models based on stochastic context-free grammars, combined with iterative refinement by expectation maximisation, to detect conserved motifs. I'm currently preparing a manuscript on this work and will submit it for publication in the near future.

From 2000 to 2004, I worked in the lab of Francis Ouellette at the UBC Bioinformatics Centre, where I developed a number of software systems for bioinformatics analysis (see Software).

Links

Links - UBC

Links - Bioinformatics Resources

Links - Open Access Journals

Links - Statistical Computing

Contact Info

Sohrab Shah: sshah [at] cs [dot] ubc [dot] ca