About Me
I am a research scientist in the Data Sciences Platform at the Broad Institute. I run the Long Read Methods and Applications group, which develops novel tools and applications for single-molecule sequencing instruments.
Prior to this, I was a DPhil student (2011-2016) and post-doc (2016-2018) at the University of Oxford under the guidance of Prof. Gil McVean. I worked on combining long-read and short-read data to improve de novo assembly. I applied these methods to de novo mutation discovery in malaria experimental crosses.
Software
Longbow is a processing toolkit for Multiplexed ArrayS sequencing (MAS-seq) long-read data. It is primarily used to annotate adapter and transcript boundaries within multiplexed long reads (termed “arrays”), segment annotated reads, filter out malformed arrays, and prepare output for use with downstream tools.
A set of cloud-based workflows for processing (mostly human) whole-genome and transcriptome long-read data. Includes GPU-enabled basecalling, error correction, alignment, variant calling, de novo assembly, and more. Under continuous development.
Corticall uses long-read data from parents to improve de novo assemblies of short-read data from children, and applies a recombination-aware alignment model to discover point, indel, and structural variants. Corticall thus enables discovery of mutations in regions of the genome typically inaccessible to short reads alone. It is part of the CortexJDK package, which provides a Java API for accessing and manipulating Cortex/McCortex multi-color linked de Bruijn graphs.
Selected publications
A.M. Al'Khafaji, J.T. Smith, K.V. Garimella, M. Babadi, M. Sade-Feldman, M. Gatzen, S. Sarkizova, M.A. Schwartz, V. Popic, E.M. Blaum, A. Day, M. Costello, T. Bowers, S. Gabriel, E. Banks, A.A. Philippakis, G.M. Boland, P.C. Blainey, N. Hacohen, “High-throughput RNA isoform sequencing using programmable cDNA concatenation”, bioRxiv 2021.10.01.462818; doi: https://doi.org/10.1101/2021.10.01.462818
K. V. Garimella, Z. Iqbal, M. A. Krause, S. Campino, M. Kekre, E. Drury, D. Kwiatkowski, J. M. Sá, T. E. Wellems, and G. McVean, “Detection of simple and complex de novo mutations with multiple reference sequences,” Genome Research, vol. 30, no. 8, pp. 1154–1169, Aug. 2020.
I. Turner, K. V. Garimella, Z. Iqbal, and G. McVean, “Integrating long-range connectivity information into de Bruijn graphs,” Bioinformatics, vol. 34, no. 15, pp. 2556–2565, Aug. 2018.
M. Cretu-Stancu, K. V. Garimella, M. Fromer, Genome of the Netherlands consortium, K. E. Samocha, B. M. Neale, M. J. Daly, E. Banks, M. A. DePristo, P. I. de Bakker, M. A. Swertz, L. C. Francioli, W. P. Kloosterman, C. M. van Duijn, and D. I. Boomsma, “A framework for the detection of de novo mutations in family-based sequencing data,” Eur. J. Hum. Genet., vol. 25, no. 2, pp. 227–233, Feb. 2017.
A. Kiezun, K. Garimella, R. Do, N. O. Stitziel, B. M. Neale, P. J. McLaren, N. Gupta, P. Sklar, P. F. Sullivan, J. L. Moran, C. M. Hultman, P. Lichtenstein, P. Magnusson, T. Lehner, Y. Y. Shugart, A. L. Price, P. I. W. de Bakker, S. M. Purcell, and S. R. Sunyaev, “Exome sequencing and the genetic basis of complex traits,” Nat Genet, vol. 44, no. 6, pp. 623–630, Jun. 2012.
T. J. Dixon-Salazar, J. L. Silhavy, N. Udpa, J. Schroth, S. Bielas, A. E. Schaffer, J. Olvera, V. Bafna, M. S. Zaki, G. H. Abdel-Salam, L. A. Mansour, L. Selim, S. Abdel-Hadi, N. Marzouki, T. Ben-Omran, N. A. Al-Saana, F. M. Sonmez, F. Celep, M. Azam, K. J. Hill, A. Collazo, A. G. Fenstermaker, G. Novarino, N. Akizu, K. V. Garimella, C. Sougnez, C. Russ, S. B. Gabriel, and J. G. Gleeson, “Exome sequencing can improve diagnosis and alter patient management,” Science Translational Medicine, vol. 4, no. 138, p. 138ra78, Jun. 2012.
M. A. DePristo, E. Banks, R. Poplin, K. V. Garimella, J. R. Maguire, C. Hartl, A. A. Philippakis, G. del Angel, M. A. Rivas, M. Hanna, A. McKenna, T. J. Fennell, A. M. Kernytsky, A. Y. Sivachenko, K. Cibulskis, S. B. Gabriel, D. Altshuler, and M. J. Daly, “A framework for variation discovery and genotyping using next-generation DNA sequencing data,” Nat Genet, vol. 43, no. 5, pp. 491–498, May 2011.
A Little More About Me
Alongside my interests in extreme genomic diversity, graphical models, and software engineering, some of my other interests are:
- Banjo
- Sculling
- Making the perfect iced latte
- That this exists