Advanced DNA Sequencing for Uncovering Novel Inheritable Carcinogenic Mutations

ASPIRE Award (2019-2020)

Michael Schatz, PhD, Johns Hopkins University (Principal); Eliezer Van Allen, MD, Dana-Farber Cancer Institute

For individuals with a family history of cancer, early detection of shared mutations in genes known to drive cancer is critical for disease prevention and early diagnosis. However, many cancers that appear at increased rates within families have no known genetic explanation. This may be because the heritability of many cancers is due to a class of mutations called structural variations (SVs), in which longer stretches of DNA are added, missing, or rearranged within a gene. Cancer-causing SVs are not easily detected by standard genetic sequencing methods, which only read and assemble short stretches of DNA, so the role of SVs in cancer heritability is not well understood. Identifying heritable SVs that lead to cancer will immediately allow for improved cancer screening and diagnostics within families with inexplicably high rates of cancer. It may also provide general insight into the heritability of many cancers. Researchers in the Schatz and Van Allen labs have teamed up to develop and use AI-assisted genetic sequencing technology that can read and assemble long stretches of DNA and detect SVs with accuracy unparalleled by other sequencing technologies. The team has already recruited several families with high rates of cancers that have no known genetic causes and plans to keep adding more to broaden the data they collect. They will analyze both tumor and healthy tissue samples from members of this cohort to identify SVs that are more likely to occur in those with cancer. They expect this research will clarify the role of many SVs in heritable cancers. This would not only increase the diagnostic power of genetic sequencing but may also help researchers discover new drug targets and uncover novel disease pathways.


Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, Vollger MR, Altemose N, Uralsky L, Gershman A, Aganezov S, Hoyt SJ, Diekhans M, Logsdon GA, Alonge M, Antonarakis SE, Borchers M, Bouffard GG, Brooks SY, Caldas GV, Chen NC, Cheng H, Chin CS, Chow W, de Lima LG, Dishuck PC, Durbin R, Dvorkina T, Fiddes IT, Formenti G, Fulton RS, Fungtammasan A, Garrison E, Grady PGS, Graves-Lindsay TA, Hall IM, Hansen NF, Hartley GA, Haukness M, Howe K, Hunkapiller MW, Jain C, Jain M, Jarvis ED, Kerpedjiev P, Kirsche M, Kolmogorov M, Korlach J, Kremitzki M, Li H, Maduro VV, Marschall T, McCartney AM, McDaniel J, Miller DE, Mullikin JC, Myers EW, Olson ND, Paten B, Peluso P, Pevzner PA, Porubsky D, Potapova T, Rogaev EI, Rosenfeld JA, Salzberg SL, Schneider VA, Sedlazeck FJ, Shafin K, Shew CJ, Shumate A, Sims Y, Smit AFA, Soto DC, Sović I, Storer JM, Streets A, Sullivan BA, Thibaud-Nissen F, Torrance J, Wagner J, Walenz BP, Wenger A, Wood JMD, Xiao C, Yan SM, Young AC, Zarate S, Surti U, McCoy RC, Dennis MY, Alexandrov IA, Gerton JL, O’Neill RJ, Timp W, Zook JM, Schatz MC, Eichler EE, Miga KH, Phillippy AM. The complete sequence of a human genome. Science. 2022.

Aganezov S, Yan SM, Soto DC, Kirsche M, Zarate S, Avdeyev P, Taylor DJ, Shafin K, Shumate A, Xiao C, Wagner J, McDaniel J, Olson ND, Sauria MEG, Vollger MR, Rhie A, Meredith M, Martin S, Lee J, Koren S, Rosenfeld JA, Paten B, Layer R, Chin CS, Sedlazeck FJ, Hansen NF, Miller DE, Phillippy AM, Miga KH, McCoy RC, Dennis MY, Zook JM, Schatz MC. A complete reference genome improves analysis of human genetic variation. Science. 2022.

Altemose N, Logsdon GA, Bzikadze AV, Sidhwani P, Langley SA, Caldas GV, Hoyt SJ, Uralsky L, Ryabov FD, Shew CJ, Sauria MEG, Borchers M, Gershman A, Mikheenko A, Shepelev VA, Dvorkina T, Kunyavskaya O, Vollger MR, Rhie A, McCartney AM, Asri M, Lorig-Roach R, Shafin K, Lucas JK, Aganezov S, Olson D, de Lima LG, Potapova T, Hartley GA, Haukness M, Kerpedjiev P, Gusev F, Tigyi K, Brooks S, Young A, Nurk S, Koren S, Salama SR, Paten B, Rogaev EI, Streets A, Karpen GH, Dernburg AF, Sullivan BA, Straight AF, Wheeler TJ, Gerton JL, Eichler EE, Phillippy AM, Timp W, Dennis MY, O’Neill RJ, Zook JM, Schatz MC, Pevzner PA, Diekhans M, Langley CH, Alexandrov IA, Miga KH. Complete genomic and epigenetic maps of human centromeres. Science. 2022.

Hoyt SJ, Storer JM, Hartley GA, Grady PGS, Gershman A, de Lima LG, Limouse C, Halabian R, Wojenski L, Rodriguez M, Altemose N, Rhie A, Core LJ, Gerton JL, Makalowski W, Olson D, Rosen J, Smit AFA, Straight AF, Vollger MR, Wheeler TJ, Schatz MC, Eichler EE, Phillippy AM, Timp W, Miga KH, O’Neill RJ. From telomere to telomere: The transcriptional and epigenetic state of human repeat elements. Science. 2022.

Das A, Schatz MC. Sketching and sampling approaches for fast and accurate long read classification. BMC Bioinformatics. 2022.

Kirsche M, Prabhu G, Sherman R, Ni B, Battle A, Aganezov S, Schatz MC. Jasmine and Iris: population-scale structural variant comparison and analysis. Nat Methods. 2023.