Clinical DNA Variant Interpretation: Theory and Practice
About this ebook
- Compiles best practices, methods and sound evidence for DNA variant classification in one applied volume
- Features chapter contributions from international leaders in the field
- Includes practical examples of variant classification for common and rare disorders, and across clinical phenotypes
Clinical DNA Variant Interpretation - Conxi Lázaro
Clinical DNA Variant Interpretation
Theory and Practice
Editors
Conxi Lázaro
Jordan Lerner-Ellis
Amanda Spurdle
Table of Contents
Cover image
Title page
Copyright
Dedication
Contributors
Foreword: the challenge of variant interpretation
About the editors
Chapter 1. Introduction: the challenge of genomic DNA interpretation
Section I. Theoretical chapters
Chapter 2. General considerations: terminology and standards
Introduction
Genetic variation
Standards on describing genetic variation
Variant classification
Standards on reporting disorders and phenotypes
Challenges and considerations
Conclusions
Chapter 3. International consensus guidelines for constitutional sequence variant interpretation
Historical variant interpretation approaches
Current variant classification practices: the 2015 ACMG/AMP guideline for sequence variant interpretation
Ongoing and future adaptations of the ACMG/AMP guidelines
Summary
Chapter 4. Quantitative modeling: multifactorial integration of data
Overview of quantitative modeling for variant interpretation
Derivation of likelihood ratios
Components of quantitative models
Caveats and considerations
Chapter 5. Clinical and genetic evidence and population evidence
Introduction
Population allele frequency
Molecular pathology
Mosaicism
Conclusion
Chapter 6. The computational approach to variant interpretation: principles, results, and applicability
Pathogenicity predictors for amino acid sequence variants
Computational predictors for variants affecting splicing
Chapter 7. Functional evidence (I) transcripts and RNA-splicing outline
Introduction
Splicing, alternative splicing events, and splicing isoforms: the splicing profile
Reference transcript
Spliceogenic variants overlap cis-acting determinants of alternative splicing: short sequence motifs and long-range sequence features
Trans-acting and epigenetic determinants of alternative splicing
Roles of alternative splicing
Alternative splicing profile is dynamic
Spliceogenic variants: alternative splicing informs on the prior probability of being pathogenic
Splicing analyses: determining the spliceogenic impact of a genetic variant
Conclusion
Chapter 8. Functional evidence (II) protein and enzyme function
Historical background
The challenge of variants of uncertain significance
Assessment of variant pathogenicity
Prediction of variant effects: in silico tools
Functional assays
Validation and calibration
Example: BRCA1 and BRCA2
Example: DNA mismatch repair genes
Example: BLM
Example: RHO
Example: CFTR
High-throughput assays
In vivo assays
Conclusion
Chapter 9. Somatic data usage for classification of germ line variants
Introduction
Data sources
Control database for comparison
Laboratory practices utilizing somatic data
Principles and rationale for utilizing somatic data for classifying germ line variants in cancer predisposition genes
Loss of heterozygosity, determining biallelic inactivation, and cancer hot spots
Copy-neutral LOH
Determining biallelic inactivation
Mutational hot spots
RNA-seq tumor data
Tumor signatures
Germ line risk and variant pathogenicity informed from tumor signatures
Other considerations for integrating germ line and somatic data
Determining pathogenicity of alleles in genes with recessive and dominant phenotypes integrating population, somatic, and germ line data
Recognizing clonal evolution and specific somatic mutations in the context of predisposition
Chapter 10. Pharmacogenetics and personalized medicine
Introduction to pharmacogenetics and personalized medicine
Variant nomenclature in pharmacogenetics
Technologies for pharmacogenetic testing
Databases/resources for pharmacogenetics
Clinical guidelines and decision support tools in pharmacogenetics
Pharmacogenetics examples in clinical practice
Implementation of pharmacogenetic testing in clinical practice
Future perspectives of personalized medicine
Chapter 11. Data sharing and gene variant databases
Introduction
General databases
Focused databases
Final considerations
Chapter 12. Approaches to the comprehensive interpretation of genome-scale sequencing
Clinical applications of GS
Research applications of GS
Analysis of GS results for various applications
Criteria used for returning results of GS
Conclusion
Chapter 13. Phenotype evaluation and clinical context: application of case-level data in genomic variant interpretation
Introduction
Application of clinical and phenotypic information to variant interpretation and classification
Management of the patient based on the genomic data
Conclusions
Section II. Practical chapters
Chapter 14. Inherited cardiomyopathies
Introduction
Inherited heart diseases
Summary
Chapter 15. Phenylketonuria
Introduction
History of phenylketonuria
Clinical features
Evolution of genotyping
Practical genotype–phenotype correlation
Chapter 16. Hearing loss
Introduction
Disease sections: practical examples that highlight the main challenges of the molecular diagnosis of hearing loss
Conclusions
Chapter 17. Familial hypercholesterolemia
Variant interpretation in FH
Laboratory genetic testing for FH
Cases presentations
Main final conclusion
Chapter 18. Classification of genetic variants in hereditary cancer genes
Introduction
BRCA2 c.9976A>T p.(Lys3326Ter)
BRCA2 c.9117G>A
ATM c.9007_9034del
MLH1 c.2041G>A
Chapter 19. RASopathies
Introduction
Classification of variants associated with a RASopathy
General evidence criteria
Gene-specific evidence criteria
Case-level evidence criteria
Case examples
Summary
Chapter 20. Summary and conclusions
Future directions
Index
Copyright
Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
Copyright © 2021 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
ISBN: 978-0-12-820519-8
For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals
Publisher: Andre Gerhard Wolff
Acquisitions Editor: Peter B. Linsley
Editorial Project Manager: Kristi L. Anderson
Production Project Manager: Stalin Viswanathan
Cover Designer: Matt Limbert
Typeset by TNQ Technologies
Dedication
We dedicate this book to all of our colleagues in the field of variant interpretation whose perseverance and dedication have provided essential scientific knowledge to inform methods for improved interpretation of DNA variants, and thereby the use of genetic data in the context of the diagnosis of hereditary disorders, and predictive and personalized medicine.
The editors would like to thank the Elsevier team for offering us the possibility of writing this book. A special recognition to Peter B. Linsley for his invitation and to Kristi L. Anderson for her technical assistance.
Conxi Lázaro
I wish to dedicate this work to my mentors in the field of human and cancer genetics (Drs. Xavier Estivill, Virginia Nunes, and Gabriel Capellá) because they have been an inspiration to me throughout my scientific and professional career. I also dedicate it to all former and present members of my team for being a constant inspiration and for their work and enthusiasm. This book was planned and designed during my sabbatical year in Toronto. I would like to thank ICO-IDIBELL, my home Institutes in Barcelona; Mount Sinai Hospital, Sinai Health, and Women's College Hospital in Toronto, my host Institutes, as well as the Spanish Government of Health and Education for making this amazing sabbatical year possible.
Finally, I would like to thank all the public and private agencies from which we have obtained funding as well as the patients' associations who always encourage and support our research and make us aware of their needs and concerns.
Jordan Lerner-Ellis
I would like to dedicate this work to the broader community of passionate clinicians and researchers who have devoted their time to the interpretation of DNA variation. I thank my numerous mentors for their guidance and inspiration, which have directed my interests in the field of human genetics. To my colleagues, with whom I share countless hours working on clinical variant interpretation and research studies. Finally, to Conxi Lázaro, who has been the driving force behind this book!
Amanda Spurdle
I dedicate this work to my mentor David Goldgar, who continues to inspire and question all I do in relation to variant interpretation methodology and implementation. The latter has directed my evidence-based approach to variant interpretation methods. I thank my many colleagues for their ongoing input and discussion, and thereby influences on my research on interpretation of genetic variants, and on improving approaches to disseminate such information for clinical benefit.
Contributors
Ana Catarina Alves
Cardiovascular Research Group, R&D Unit, Dept of Health Promotion and Prevention of Non-Communicable Diseases, National Institute of Health, Portugal
BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Portugal
Christina Anne Austin-Tse, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, United States
Mafalda Bourbon
Cardiovascular Research Group, R&D Unit, Dept of Health Promotion and Prevention of Non-Communicable Diseases, National Institute of Health, Portugal
BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Portugal
Marcelo A. Carvalho
Divisão de Pesquisa Clínica, Instituto Nacional de Câncer, Rio de Janeiro, Brazil
Instituto Federal do Rio de Janeiro - IFRJ, Rio de Janeiro, Brazil
Ozge Ceyhan-Birsoy, Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, United States
George S. Charames
Pathology and Lab Medicine, Mount Sinai Hospital, Toronto, Ontario, Canada
Lab Medicine and Pathobiology, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada
Joana Rita Chora
Cardiovascular Research Group, R&D Unit, Dept of Health Promotion and Prevention of Non-Communicable Diseases, National Institute of Health, Portugal
BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Portugal
Mara Colombo, Unit of Molecular Bases of Genetic Risk and Genetic Testing, Department of Research Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, Italy
Xavier de la Cruz
Research Unit in Clinical and Translational Bioinformatics, Vall d’Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain
Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
Johan T. den Dunnen
Department of Human Genetics, Leiden University Medical Center, Leiden, South Holland, the Netherlands
Department of Clinical Genetics, Leiden University Medical Center, Leiden, South Holland, the Netherlands
Niels de Wind, Leiden University Medical Center, Leiden, the Netherlands
Orland Diez
Hereditary Cancer Genetics Group, Vall d’Hebron Institute of Oncology (VHIO), Vall d’Hebron Barcelona Hospital Campus, Barcelona, Spain
Area of Clinical and Molecular Genetics, Hospital Universitari Vall d’Hebron, Vall d’Hebron Barcelona Hospital Campus, Barcelona, Spain
Anna B.R. Elias, Divisão de Pesquisa Clínica, Instituto Nacional de Câncer, Rio de Janeiro, Brazil
D Gareth Evans
Clinical Genetics Service, Manchester Centre for Genomic Medicine, Manchester University Hospitals NHS Foundation Trust, Manchester, United Kingdom
Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
Lidia Feliubadaló, Molecular Diagnostics Unit, Hereditary Cancer Program, Catalan Institute of Oncology (ICO), Institut d’Investigació Biomèdica de Bellvitge (IDIBELL), ONCOBELL Program, Barcelona, Spain
Vanessa C. Fernandes, Divisão de Pesquisa Clínica, Instituto Nacional de Câncer, Rio de Janeiro, Brazil
Ivo F.A.C. Fokkema, Department of Human Genetics, Leiden University Medical Center, Leiden, South Holland, the Netherlands
Cristina Fortuno, Genetics and Computational Division, QIMR Berghofer Medical Research Institute, Herston, QLD, Australia
Alice Garrett
Division of Genetics and Epidemiology at the Institute of Cancer Research, London, United Kingdom
Cancer Genetics Unit at the Royal Marsden Hospital, London, United Kingdom
Paolo Gasparini
Medical Genetics Unit, Institute for Maternal and Child Health – IRCCS, Burlo Garofolo, Trieste, Italy
Department of Medicine, Surgery and Health Sciences, University of Trieste, Trieste, Italy
Giorgia Girotto
Medical Genetics Unit, Institute for Maternal and Child Health – IRCCS, Burlo Garofolo, Trieste, Italy
Department of Medicine, Surgery and Health Sciences, University of Trieste, Trieste, Italy
Anna González-Neira, Human Genotyping Unit–Spanish National Genotyping Centre(CEGEN), Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
Karen W. Gripp, Division of Medical Genetics, A. I. duPont Hospital for Children, Wilmington, DE, United States
Sara Gutiérrez-Enríquez, Hereditary Cancer Genetics Group, Vall d’Hebron Institute of Oncology (VHIO), Vall d’Hebron Barcelona Hospital Campus, Barcelona, Spain
Steven M. Harrison, Broad Institute of MIT and Harvard, Cambridge, MA, United States
Miguel de la Hoya, Molecular Oncology Laboratory, Oncology Department, Instituto de Investigación Sanitaria San Carlos, Hospital Clínico San Carlos, Madrid, Spain
Jodie Ingles
Cardio Genomics Program at Centenary Institute, The University of Sydney, Sydney, NSW, Australia
Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
Department of Cardiology, Royal Prince Alfred Hospital, Sydney, NSW, Australia
Renee Johnson, Victor Chang Cardiac Research Institute, Sydney, NSW, Australia
Jordan Lerner-Ellis
Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
Pathology and Laboratory Medicine, Mount Sinai Hospital, Toronto, Ontario, Canada
Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada
Harvey Levy
Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, United States
Harvard Medical School, Boston, MA, United States
Conxi Lázaro
Molecular Diagnostic Laboratory, Hereditary Cancer Program, Institut Catalá d'Oncologia (ICO-IDIBELL-ONCOBELL-CIBERONC), Barcelona, Spain
Institut d'Investigació Biomèdica de Bellvitge, Barcelona, Spain
Heather Mason-Suares, Partners Healthcare, Laboratory for Molecular Medicine, Cambridge, MA, United States
Ana Margarida Medeiros
Cardiovascular Research Group, R&D Unit, Dept of Health Promotion and Prevention of Non-Communicable Diseases, National Institute of Health, Portugal
BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, Portugal
Jessica L. Mester, GeneDx, Gaithersburg, MD, United States
Alejandro Moles-Fernández, Hereditary Cancer Genetics Group, Vall d’Hebron Institute of Oncology (VHIO), Vall d’Hebron Barcelona Hospital Campus, Barcelona, Spain
Alvaro N.A. Monteiro, Cancer Epidemiology Program, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, United States
Anna Morgan, Medical Genetics Unit, Institute for Maternal and Child Health – IRCCS, Burlo Garofolo, Trieste, Italy
Thales C. Nepomuceno, Divisão de Pesquisa Clínica, Instituto Nacional de Câncer, Rio de Janeiro, Brazil
Rocío Núñez-Torres, Human Genotyping Unit–Spanish National Genotyping Centre(CEGEN), Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
Selen Özkan, Research Unit in Clinical and Translational Bioinformatics, Vall d’Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain
Natàlia Padilla, Research Unit in Clinical and Translational Bioinformatics, Vall d’Hebron Institute of Research (VHIR), Universitat Autònoma de Barcelona, Barcelona, Spain
Michael T. Parsons, Genetics & Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
Tina F. Pesaran, Ambry Genetics, Aliso Viejo, CA, United States
Marta Pineda, Molecular Diagnostics Unit, Hereditary Cancer Program, Catalan Institute of Oncology (ICO), Institut d’Investigació Biomèdica de Bellvitge (IDIBELL), ONCOBELL Program, Barcelona, Spain
Paolo Radice, Unit of Molecular Bases of Genetic Risk and Genetic Testing, Department of Research Fondazione IRCCS Istituto Nazionale Dei Tumori, Milano, Italy
Farrah Rajabi
Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, United States
Harvard Medical School, Boston, MA, United States
Ebony Richardson, Cardio Genomics Program at Centenary Institute, The University of Sydney, Sydney, NSW, Australia
Peter Sabatini
Department of Clinical Laboratory Genetics, University Health Network, Toronto, Ontario, Canada
Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
Stephanie Sacharow
Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA, United States
Harvard Medical School, Boston, MA, United States
Amanda Spurdle, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
Bryony A. Thompson, Department of Pathology, Royal Melbourne Hospital, Department of Clinical Pathology, University of Melbourne, Parkville, VIC, Australia
Emma Tudini, Genetics & Computational Biology Department, QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
Clare Turnbull
Division of Genetics and Epidemiology at the Institute of Cancer Research, London, United Kingdom
Cancer Genetics Unit at the Royal Marsden Hospital, London, United Kingdom
Lisa M. Vincent, Division of Pathology & Laboratory Medicine, Children’s National Health System, Washington, DC, United States
Michael F. Walsh, Memorial Sloan Kettering Cancer Center, New York, NY, United States
Nicholas Watkins
Pathology and Lab Medicine, Mount Sinai Hospital, Toronto, Ontario, Canada
Hereditary Kidney Disease Clinic, Department of Nephrology, Princess Margaret Hospital, University Health Network
Molecular Genetics, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
Foreword: the challenge of variant interpretation
This textbook represents a highly up-to-date resource for clinicians and molecular scientists on variant interpretation. The book will also be a great starting point for a broad audience, from graduate and master's students to the interested general public, who want or need to learn more about the interpretation of DNA variants and how they can be classified as disease associated or not. Until the last 5 years (and even occasionally now!), variants were often reported inaccurately as being pathogenic in published manuscripts based on fairly flimsy evidence, ranging from in silico analysis to splicing prediction tools that have a substantial inaccuracy rate. This is not a trivial matter. Wrongly classifying a variant as pathogenic or likely pathogenic can result in drastic action by those who use the test to predict disease. For example, an apparent splicing variant in the breast/ovarian cancer predisposition gene BRCA1 was wrongly reported to be pathogenic based on it being at the (-2) position in the canonical splicing region. This led to women carrying the variant having risk-reducing mastectomies they did not require, as carrying the variant did not put them at high risk. Although on the surface variant interpretation can seem very complicated, because it draws on a range of different information sources, the book gives a clear guide across human disease, particularly for inherited constitutional disorders. The importance of population frequencies made available by resources like gnomAD to case–control studies is a lesson for all. A comprehensive book like this one is a great resource. Ranging from how to use computational and functional evidence to the interpretation of somatic data, particularly for cancer predisposition, it covers all the bases. It highlights the importance of clinical phenotypes and how finding a rare variant in a number of families with the same rare monogenic syndromic disorder is a useful tool. It also highlights the importance of studying RNA to interpret potential splicing variants and the potential disruption of the resultant protein product. It even provides an important chapter on pharmacogenetics and personalized medicine, an area that is vital in saving lives and preventing disease. Having built the framework for variant interpretation, the book then provides chapters on how specific examples of variants can be classified in different disease areas. These range from heart disease to inborn errors of metabolism (phenylketonuria), hearing loss disorders, hypercholesterolemia, and cancer predisposition, finishing with the RASopathies. Overall, I would strongly recommend this book for those who have even slightly more than a passing acquaintance with DNA and how its variation can be important!
D. Gareth Evans ¹ , ² , ¹ Clinical Genetics Service, Manchester Centre for Genomic Medicine, Manchester University Hospitals NHS Foundation Trust, Manchester, United Kingdom, ² Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
About the editors
Conxi Lázaro, PhD, Head of the Molecular Diagnostic Service, Hereditary Cancer Program, Catalan Institute of Oncology. Program in Molecular Mechanisms and Experimental Therapy in Oncology (Oncobell), IDIBELL. Centro de Investigación Biomédica en Red de Cáncer (CIBERONC). Hospitalet de Llobregat, Barcelona, Spain; Institut d'Investigació Biomèdica de Bellvitge, Barcelona, Spain.
Dr. Lázaro is a molecular geneticist with more than 25 years of experience in the field of human genetics. She did her PhD in Human Genetics at the University of Barcelona. She has worked in several clinical hospitals in Barcelona. She was an invited professor at Massachusetts General Hospital Cancer Center in Boston in 2003/04 and did a sabbatical stay at Mount Sinai Hospital and at Women's College Hospital in Toronto in 2018/19. In the last 10 years, she has been involved in several projects aimed at using Next-Generation Sequencing (NGS) for genetic testing purposes. Her field of expertise is Hereditary Cancer, although she has worked on other genetic disorders. Of particular relevance are her pivotal research on the genetic basis of Neurofibromatosis type 1 (NF1), conducted since the gene was discovered, and her current work on the development of new therapeutic strategies for malignant tumors associated with NF1. She is a member of several reputable international consortia and associations such as CIMBA, ENIGMA, CTF, and GENTURIS, and has been a member of the Scientific Program Committee of the ESHG as well as treasurer of the Spanish association of human genetics (AEGH).
Jordan Lerner-Ellis, PhD, FACMG, Associate Professor, Laboratory Medicine & Pathobiology, University of Toronto; Director & Head of Advanced Molecular Diagnostics, Pathology & Laboratory Medicine, Mount Sinai Hospital, Sinai Health, Toronto, Ontario, Canada.
Dr. Jordan Lerner-Ellis has 20 years of experience in molecular genetics and diagnostics. He is Director & Head of Advanced Molecular Diagnostics in the department of Pathology and Laboratory Medicine at Toronto's Mount Sinai Hospital, Sinai Health System; Associate Professor at the University of Toronto, Laboratory Medicine & Pathobiology; and Clinician Scientist at the Lunenfeld-Tanenbaum Research Institute. His laboratory provides clinical diagnostic services for hereditary breast, ovarian, and colon cancer, and other genetic testing areas, for Toronto and the province of Ontario. Dr. Lerner-Ellis completed his PhD in human genetics at McGill University. He continued his studies at the Children's Hospital in Basel, Switzerland, before moving on to a postdoctoral fellowship in Molecular Biology at Harvard University, the Massachusetts General Hospital, and in Medical and Population Genetics at the Broad Institute. Following his postdoctoral studies, Dr. Lerner-Ellis completed the Clinical Molecular Genetics training program at Harvard Medical School, Brigham and Women's Hospital, and is certified as a diplomate of the American Board of Medical Genetics. Dr. Lerner-Ellis’ core interest is in molecular diagnostics as currently applied to breast and colon cancer. His research is focused on improving genetic testing through greater reliance on new sequencing technologies. A concurrent aim of his research is to integrate genome sequencing into the general practice of medicine. Dr. Lerner-Ellis is active in national and international data sharing and variant interpretation efforts aimed at improving our understanding of the relationship between DNA variants and disease.
Amanda Spurdle, PhD, Associate Professor and Group Leader, Molecular Cancer Epidemiology, QIMR Berghofer Medical Research Institute, Brisbane, Australia.
Dr. Spurdle has more than 20 years of experience in the field of molecular genetic epidemiology of hormone-related cancers. She developed the first model to classify variants in the colorectal–endometrial cancer mismatch repair genes, and led the effort by the InSiGHT consortium (International Society for Gastrointestinal Hereditary Tumours) to standardize the clinical interpretation of mismatch repair gene variants in the InSiGHT database. She co-founded and now leads the ENIGMA international consortium (Evidence-based Network for Interpretation of Germline Mutant Alleles), which aims to develop statistical and laboratory methods to evaluate variants of uncertain clinical significance in known and suspected breast cancer predisposition genes, and which is recognized by the ClinGen consortium as an expert panel for BRCA1/2 variant classification for ClinVar. She coordinates the variant interpretation activities of the BRCA Challenge project initiated by the Global Alliance for Genomics and Health. She also contributes to activities of multiple ClinGen Variant Curation Expert Panels focused on hereditary cancer.
Chapter 1: Introduction
the challenge of genomic DNA interpretation
Jordan Lerner-Ellis ¹,²,³, Amanda Spurdle ⁴, and Conxi Lázaro ⁵,⁶
¹ Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
² Pathology and Laboratory Medicine, Mount Sinai Hospital, Toronto, Ontario, Canada
³ Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada
⁴ QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
⁵ Molecular Diagnostic Laboratory, Hereditary Cancer Program, Institut Català d’Oncologia (ICO-IDIBELL-ONCOBELL-CIBERONC), Barcelona, Spain
⁶ Institut d'Investigació Biomèdica de Bellvitge, Barcelona, Spain
Abstract
These are exciting times in the field of human genetics, and variant interpretation is one of the most important challenges to overcome in order to achieve the ideal of personalized or precision medicine.
The use of NGS technology in different fields of human health and disease is generating an enormous amount of information regarding genetic and genomic variants. Determining the role of these variants is not always straightforward, especially when they are identified in genes for which neither the mechanism of action nor the phenotype associated with a mutation is well defined.
Work in multidisciplinary teams and the development of multifactorial algorithms are paramount to integrate different sources of information that will eventually speed up the process of variant classification. Artificial intelligence and machine learning approaches are starting to play a role in this process.
Therefore, it is an excellent moment to compile variant interpretation practices and approaches in a single book covering foundational aspects, modes of analysis, technology, disease and disorder specific case studies, and clinical integration.
Keywords
Clinical genetics & genomics; Consensus guidelines; Databases; In-silico approaches; Next generation sequencing; Pharmacogenomics; Splicing; Variant interpretation
Medical genetics is a field that has rapidly evolved in the last three decades and includes multiple subspecialties such as clinical genetics, genetic counseling, molecular genetics, cytogenetics, and biochemical genetics. All of these are being applied across multiple different disease specialty areas.
The Human Genome Project was completed in 2003, and new technologies can now sequence the entire genome for less than $1000; this has ushered in a new age of genetic testing that includes panels and exome and genome sequencing. Both common and rare variation exist. Unfortunately, the speed at which we can now sequence the human genome has outpaced our ability to interpret it; this is the big challenge for the future.
The aim of the book is to provide a comprehensive theoretical and practical understanding of how DNA variants are interpreted and classified for clinical applications, with a focus on germline or constitutive disease. It is designed for a broad audience, from graduate and master's students to clinicians, investigators, or industry employees who wish to learn how to carry out variant interpretation, including what approaches are used today and how they are applied. Practical examples are provided by experts in the field to outline considerations for specific disease areas and clinical scenarios for learning.
The Human Genome Project estimated that humans have around 21,000 protein-coding genes. New gene–disease associations continue to be discovered, and as of August 28, 2020, the Online Mendelian Inheritance in Man (OMIM), a comprehensive compendium of human genes and genetic phenotypes, registered a total of 4316 genes with phenotype-causing gene alterations, and these tend to be the ones that are included in diagnostic panels or clinical exomes. However, with close to 3 billion base pairs per genome and over 5 million common and rare DNA variants per individual, understanding how this variation contributes to human phenotypes is still a colossal enterprise.
Different modes of inheritance have been described. They include autosomal dominant or recessive, X-linked dominant or recessive, and mitochondrial inheritance. However, these patterns of transmission can vary by disorder due to other molecular mechanisms such as imprinting (an epigenetic phenomenon that causes genes to be expressed in a parent-of-origin-specific manner) or anticipation (increasing severity of disease from generation to generation, in repeat expansion disorders). A one-to-one correlation to disease does not always exist and some genes can be implicated in multiple different diseases, with variability even within the same family. For example, variants in the LMNA gene cause over 13 different and distinct phenotypes. While much of human genetic variation is thought to be benign, among the known variation that contributes to disease-related phenotypes, disease risk effect sizes can range from low through moderate to high. Disease penetrance or risk of getting a disease, often measured by way of relative risk (the risk in relation to the population baseline risk) using cohort studies (for rare diseases), is one way in which geneticists have categorized disease severity. Notably, penetrance is also closely tied to age-related risk, which can be variable. For common complex diseases such as type 2 diabetes, large case–control studies called genome-wide association studies (GWAS) have been employed to relate disease outcome in individuals to the presence of a common genetic variant. Thousands of risk-associated variants have been documented (GWAS catalog: https://www.ebi.ac.uk/gwas/). In recent years, combining disease-associated variation identified from across the genome has led to the development of polygenic risk scores. However, establishing which genes and which variants to test for in order to make clinical decisions based on clinical utility and health economics is a continuing area of research.
For instance, if a pathogenic variant is identified: Is there a treatment available? Can the disease be managed differently to improve patient outcome? Are there facilities available where such treatment can be obtained? Is the disease of sufficient severity or frequency to warrant a clinical test? Will the result inform family planning? Interpreting genetic information and establishing how best to apply genetic testing in practice remains a formidable challenge.
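The two risk measures mentioned in the preceding paragraphs, relative risk and polygenic risk scores, can be illustrated with a short sketch. The cohort counts, variant names, and per-allele weights below are invented purely for illustration and do not come from any study; the score is written as a simple weighted sum of risk-allele counts, which is an assumption about how such scores are commonly constructed rather than a method described in this book.

```python
import math


def relative_risk(cases_carriers, total_carriers, cases_noncarriers, total_noncarriers):
    """Risk of disease in carriers divided by risk in non-carriers (cohort design)."""
    risk_carriers = cases_carriers / total_carriers
    risk_noncarriers = cases_noncarriers / total_noncarriers
    return risk_carriers / risk_noncarriers


def polygenic_risk_score(genotypes, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2) over the scored variants."""
    return sum(weight * genotypes.get(variant, 0) for variant, weight in weights.items())


# Hypothetical cohort: 30 of 100 carriers affected versus 50 of 1000 non-carriers.
rr = relative_risk(30, 100, 50, 1000)

# Hypothetical per-allele weights, expressed as log odds ratios.
weights = {"variant_A": math.log(1.2), "variant_B": math.log(1.1)}
genotypes = {"variant_A": 2, "variant_B": 1}
prs = polygenic_risk_score(genotypes, weights)

print(f"relative risk: {rr:.2f}")
print(f"polygenic risk score: {prs:.3f}")
```

In this toy cohort carriers have roughly six times the baseline risk. A real analysis would also need confidence intervals, age-specific risks, and correction for how families were ascertained.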
As mentioned above, the ultimate goal of variant assessment is to provide an interpretation of the clinical significance of a variant that results in clear and accurate reporting to the requesting physician. This work must be done thoroughly, as the interpretation will guide decisions and will often determine patient management strategies, such as surgical, chemotherapeutic, or other treatment decisions, and will also be used by patients and their relatives in family planning decisions. The general information needed to properly interpret a variant may include the following: the type of change and location; a summary of the literature and whether the variant has been previously observed with associated phenotypic information; database(s) where the variant is identified and whether it was previously detected by the lab; description of relevant data; number of carrier probands (out of how many tested); presence or absence in healthy control datasets; population frequencies; segregation; co-occurrence with pathogenic variants in the same or other disease-relevant genes; nature of predicted molecular change and consequences; conservation and in silico analyses; functional data if available; conflicting information and reconciliation if possible; and resulting classification. The geneticist carrying out an analysis may add summary sentences stating the major reasons for the variant classification. Such assertions must be reconciled with any applicable patient phenotypes, and additional supporting evidence may be added if variants in a gene have not been previously reported. The job of the medical geneticist is then to take all the information at hand and make clinical correlations, arrange follow-up testing if appropriate, and follow up on treatment plans and referrals to other specialists.
The typical decision trees for classifying variants are extremely complex and some rules do not always apply; for example, silent variants can be pathogenic, and predicted loss-of-function variants can be benign. The understanding of variants in the context of phenotypic consequences adds further complexity, and rules may also change to reflect varying degrees of heterogeneity or penetrance. Variants identified during testing for dominant hereditary cancers may be treated or classified differently than those discovered during tumor testing or testing of a recessive condition. Until recently, much of this work was carried out using data and expertise housed within individual laboratories. However, as laboratories expand their testing menus to include more extensive panels and exome and genome sequencing, variants associated with phenotypes that lie outside areas of disease expertise will be detected with regularity, a situation which has created a need to share data more broadly.
Long-standing differences in how laboratories, researchers, and clinicians have described DNA, RNA, and protein products have led to challenges in communicating or sharing genetic findings more broadly. Thus, the international community has settled on terminology or nomenclature to allow for a common language for describing DNA variation developed by the Human Genome Variation Society (HGVS). Describing variants in terms of both the location in the DNA and the type of variation that occurs as well as how variants are classified in relation to disease has allowed for a better understanding of the DNA code. This is the subject of Chapter 2, relating to terminology and standards.
Consensus guidelines from the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) have been instrumental in creating a framework, now used in multiple countries around the world, for making genomic data more applicable in the identification, treatment, and management of disease (see Chapter 3). The context in which a variant is identified matters: for instance, it may be found during diagnostic evaluation of an individual affected with disease, or during screening of an asymptomatic individual who could be a carrier. Understanding how information is reported or interpreted in the context of case-level evidence is an important consideration when determining what a particular genetic variant might mean for any given individual. The ACMG/AMP guidelines break clinical variant assessment down into distinct qualitative evidence types (e.g., functional, population, or in silico), stratify the evidence into categories, and then combine criteria to arrive at a semiquantitative categorical variant assessment. The ACMG/AMP guidelines have since been shown to fit into a Bayesian framework, which provides a mathematical foundation for these largely qualitative criteria, and this quantitative framework may help to automate variant pathogenicity assessments. This subject is described in Chapter 3 and discussed also in Chapter 4, which describes true quantitative models that have been developed for specific gene–disease associations. Unlike semiquantitative methods, these models calibrate different evidence types against known reference sets of pathogenic variant carriers and noncarrier or benign variant controls. This method has the advantage that it can be tailored to incorporate any evidence type considered relevant for that gene, and analyses of large datasets can be individualized to account for dataset-specific differences, such as in ascertainment criteria for testing.
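To make the idea of a Bayesian reading of categorical evidence more tangible, the following is a minimal sketch of how criteria of different strengths might be combined into a posterior probability of pathogenicity. The prior (0.10), the odds assigned to a very strong criterion (350), the halving of the exponent for each step down in strength, and the probability cut-offs are all assumptions recalled from the published Bayesian adaptation of the guidelines; none of them is stated in this chapter, and Chapter 3 should be consulted for the authoritative values.

```python
PRIOR_PATHOGENICITY = 0.10   # assumed prior probability of pathogenicity
ODDS_VERY_STRONG = 350.0     # assumed odds for a single very strong criterion

# Assumption: each step down in evidence strength takes the square root of the odds.
EXPONENT = {"supporting": 1 / 8, "moderate": 1 / 4, "strong": 1 / 2, "very_strong": 1.0}


def combined_odds(pathogenic, benign):
    """Combine counts of criteria per strength, e.g. {"strong": 1, "moderate": 2}."""
    power = sum(EXPONENT[strength] * n for strength, n in pathogenic.items())
    power -= sum(EXPONENT[strength] * n for strength, n in benign.items())
    return ODDS_VERY_STRONG ** power


def posterior_probability(odds, prior=PRIOR_PATHOGENICITY):
    """Convert combined odds of pathogenicity and a prior into a posterior."""
    return (odds * prior) / ((odds - 1.0) * prior + 1.0)


def classify(posterior):
    """Map a posterior probability onto five tiers using assumed cut-offs."""
    if posterior > 0.99:
        return "Pathogenic"
    if posterior >= 0.90:
        return "Likely pathogenic"
    if posterior < 0.001:
        return "Benign"
    if posterior < 0.10:
        return "Likely benign"
    return "Uncertain significance"


# One very strong plus one strong pathogenic criterion, no benign evidence.
odds = combined_odds({"very_strong": 1, "strong": 1}, {})
print(classify(posterior_probability(odds)))
```

With no evidence at all the combined odds are 1 and the posterior simply equals the prior, which falls in the uncertain tier. Combinations that land very close to a cut-off are sensitive to rounding and should not be read from a sketch like this.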
One commonly used evidence type for variant interpretation is population frequency, a topic covered in Chapters 4 and 5. Large-scale sequencing projects in the last decade have led to a much more detailed view of common variation within human populations. Although the populations represented in these databases still amount to a small sampling of individuals from around the world, with many geographical groups not represented, this approach has allowed us to more quickly use the frequency of DNA variants to help make correlations with disease and to help classify variation as benign or pathogenic.
In silico approaches to interpreting variants tend to use existing information about variants, such as the biophysical properties of the protein (including what is known about its crystal structure), the amino acid change and the accompanying differences in properties, whether substitutions are predicted to result in conformational changes, and whether or not variants occur within conserved or functional domains of the protein. These approaches take into consideration multiple physical lines of evidence. In silico approaches have provided much value in predicting whether or not variants can have a deleterious effect on a protein product. Some limitations exist in that multiple in silico approaches do not always agree with each other and rely on algorithms which may not always truly replicate the environmental context. Bioinformatic prediction tools are also a key factor in delineating variants with potential to alter mRNA splicing profiles. From both the protein and mRNA prediction standpoints, in silico approaches play a key role in providing supportive evidence when classifying variants. This is the subject of Chapter 6. Another important role for such predictions is their use in prioritizing variants for downstream laboratory assays of mRNA and protein function (see below).
Segregation or association studies are also important approaches used in variant interpretation. However, the majority of sequence variants causing highly penetrant severe disorders are rare, which limits the ability to study these variants through these approaches. For this reason, there has been much emphasis placed on developing, and assessing the validity and utility of, assays of variant impact on mRNA transcripts (Chapter 7) and protein function (Chapter 8). In the cancer predisposition field, tumor data (somatic mutations or somatic signatures) can be used when assessing the pathogenicity of germline variants (Chapter 9).
Pharmacogenomic testing is another area of genetic testing that is quickly evolving. The use of pharmacogenomic variant data to determine the metabolism of certain drugs and dose response is still not in routine clinical use, but there are a number of well-studied examples of the application of this information in clinical practice. An array of databases is now available, and professional organizations have come together to develop guidelines and recommendations for applying this information clinically. Some genes, such as CYP2D6, have been implicated in the efficacy or toxicity of many drugs. For CYP2D6, practice guidelines exist for at least 15 widely adopted medications, from codeine to tamoxifen, tricyclic antidepressants, and serotonin reuptake inhibitors. Translating this information into metabolizer phenotype or status based on genotype is critical to enable consistent clinical implementation. Additional complexity exists because multiple genes may interact in the metabolism of a given medication and so the decision trees may vary. The PharmGKB resource, which has incorporated guidelines from the Clinical Pharmacogenetics Implementation Consortium (CPIC) and/or Dutch Pharmacogenetics Working Group (DPWG), continues to provide guidance on defining genotype–phenotype correlations. Current systems to translate this information rely on the star (∗) allele nomenclature system, which defines the haplotypes or variants that make up a particular allele and the resulting phenotype of that allele. The topic of Pharmacogenomics and Personalized Medicine is covered in Chapter 10.
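The translation from star-allele genotype to metabolizer status described above can be sketched as a lookup followed by a threshold rule. The per-allele activity values and the score cut-offs below are hypothetical placeholders, loosely patterned on the activity-score style of translation; they are assumptions for illustration only, they ignore gene duplications and hybrid alleles, and they must not be used in place of current CPIC or DPWG tables.

```python
# Hypothetical activity value per star allele (illustrative assumptions only).
ALLELE_ACTIVITY = {
    "*1": 1.0,    # assumed normal function
    "*2": 1.0,    # assumed normal function
    "*4": 0.0,    # assumed no function
    "*5": 0.0,    # assumed no function (gene deletion)
    "*10": 0.25,  # assumed decreased function
    "*41": 0.5,   # assumed decreased function
}


def activity_score(diplotype):
    """Sum the activity values of the two alleles in a diplotype such as '*1/*4'."""
    alleles = diplotype.split("/")
    if len(alleles) != 2:
        raise ValueError(f"expected two alleles separated by '/', got {diplotype!r}")
    unknown = [allele for allele in alleles if allele not in ALLELE_ACTIVITY]
    if unknown:
        raise KeyError(f"no activity value defined for {unknown}")
    return sum(ALLELE_ACTIVITY[allele] for allele in alleles)


def metabolizer_phenotype(score):
    """Translate an activity score into a phenotype label using assumed cut-offs."""
    if score == 0:
        return "poor metabolizer"
    if score <= 1.0:
        return "intermediate metabolizer"
    if score <= 2.25:
        return "normal metabolizer"
    return "ultrarapid metabolizer"


for diplotype in ("*1/*1", "*1/*4", "*4/*4", "*10/*41"):
    score = activity_score(diplotype)
    print(diplotype, score, metabolizer_phenotype(score))
```

Keeping the allele table separate from the threshold rule means that either can be updated when guidelines change without touching the other.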
Many individual international databases offer a rich store of data on DNA variants. However, most of these resources have been of limited utility for clinical laboratories due to a number of serious shortcomings, including a lack of clinically related information on phenotypic consequences. Moreover, many public-access genetic databases are limited in their scope (e.g., limited to locus-specific data), lack clinically approved interpretations, and/or are hampered by clinical and technical false positives and negatives. The Clinical Genome Resource (ClinGen) is one effort aimed at sharing and evaluating genomic variants and disease associations (http://clinicalgenome.org; http://www.nih.gov/news/health/sep2013/nhgri-25.htm). The ClinGen project includes the ClinVar database (http://www.ncbi.nlm.nih.gov/clinvar), operated by the National Center for Biotechnology Information as their depository of record. To date, over 1.1 million records with interpretations and 842,050 unique variation records have been deposited into ClinVar by 1658 submitters (August 10, 2020). Other organizations such as the Human Variome Project (HVP) are focused on bringing together local variant databases by creating standards and guidelines for genetic interpretations globally. The HVP puts emphasis on the standardization of genetic variant interpretation across different laboratories, but at the international level. Many national projects have taken a grassroots approach by first standardizing variant information at the national level, in order to better facilitate the entry of genetic information into the international community (e.g., the Canadian Open Genetics Repository). The overall goal is to amass existing information spread across multiple sources and combine it in a single common resource or centralized repository so that all the scientific and clinical information contained therein can be shared with all potential users. Challenges include lack of a standardized variant classification system and differences in clinical reporting protocols. By pooling variant information currently stored in individual clinical laboratories, the interpretation of human genetic variants can be made more clinically useful. The topic of data sharing and databases is discussed in Chapter 11.
Lastly, it is important to understand variant interpretation in the context of different disease–gene relationships and how underlying genetic variation presents clinically. The expertise and experience of the molecular geneticist is crucial but it should be merged with the clinical phenotype provided by clinical geneticists in a more holistic approach (Chapters 12 and 13).
The manifestation of genetic disorders is specific to certain genes and diseases, and for this reason, the last part of the book is composed of six chapters dealing with examples in different genetic conditions such as Hereditary Cancer, Inherited Heart Diseases, Phenylketonuria, Hearing loss, Familial hypercholesterolemia, and RASopathies. These examples illustrate the complexity of variant assessment in different disease contexts and the importance of multidisciplinary approaches and teams to better achieve the most accurate variant classification for clinical use (Chapters 14 to 19).
As indicated above, the role of the clinical molecular geneticist or cytogeneticist is to interpret the clinical significance of DNA variation and to communicate this information in clear language to the physician or patient in order to provide guidance in diagnosing or managing a particular disease. The tools in our armamentarium are many and continue to expand with new approaches. Applying these tools to interpret the clinical significance of DNA variation and inform medical decisions is the subject of this book.
Section I
Theoretical chapters
Outline
Chapter 2. General considerations: terminology and standards
Chapter 3. International consensus guidelines for constitutional sequence variant interpretation
Chapter 4. Quantitative modeling: multifactorial integration of data
Chapter 5. Clinical and genetic evidence and population evidence
Chapter 6. The computational approach to variant interpretation: principles, results, and applicability
Chapter 7. Functional evidence (I) transcripts and RNA-splicing outline
Chapter 8. Functional evidence (II) protein and enzyme function
Chapter 9. Somatic data usage for classification of germ line variants
Chapter 10. Pharmacogenetics and personalized medicine
Chapter 11. Data sharing and gene variant databases
Chapter 12. Approaches to the comprehensive interpretation of genome-scale sequencing
Chapter 13. Phenotype evaluation and clinical context: application of case-level data in genomic variant interpretation
Chapter 2: General considerations
terminology and standards
Ivo F.A.C. Fokkema ¹ and Johan T. den Dunnen ¹,²
¹ Department of Human Genetics, Leiden University Medical Center, Leiden, South Holland, the Netherlands
² Department of Clinical Genetics, Leiden University Medical Center, Leiden, South Holland, the Netherlands
Abstract
The human genome, over 3 billion nucleotides in size, collects changes over time that become part of the naturally existing variation in the population. In the case of a genetic disorder, of all of these variants usually only one or two are associated with the presented condition. Finding these variants in such large datasets is truly like searching for a needle in a haystack. This chapter describes the types of genetic variation and their possible consequences as well as various standards and their importance related to describing, interpreting, and reporting genetic variants and phenotypes. Some important points to consider when classifying variants are discussed, as well as general challenges and considerations to keep in mind when performing sequencing analysis.
Keywords
DNA variants; Genetic variation; NGS; Variant classification; Variant consequences; Variant nomenclature
Introduction
The human genome, the collection of our full DNA sequence, is often referred to as the book of life.
Decades of research have brought us releases of the human genome complete enough to allow for various applications, including clinical diagnoses. The latest of such releases, the Genome Reference Consortium’s GRCh38 reference genome build, was completed in December 2013 and consists of over 3 billion letters (nucleotides). In our body, all DNA is present in two copies: one maternal and one paternal copy. Every time a cell in our body divides, both copies need to be duplicated (replicated), a process that is very precise but not without errors. These changes will be copied, passed on to the next generation, and over time the human DNA sequence slowly changes. The speed at which this happens, the mutation rate, is estimated to be around 1.5 nucleotides per year [1] with estimates ranging between 36 and 63 changes (variants) passed on to the next generation [2,3]. Most of these variants do not cause disease but become part of the naturally existing variation in the population.
Nowadays we are able to determine the sequence of a human individual within a few days. Since, especially on a global scale, the natural variation in the human DNA is high, comparing a person’s DNA to a standard reference sequence is not without problems. Compared to the reference, an average human genome contains about 4 million variants, while an average exome analysis (i.e., analysis of all protein coding sequences) returns some 40,000 variants. In the case of a genetic disorder, of these variants usually only one or two are associated with the presented condition. Finding these variants in such large datasets is truly like searching for a needle in a haystack. Being able to see the difference between the vast majority of benign variants and the few disease-causing (pathogenic) variants requires a good understanding of the different types of variants and the possible consequences these variants have on the function of the genes they affect.
This chapter describes the types of genetic variation and their possible consequences as well as various standards and their importance related to describing, interpreting, and reporting genetic variants and phenotypes. Some important points to consider when classifying variants are discussed, as well as general challenges and considerations to keep in mind when performing sequencing analysis.
Genetic variation
Types of DNA sequence changes
DNA variants can be characterized by the type of variation that occurs on the DNA level as well as their consequences on either RNA or protein level. To prevent those consequences from getting mixed, it is best to strictly separate and report each level individually (DNA, RNA, and protein). As variant screening is mostly based on DNA analysis, variants detected are primarily described on the DNA level. In addition, the (predicted) consequences on the RNA and protein level can be given.
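The separation of levels described above can be sketched as a small record type that stores the DNA, RNA, and protein descriptions in separate fields and marks any consequence that was only predicted. The variant shown is a made-up example, and the convention of wrapping predicted consequences in parentheses reflects a common reading of the HGVS recommendations; treat both as assumptions rather than as a definitive implementation of the nomenclature.

```python
from dataclasses import dataclass
from typing import List, Optional


def as_predicted(description: str) -> str:
    """Wrap the part after the level prefix in parentheses, e.g. p.X -> p.(X)."""
    prefix, change = description.split(".", 1)
    return f"{prefix}.({change})"


@dataclass(frozen=True)
class VariantDescription:
    dna: str                        # change observed at the DNA level
    rna: Optional[str] = None       # consequence at the RNA level, if described
    protein: Optional[str] = None   # consequence at the protein level, if described
    rna_observed: bool = False      # True only if RNA was experimentally analysed
    protein_observed: bool = False  # True only if protein was experimentally analysed

    def report(self) -> List[str]:
        """Return one line per level, never mixing levels in a single description."""
        lines = [f"DNA: {self.dna}"]
        if self.rna is not None:
            rna = self.rna if self.rna_observed else as_predicted(self.rna)
            lines.append(f"RNA: {rna}")
        if self.protein is not None:
            protein = self.protein if self.protein_observed else as_predicted(self.protein)
            lines.append(f"Protein: {protein}")
        return lines


# Hypothetical variant: detected on DNA, protein consequence only predicted.
variant = VariantDescription(dna="c.100A>G", protein="p.Thr34Ala")
for line in variant.report():
    print(line)
```

Because the DNA field is the only mandatory one, the record mirrors the point made in the text: the DNA-level description is primary, and RNA or protein consequences are optional additions.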
In general, current short-read high-throughput sequencing technologies cannot easily detect all different DNA variant types. To detect all variant types, either special analysis pipelines are required or long-read sequencing technologies need to be applied. The basic DNA sequence variant types identified are listed in Table 2.1.
• Substitutions are variants where a single DNA nucleotide is replaced by another single DNA nucleotide. This is by far the most common type of DNA sequence variant, accounting for about 80% of all reported DNA variation.
• Deletions are variants where one or more nucleotides have been removed from the original DNA sequence. This is the next most common variant type. When a deletion spans one or more exons of a gene or more than 1000 nucleotides, it is referred to as a copy number variant (CNV).
• Insertions are the reverse of deletions and occur when one or more nucleotides are added to the original sequence. When the inserted sequence is a tandem copy of the original DNA sequence, it is called a duplication. Both duplications and deletions frequently occur where the DNA contains repeated copies of a small sequence. When a duplication spans one or more exons of a gene or more than 1000 nucleotides, it is referred to as a CNV.
Table 2.1
• Deletion–insertions are a combination of a deletion and an insertion in the same location in the DNA (excluding substitutions). One or more nucleotides are replaced by one or more other nucleotides.
• Inversions are variants where a stretch of DNA turns around (inverts); the inserted sequence is the exact reverse complement of the deleted sequence. Inversions have a minimum length of two nucleotides; one-nucleotide inversions are classified as simple substitutions.
• Structural variation is a term for various large chromosomal changes such as translocations and transpositions. Note that these are usually not picked up by short-read sequencing methods and require additional tests to be detected. If the structural changes are large enough, they can be seen using optical mapping technologies or microscopy