DNA Sequencing

Course unit: Gestion de l’innovation (Innovation Management)
Teacher: Christophe Lécuyer
Anya Rahmoune
L3 Sciences de la Vie-Gestion (third-year bachelor's degree, Life Sciences and Management), 2019-2020

SUMMARY
Introduction
I – History of DNA sequencing
II – Pharmacogenomics
III – Direct-to-consumer genetic testing: case study of 23andMe
IV – Genetic sequencing, ethics and society
Conclusion
Bibliography
Appendices

Introduction

Since its discovery, DNA has fascinated scientists and the general public alike. Indeed, a large part of a person’s information can be found in their DNA: their biological sex, their ancestry, their physical appearance, sometimes even their medical condition. Genetic makeup is part and parcel of who we are. This is why DNA sequencing was crucial to the pursuit of research in the Life Sciences. Some scientists go as far as to say that DNA sequencing technology is an innovation as important as the microscope. This technology was subsequently embraced by many fields of the Life Sciences (medicine, virology, evolutionary biology, forensics, etc.) and even spawned new ones, two of which we will study in this essay: pharmacogenomics and direct-to-consumer genetic testing.

The question is: how did DNA sequencing technology come to be, and how did it become essential and ubiquitous in the Life Sciences industry, adopted as it is by every laboratory and company in this field? To answer it, we will start by retelling the invention of DNA sequencing technology.

I – History of DNA sequencing

DNA sequencing as an innovation was very much incremental. Several milestones had to be reached to get to the technology we have today. Listing each one would take too long and is not the point of this essay, so we will only name the most important ones.
An important figure in DNA sequencing is Frederick Sanger, a British biochemist who received two Nobel Prizes: the first for “his work on the structure of proteins, especially that of insulin”; the second, which he shared with Walter Gilbert, for “their contributions concerning the determination of base sequences in nucleic acids”, that is to say, DNA sequencing. Sanger did not arrive at DNA sequencing methods immediately: he first had to sequence a protein, insulin, then RNA, and finally DNA.

Sequencing of Insulin

Frederick Sanger obtained his Ph.D. in 1943 in Cambridge, working on the metabolism of the amino acid lysine in the animal body. The same year, he got his first job, a position in the Biochemistry department of Cambridge offered by A.C. Chibnall. Chibnall was interested in insulin: with colleagues, he had determined the amino acid composition of its bovine form. The structure of insulin, that is to say, the exact order of these amino acids, was however still a mystery to biologists. Chibnall set Sanger the task of investigating the end groups of the protein, which would later lead to research on its structure.

Sanger tried several techniques and molecules before he managed, thanks to his connections, to get his hands on dinitrofluorobenzene (DNFB), synthesised at the time for the war effort. It worked, and he published his first paper. It was not the discovery of the entire sequence yet, but it was a start. Sanger later found that insulin was made of two chains, which he called chain A and chain B. With Hans Tuppy and later Ted Thompson, Austrian and Australian biochemists respectively, he gradually pieced together the complete structure of insulin from overlapping sequences. The only thing missing was the position of the disulfide bonds [1] that joined the two chains. Using the new technique of paper electrophoresis [2], Sanger located the disulfide bridges, thus finally completing the structure of insulin.
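The idea of piecing a complete sequence together from overlapping fragments, as described above, can be illustrated with a minimal sketch. It is not Sanger's actual procedure, which relied on laboratory chemistry and manual reasoning. The function names `overlap` and `assemble` and the toy fragments are hypothetical, and the greedy strategy is only one simple way to merge fragments.

```python
def overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(fragments: list[str]) -> str:
    """Greedily merge the pair of fragments with the largest overlap until one remains."""
    pieces = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(pieces) > 1:
        best = (0, 0, 1)  # (overlap length, index of left piece, index of right piece)
        for i, left in enumerate(pieces):
            for j, right in enumerate(pieces):
                if i != j:
                    size = overlap(left, right)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        merged = pieces[i] + pieces[j][size:]  # keep the shared part only once
        pieces = [p for k, p in enumerate(pieces) if k not in (i, j)] + [merged]
    return pieces[0]


# Toy fragments (arbitrary letters, not real insulin data), given out of order.
print(assemble(["GHIJK", "ABCDE", "DEFGH"]))
```

Each letter stands for one residue. Because neighbouring fragments share their ends, the order of the whole chain can be recovered even though no single fragment covers it.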
Sanger therefore proved that proteins are well-ordered molecules and that, by analogy, genes must be, too. For this work, Sanger received his first Nobel Prize in Chemistry in 1958.

Sequencing of RNA

At the time, nucleic acids, that is to say, DNA and RNA, were still a mystery to scientists. Sanger only became interested in RNA because DNA remained too difficult to sequence. We now know that DNA and RNA are made of units called nucleotides, of which there are five types: A (adenine), C (cytosine), G (guanine), T (thymine) and U (uracil). We know that DNA is exclusively composed of A, C, G and T, and that mRNA [3] can be a perfect mirror of DNA, the only difference being that every T is replaced by a U in the sequence. Today, we also know the exact order of these nucleotides for every gene of the human genome.

[1] Disulfide bonds: a disulfide refers to a functional group with the structure R−S−S−R′. The linkage is also called an SS-bond or sometimes a disulfide bridge. Source: Wikipedia.
[2] Paper electrophoresis: technique used to separate small charged molecules such as amino acids and small proteins.
[3] mRNA: single-stranded RNA molecule that corresponds to the genetic sequence of a gene and is read by the ribosome in the process of producing a protein. Source: Wikipedia.

At the time, however, DNA molecules were too big and complicated for scientists to sequence, and no technique was known to identify their structure. RNA, on the other hand, was more manageable. Even though the simplest RNA viruses contained a few thousand nucleotides, still too many to sequence, tRNAs [4], a different and newly discovered type of RNA, contained fewer than a hundred. The appeal was obvious. Sanger’s goal was not only to work out sequencing methods for RNA and be the first to sequence one, but also to reveal the genetic code. Indeed, since mRNA reflects the DNA structure, sequencing an mRNA corresponding to a protein whose sequence is known would reveal the genetic code.
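The relationship between a DNA sequence and its mRNA described above (same letters, with every T replaced by a U) can be sketched in a few lines. This is a minimal illustration, assuming the input is the coding strand of the gene. The function name `coding_strand_to_mrna` and the example sequence are hypothetical.

```python
DNA_ALPHABET = set("ACGT")


def coding_strand_to_mrna(dna: str) -> str:
    """Return the mRNA that mirrors a DNA coding strand: every T becomes a U."""
    dna = dna.upper()
    unknown = set(dna) - DNA_ALPHABET
    if unknown:
        # DNA is composed exclusively of A, C, G and T.
        raise ValueError(f"not a DNA sequence, unexpected letters: {sorted(unknown)}")
    return dna.replace("T", "U")


# Arbitrary example sequence, chosen only for illustration.
print(coding_strand_to_mrna("ATGGCCATTGTA"))  # AUGGCCAUUGUA
```

The sketch deliberately ignores everything else that happens between gene and mature mRNA in a cell. It only shows the letter-for-letter correspondence the text relies on.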
However, Robert Holley from Cornell University beat him to it: in 1965, he was the first to publish a sequence analysis, that of the 77 ribonucleotides of alanine tRNA from a species of yeast. By 1967, Sanger's group had determined the nucleotide sequence of a small RNA of 120 nucleotides from the bacterium E. coli.

Sequencing of DNA

DNA sequencing was Sanger’s ultimate goal, the crowning achievement of his career. The problem, at the time, was the unavailability of small DNA molecules or of ways to break long DNA into smaller, more manageable pieces. Sanger chose to work on bacteriophage ΦX174 DNA because of its convenience: the phage DNA was single-stranded and his colleague John Sedat knew how to grow it. He wanted to use DNA polymerase [5] to develop his DNA sequencing method, but DNA polymerase requires a primer [6], and such a primer would have to be synthesized. Thanks to Hans Kössel, a German molecular biologist who worked alongside Khorana, an Indian-American biochemist who pioneered DNA synthesis methods at MIT, Sanger had access to octanucleotide primers. So he used an octanucleotide primer and E. coli DNA polymerase.

In the early 1970s, Sanger hired a new technician, Alan Coulson, then 20 years old. Because of their shared reserved nature, a sharp contrast to the outgoing personality of other members of Sanger’s team, Coulson and Sanger got on marvellously and worked well together. Sanger and Coulson went on to invent what they called “the plus and minus method” of DNA sequencing. By 1977, they had published the whole 5400-base sequence of bacteriophage ΦX174. They also discovered that the number of proteins produced by ΦX174 was higher than its DNA seemed able to encode, which meant that the one gene-one enzyme hypothesis of Beadle and Tatum did not hold here: gene sequences overlapped and simply started at different points in the DNA sequence. This method outperformed other techniques, but it was still too laborious.
Inspired by Arthur Kornberg’s work on dideoxythymidine triphosphate (ddTTP) as a substrate for DNA polymerase, Sanger developed another method, the dideoxy chain-termination method, known today as the Sanger method. In parallel, Maxam and Gilbert were working on another method, perfected around the same time Sanger invented his. However, the Sanger method used fewer toxic chemicals and lower amounts of radioactivity, and was easier and more reliable, so it became the favoured method from the 1980s to the 2000s [7]. In 1980, Sanger received his second Nobel Prize in Chemistry, which he shared with Gilbert, for “their contributions concerning the determination of base sequences in nucleic acids”.

Sanger’s method is only the first generation of DNA sequencing methods. After it came the second and third generations, quicker, more precise and more reliable, each needing less genetic material to work with.

[4] tRNA: necessary component of translation from mRNA to proteins.
[5] DNA polymerase: enzymatic complex intervening in DNA replication during cell multiplication.
[6] Primer: short single-stranded nucleic acid utilized by all living organisms in the initiation of DNA synthesis. Source: Wikipedia.
[7] Source: DNA sequencing, Wikipedia.
[8] Shendure, Jay & Balasubramanian, Shankar & Church, George & Gilbert, Walter & Rogers, Jane & Schloss, Jeffery & Waterston, Robert. (2017). DNA sequencing at 40: Past, present and future. Nature. 550. 10.1038/nature24286.

II – Pharmacogenomics

Two definitions of pharmacogenomics are relevant here. The scientific paper authored by Theodora Katsila and George P.
Patrinos defines it as follows: “Pharmacogenomics aims to develop strategies for individualizing therapy to optimize drug efficacy and minimize toxicity on the basis of our improved understanding of how genomic variants influence drug response.” [9] According to the European Ubiquitous Pharmacogenomics program, “Pharmacogenomics is the study of genetic variability affecting an individual’s response to a drug. Clinical application of pharmacogenomics knowledge will result in less ‘trial and error’ prescribing and more efficacious, safer and cost-effective drug therapy.” [10]

Both are good definitions. Nonetheless, to be more accurate, we need to explain yet another word: pharmacogenetics. The two words, pharmacogenomics and pharmacogenetics, co-exist and, while slightly different, are more or less interchangeable when it comes to designating what is essentially the same field. Pharmacogenetics is the “study of variability in drug responses due to heredity” [11], that is to say, how one’s genes affect one’s response to a drug. Pharmacogenomics, on the other hand, is a broader term: the whole genome is studied in relation to drug response. Indeed, genes make up barely 2% of the genome; the rest is non-coding DNA, mitochondrial or chloroplast DNA.

An estimated 70% of the variability in drug response has a genetic origin, related to absorption, distribution, metabolism, mechanism of action, biological effects, underlying pathological state, etc. Such a high figure can only encourage scientists to look into it and into the field of pharmacogenomics. To understand precisely what role DNA sequencing technology plays in individualized therapy, we can use the examples of the FSH receptor, asthma and cancer therapy.

FSH, the Follicle Stimulating Hormone, is a multifunctional hormone targeting the gonads, the reproductive glands (ovaries and testicles). In the female body, the FSH receptor is located at the surface of the ovaries. Several variants of the FSH receptor have been identified.
The genotype of the receptor determines the ovarian response to FSH. The allelic combination serine/serine is associated with higher levels of FSH, which can be explained by a lower sensitivity to FSH: the body compensates for a low sensitivity to the hormone with higher levels of it. Conversely, the mutated asparagine/asparagine receptor is related to a higher sensitivity and is therefore associated with ovarian hyperstimulation. Thus, depending on her genome, a patient can be exposed either to an inadequate response (abnormally high levels of FSH) or to hyperstimulation. Consequently, if the patient ever needs medication for hormone-related issues, knowing her genome helps determine the most suitable course of treatment, that is, the best medication at the best dosage, increasing the chances of success and reducing drug-related risks.

For asthma, we know that patients respond variably to the three categories of treatment: beta2-adrenergic receptor agonists, antileukotrienes and inhaled corticoids. For every therapeutic class, research shows that polymorphisms in patients’ genes, such as the β2-receptor gene, the 5-lipoxygenase gene promoter or the corticotropin-releasing factor receptor gene, are associated with different responses to the different treatments.

In the case of cancer, researchers are looking for genetic markers of the different types of cancer, but also for a way to diminish the toxicity of chemotherapy, which is, by nature, not an individualised treatment and thus has very harmful side effects.

[9] Source: Whole genome sequencing in pharmacogenomics.
[10] Source: The Ubiquitous Pharmacogenomics program website.
[11] Source: Pharmacogenetics and pharmacogenomics.

III – Direct-to-consumer genetic testing: case study of 23andMe

23andMe is an American biotechnology company created in April 2006.
It offers direct-to-consumer (DTC) genetic testing through two products: “Ancestry Personal Genetic Testing” and “Health + Ancestry Service”. The former determines the client’s genetic ancestry and the latter uses known genetic markers to uncover which diseases the client risks developing in the future. As of now, the company mainly operates in the United States of America, in part because the Health Service, due to applicable regulations, is allowed only in the USA, Canada and the United Kingdom, and not everywhere internationally.

The price of DTC genetic testing started at $999 in 2007 and fell to $399 in 2008 before finally reaching its current, more affordable level of $99. To achieve such affordability, the product was sold at a loss in order to build a valuable consumer database for more accurate results in the future. Today, in the US, for $99 and a shipping fee, anyone can obtain a genetic analysis of their ancestry. After ordering, a kit with the necessary tools arrives at home; one only has to provide a sample of saliva and mail it back to the lab. Results are accessible online after 3 to 5 weeks. The Ancestry test also allows clients to find their biological siblings or, if they were conceived this way, their egg or sperm donor.

[Figure: 23andMe’s genetic testing kit]

In 2013, the Personal Genome Service Genetic Health Risk service had to be withdrawn from the market due to lack of approval, before it was finally authorized by the Food and Drug Administration in 2015 for detecting Bloom syndrome, a “rare autosomal recessive disorder characterized by short stature, predisposition to the development of cancer, and genomic instability.” [12]

[12] Source: Wikipedia.
In 2017, 23andMe was allowed to test for 10 diseases and conditions, described as follows by the FDA [13]:
- Parkinson’s disease, a nervous system disorder impacting movement;
- Late-onset Alzheimer’s disease, a progressive brain disorder that destroys memory and thinking skills;
- Celiac disease, a disorder resulting in the inability to digest gluten;
- Alpha-1 antitrypsin deficiency, a disorder that raises the risk of lung and liver disease;
- Early-onset primary dystonia, a movement disorder involving involuntary muscle contractions and other uncontrolled movements;
- Factor XI deficiency, a blood clotting disorder;
- Gaucher disease type 1, an organ and tissue disorder;
- Glucose-6-Phosphate Dehydrogenase deficiency, also known as G6PD, a red blood cell condition;
- Hereditary hemochromatosis, an iron overload disorder; and
- Hereditary thrombophilia, a blood clot disorder.

The Health service is supposed to help clients make informed decisions about their lifestyle or start a discussion with a health professional; however, the FDA warns that the presence or absence of a genetic marker for these diseases does not mean the client will or will not ultimately develop the condition. Other factors, environmental and lifestyle-related, also play a determining role and are not to be neglected.

A major shift of the access point for genetic testing from physicians to the internet might occur in the coming decades, which would not only facilitate pharmacogenomics, and thus improve the targeting of drugs, but also have economic effects on public health care. Overall, DTC genetic testing might lower health care costs by encouraging healthier behaviours and prevention strategies, with an increase in early detection and intervention.
However, DTC genetic testing could also raise costs if “it results in more tests, follow-on screenings and interventions, or if negative test results give a false sense of security and either leads to under-use of regular check-ups and other preventive measures, or is viewed as permission to resume, continue or commence smoking, poor diet, or a sedentary lifestyle – all behaviours that contribute to a multitude of chronic, multibillion-dollar-a-year diseases. In light of the potential for overuse of DTC genetic testing, it will be important to determine under what circumstances genetic testing is appropriate and beneficial.” [14]

Another societal aspect to consider is that physicians are, for the most part, not prepared or equipped to interpret the results of whole genome analysis. This lack of knowledge and expertise may lead patients to mistrust their physician, seeing him or her as outdated and behind the times. It stems from medical schools not including advanced genetics and genomics in their curriculum. A survey [15] of Canadian and American medical schools found that fewer than half incorporated genetics classes in the 3rd and 4th years, when clinical practice starts, and that most genetics classes focused on the theoretical aspects rather than the practical applications. It will take decades to train enough physicians who are as comfortable reading genetic results as they are with other aspects of their work.

[13] Source: FDA allows marketing of first direct-to-consumer tests that provide genetic risk information for certain conditions - Food and Drug Administration, 6 April 2017.
[14] Source: Committee on Science, Technology, and Law, Forum on Drug Discovery, Roundtable on Translating Genomic-Based Research for Health, National Research Council, Institute of Medicine - Direct-to-Consumer Genetic Testing: Summary of a Workshop.
[15] Thurston V, Wales P, Bell M, Torbeck L, Brokaw J., The current status of medical genetics instruction in U.S.
and Canadian medical schools, Academic Medicine, 2007; 82 (5):441-445

IV – Genetic sequencing, ethics and society

Genetic sequencing has been surrounded by controversy ever since the general public became aware of its existence. As with every new technology, fear of its misuse arose. One does not have to look further than popular culture: the well-known film Gattaca, directed by Andrew Niccol and released in 1997, describes a world in which genetically inferior individuals are discriminated against, while those with perfect genetics get to live a life of privilege. This fear, which has for the most part dissipated in recent decades, is born of the misconception that genetics determines a person’s whole being, when it should be understood as the complex machinery it is, influenced by external factors such as environmental or biological ones. We hear every day of the discovery of new genes supposedly responsible for conditions as varied as obesity, alcoholism or even an inclination to serial killing. Nonetheless, one cannot simply measure intelligence, athletic ability or creativity with DNA analysis. For an employer, for example, it will never be as easy as running a genetic test to determine a candidate’s intelligence; a relevant aptitude test is better suited to that purpose.

Another facet of the debate is prenatal genetic sequencing. Some worry that parents will resort to abortion or genetic manipulation if the baby does not match their ideals. This concern, which has been around for decades, holds that such a future is imminent. As of right now, except for medical reasons, this has not happened, but it might be because we still do not possess affordable technology for such advanced analysis of foetuses.

In the case of genetic testing done by private companies like 23andMe, suspicion that personal data might be sold to third parties quickly arose.
This matter is further complicated by the fact that not everyone is able to consent (for example minors, or relatives of the test subject who are unaware of the test), even though their lives could be affected by the results. “In recent years, in China, for approximately $900, parents in Chongqing can send their children – ages three to 12 years old – to a five-day camp for DNA testing to identify their gifts and talents, so they can focus on those strengths from an early age. According to the director of the Chongqing Children’s Palace, ‘Nowadays, competition in the world is about who has the most talent. We can give Chinese children an effective, scientific plan at an early age.’” [16] By contrast, in the United Kingdom, genetic testing of minors is only recommended for medical reasons and if delaying the test until they come of age might have negative effects on their health.

Privacy concerns have yet to be addressed by the companies offering DTC testing or by the competent scientific and political bodies. Ultimately, the main concern remains the misreading of results: without professional assistance, clients can delude themselves into thinking they are perfectly healthy when they are not, or vice versa. These are all questions of an ethical, scientific and medical nature that our society will have to learn to answer in the coming decades, as DTC genetic testing becomes more affordable and accessible.

[16] Source: Committee on Science, Technology, and Law, Forum on Drug Discovery, Roundtable on Translating Genomic-Based Research for Health, National Research Council, Institute of Medicine - Direct-to-Consumer Genetic Testing: Summary of a Workshop.

Conclusion

In conclusion, DNA sequencing is omnipresent: it is in every Life Sciences field, in every laboratory, in every company. It is the result of decades of research and the combined efforts of hundreds of scientists.
Frederick Sanger played an important role as the biochemist responsible for the sequencing of insulin, then RNA and finally DNA. His method, though improved upon, has been used continuously for decades and still is, for it is reliable and cost-effective. Among others, two new fields emerged from the birth of this technology: pharmacogenomics and DTC genetic testing, each with its advantages and challenges.

The fear surrounding DNA sequencing technology, justified or not, is well and truly present. Bioethicists warn against the dangers and perversions of this technology while, at the other extreme, others commend its usefulness and its potential for humanity’s betterment. As with most things, the truth is probably somewhere in the middle. Nonetheless, if humanity wants to progress with this technology, it will have to weigh many factors carefully and remain vigilant.

Bibliography

Hugues, J. N. “Apport de La Pharmacogénomique Dans l’individualisation Des Réponses Thérapeutiques.” Gynecologie Obstetrique & Fertilite, vol. 36, Dec. 2008, pp. 6–7. EBSCOhost, doi:10.1016/S1297-9589(08)75149-X.
J.M. Drazen, E.K. Silverman, T.H. Lee. Heterogeneity of therapeutic responses in asthma. Br. Med. Bull., 56 (4) (2000), pp. 1054-1070.
Wikipedia contributors. "DNA sequencing." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 18 Apr. 2020. Web.
Katsila, Theodora, and George P Patrinos. “Whole genome sequencing in pharmacogenomics.” Frontiers in pharmacology vol. 6 61. 26 Mar. 2015, doi:10.3389/fphar.2015.00061
The Ubiquitous Pharmacogenomics program website. Link: http://upgx.eu/
Pirmohamed, M. “Pharmacogenetics and pharmacogenomics.” British journal of clinical pharmacology vol. 52,4 (2001): 345-7. doi:10.1046/j.0306-5251.2001.01498.x
Wikipedia contributors. "Bloom syndrome." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 10 Apr. 2020. Web.
Wikipedia contributors. "Disulfide." Wikipedia, The Free Encyclopedia.
Wikipedia, The Free Encyclopedia, 12 Apr. 2020. Web.
Joe S. Jeffers, Frederick Sanger, Two-Time Nobel Laureate in Chemistry, 2017, Springer.
Wikipedia contributors. "Messenger RNA." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 18 Apr. 2020. Web.
Committee on Science, Technology, and Law, Forum on Drug Discovery, Roundtable on Translating Genomic-Based Research for Health, National Research Council, Institute of Medicine, Direct-to-Consumer Genetic Testing: Summary of a Workshop. 2010. National Academies Press.
Mark Henderson, Human genome sequencing: the real ethical dilemmas, 9 Sep 2013, The Guardian.
Thurston V, Wales P, Bell M, Torbeck L, Brokaw J., The current status of medical genetics instruction in U.S. and Canadian medical schools, Academic Medicine, 2007; 82 (5):441-445
Food and Drug Administration website, FDA allows marketing of first direct-to-consumer tests that provide genetic risk information for certain conditions, 6 April 2017
Shendure, Jay & Balasubramanian, Shankar & Church, George & Gilbert, Walter & Rogers, Jane & Schloss, Jeffery & Waterston, Robert. (2017). DNA sequencing at 40: Past, present and future. Nature. 550. 10.1038/nature24286.

Appendices

[Figure: insulin structure, with its disulfide bonds]