It’s in the Genes!
- A person’s genome contains information that can indicate the likely progression of a disease, or the individual's propensity to develop a particular disorder. To take full advantage of this information, however, we need a fast, reliable, and economical way of sequencing each patient's genes. Nor is this a one-time exercise: an individual's DNA may need to be sequenced repeatedly over his or her lifetime, because the genetic code can be modified by many factors.
- Deoxyribonucleic acid (DNA) is a nucleic acid with a ladder-like, double-helix structure (discovered in 1953 by James Watson and Francis Crick). It contains the genetic instructions used in the development and functioning of all known living organisms (with the exception of RNA viruses). The rungs of the ladder consist of consistently paired bases: A always pairs with T, and G always pairs with C. This complementary relationship is critical to gene function. Within cells, DNA is organized into long structures called chromosomes, and the DNA segments that carry genetic information are called genes. During cell division these chromosomes are duplicated in the process of DNA replication, providing each cell with its own complete set of chromosomes. Along with RNA and proteins, DNA is one of the three major macromolecules that are essential for all known forms of life.
- DNA consists of two long polymers of simple units called nucleotides, with backbones made of sugars and phosphate groups joined by ester bonds. Genomic DNA is packed tightly and in an orderly way, in a process called DNA condensation, to fit the small volume available in the cell. Eukaryotic organisms (animals, plants, fungi, and protists) store most of their DNA inside the cell nucleus and small amounts in organelles such as mitochondria or chloroplasts. Eukaryotic cells are, in general, bigger and more elaborate than bacteria and archaea. In many species, only a small fraction of the total genome sequence encodes protein. Only about 1.5% of the human genome consists of protein-coding exons, while over 50% of human DNA consists of non-coding repetitive sequences.
- Bioinformatics (one of the thrusts of this website) involves the manipulation, searching, and data mining of biological data, including DNA sequence data. Techniques from computer science, such as string-searching algorithms, machine learning, and database theory, have led to big advances in genomics. Many of these techniques deal with pattern recognition. The technology involved in DNA sequencing is quite expensive, with costs running into the thousands. There is therefore a demand for a more cost-effective method.
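The base-pairing rule and the string searching mentioned above can be illustrated with a few lines of Python. This is only a minimal sketch: the function names and the example strand are made up for illustration, and real bioinformatics libraries offer far more capable tools.

```python
from __future__ import annotations

# Watson-Crick pairing: A pairs with T, and G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return the 0-based start of every match, including overlapping ones."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    strand = "ATGCGATATCGC"  # made-up example sequence
    print(complement(strand))          # TACGCTATAGCG
    print(reverse_complement(strand))  # GCGATATCGCAT
    print(find_motif(strand, "GC"))    # [2, 10]
```

The search here relies on Python's built-in `str.find`, which is adequate for short sequences; genome-scale searches generally use indexed or heuristic methods instead.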
New Approach to Old Data…
- Researchers from the National Institute of Standards and Technology (NIST) and Columbia University's School of Engineering and Applied Science are collaborating to devise a new, rapid, and less expensive procedure that can be commercialized effectively. If the effort succeeds, doctors and other health personnel will be able to order these tests from their clinics or offices with greater ease.
- The new method, which works like a kind of molecular ticker tape, sequences DNA by attaching distinct molecular "tags" to each of the four chemical building blocks, or "bases," that make up the genetic information in a strand of DNA. The bases are abbreviated as G (guanine), A (adenine), C (cytosine), and T (thymine). Each of these polymer tags can then be cut from the strand and passed, one by one, through a nanometer-size hole in a membrane. A steady stream of fluid and ions flows through this "nanopore," which is large enough to contain only one tag at a time. Because the polymer tags are of different sizes, the change in electrical current caused by the altered flow shows which of the four bases sits at each point on the DNA strand. Columbia University has applied for patents for the commercialization of the technology.
- Transcription is the process whereby the DNA sequence in a gene is copied into mRNA; transcription factors are the DNA-binding proteins that help control it. Gene expression, then, is the process by which a gene’s coded information is converted into the structures present and operating in the cell. Expressed genes include those that are transcribed into mRNA and then translated into protein, and those that are transcribed into RNA but not translated into protein (e.g., transfer and ribosomal RNAs).
- DNA can be damaged. Damage is an alteration that gives the DNA an abnormal structure; it cannot itself be replicated when the DNA is replicated, but it may be repaired. A mutation, by contrast, is a change in the sequence of DNA base pairs; it can be replicated and thus inherited. DNA is subject to the wide variety of chemical reactions that might be expected of any such molecule in a warm aqueous medium. In any cell, some DNA damage may remain unrepaired despite the cell's repair processes.
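The nanopore read-out described in this section, where each tag produces its own characteristic change in current, can be pictured with a toy decoder. The sketch below is purely illustrative: the current levels are invented numbers in arbitrary units, not measurements from the NIST/Columbia work, and the nearest-level rule is an assumed stand-in for whatever signal processing the researchers actually use.

```python
# HYPOTHETICAL current levels (arbitrary units), one per tagged base.
# These values are invented for illustration and are not experimental data.
TAG_LEVELS = {"A": 0.20, "C": 0.40, "G": 0.60, "T": 0.80}


def call_base(current: float) -> str:
    """Return the base whose assumed tag level is closest to the reading."""
    return min(TAG_LEVELS, key=lambda base: abs(TAG_LEVELS[base] - current))


def decode(readings) -> str:
    """Turn a series of current readings (one per tag) into a base sequence."""
    return "".join(call_base(reading) for reading in readings)


if __name__ == "__main__":
    # One noisy, made-up reading for each tag that passes through the pore.
    readings = [0.22, 0.79, 0.58, 0.41, 0.19]
    print(decode(readings))  # ATGCA
```

The point of the sketch is only that four well-separated signal levels are enough to tell the four bases apart, one tag at a time.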
Animation of a rotating DNA structure, created in the free program RasMol from the bdna.pdb dataset. The 120 rendered frames were combined in Adobe ImageReady and saved as an animated GIF.
Father of Genetics
- In 1866, Gregor Mendel, an Augustinian monk, put forth the postulates of inheritance based on a decade of work on pea plants. Today, we can have a DNA test in an hour or so. Mendel did his work before the structure and role of chromosomes were known. About 20 years after his work, advances in microscopy allowed researchers to identify chromosomes and to establish that in eukaryotic organisms (those whose cells have a membrane-bound nucleus), each species has an identifiable, characteristic number of chromosomes called the diploid number. You and I have a diploid number of 46. Chromosomes in diploid cells exist in pairs called homologous chromosomes, and the members of a pair are identical in size and in the location of the centromere, where the spindle fibers attach during cell division.
From Genotype to Phenotype
- In living organisms, DNA does not usually exist as a single molecule, but instead as a pair of molecules that are held tightly together. These two long strands entwine like vines, in the shape of the now well-known double helix. DNA is a long polymer made from repeating units called nucleotides. A polymer of multiple linked nucleotides is called a polynucleotide. The backbone of the DNA strand is made from alternating phosphate and sugar residues. The D in DNA refers to 2-deoxyribose, which is a pentose (five-carbon) sugar. The sugars are joined together by phosphate groups that form phosphodiester bonds between the third and fifth carbon atoms of adjacent sugar rings.
- Proteins are the end products of gene expression. Once a protein is made, its action or location plays a role in producing a phenotype. When a mutation alters a gene, it may abolish or alter that protein’s function and cause an altered phenotype. Genomics grew out of recombinant DNA technology, which began in the early 1970s, when researchers discovered that bacteria could protect themselves by making enzymes that restrict or block infection by cutting viral DNA at specific sites. Once cut, the viral DNA can no longer direct the synthesis of more phage particles, which would otherwise kill the infected bacterial cell when released.
- A cell’s genome, that is, the entire library of genetic information in its DNA, provides a genetic program that instructs the cell how to function and, for plant and animal cells, how to grow into an organism with hundreds of different cell types. The DNA contains the information that determines not only the form but also the function of the cell and organism. Once genomic “libraries” became available, researchers started working on ways to sequence all the clones in a genomic library in an organized way, so as to obtain the nucleotide sequence of an organism’s genome.
- In 2001, the Human Genome Project reported the first draft of the human genome. Then in 2003, the remaining portion of the genome sequence was published. Work is now focused on the non-coding portions of the genome. Genomics emerged as a new field of medicine as more complete genomes were sequenced and published. We now have gene sequences for a variety of organisms, ranging from bacteria and fungi to insects and large mammals. Haemophilus influenzae, which causes respiratory illness and meningitis, was the first free-living organism to have its genome sequenced. The third generation of model organisms includes the roundworm (C. elegans), the plant Arabidopsis, and the zebrafish. We also have gene sequences of plants, which is a boon to the agricultural industry. It is estimated that more than 60% of the processed food in the United States contains ingredients made from genetically modified crop plants. These modifications give crops various enhancements, ranging from herbicide resistance and drought or flood resistance to delayed ripening.
- Screening an individual’s genome to determine that person’s risk of developing a disease uses DNA microarrays, or DNA chips. Each chip contains thousands of fields, each carrying a different gene. Chips containing the human genome are widely available, making it feasible to scan one’s entire genome for a propensity for a disease. Pharmacogenomics investigates which chemical compound best treats a certain illness in a patient with a given genetic makeup. In addition, we have gene therapy, which involves transferring normal genes into an abnormal cell or tissue. The molecular basis for hundreds of genetic disorders is now known. We must remember that genetic studies rely on known model organisms. Sequence analysis involves comparing the gene sequences of an organism to a reference sequence.
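The idea of comparing a sequence against a reference, as described above, can be sketched in a few lines of Python. This is a deliberately simplified illustration: the sequences are made up, the function name is hypothetical, and the sketch handles only single-base substitutions in sequences of equal length. Real pipelines first align the sequences so that insertions and deletions can be handled as well.

```python
def find_substitutions(reference: str, sample: str):
    """List (position, reference_base, sample_base) for every mismatch.

    Positions are 1-based, as is conventional when describing variants.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


if __name__ == "__main__":
    reference = "ATGGCCATTGTA"  # made-up reference sequence
    sample = "ATGGCCGTTGTA"     # made-up sample with one changed base
    for position, ref_base, sample_base in find_substitutions(reference, sample):
        print(f"position {position}: {ref_base} -> {sample_base}")
    # prints: position 7: A -> G
```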
We are all connected
- One can now visit the BLAST portal online to search gene sequences from the many organisms whose genomes have been sequenced. Model organisms used to study human disease include, for example, yeast, for the cell cycle, cancer, and Werner syndrome; Drosophila (the fruit fly), for cell signaling, cancer, and human neurodegenerative disease; and the zebrafish, for developmental pathways and cardiovascular disease. These model organisms have a rich history in genetic studies. One should remember that many of the genes controlling a basic process in yeast have counterparts that play the same role in humans. The development and use of model organisms is only one of the ways genetics and biotechnology are rapidly changing every aspect of our lives. I hope that the reader will look into the wealth of material on genetics, epigenetics, and the other omics fields (e.g., the study of proteomes and exomes).
Some Tools of the Trade
- An ever-growing number of bioinformatics resources are available to the interested worker. Depending on your area of specialty, you will find work opportunities as well. To get started, read further in the published documents now available on the Web. The BLAST portal mentioned above is a good starting point. Equipment can range from a light microscope all the way to transmission and scanning electron microscopes. On the software side, there are several sequencing programs available. Some are free and open source, while others come at a high price. A brief listing of software is available on this site. As new developments are pursued, the armamentarium of the bioinformatics worker will continue to grow. All it takes is a good helping of resourcefulness and you’re on your way to making the next big discovery.
Think Big. Think Forward
- Fernando Yaakov Lalana, M.D.