Pencil Stubs Online


Armchair Genealogy

By Melinda Cohenour

DNA: A Glossary of Terms

One can Google "DNA Glossary" and be provided a plethora of sites offering the definition of terms used when discussing or studying DNA. That, in my estimation, is both a wondrous thing and a stumbling block. Why? Well, the wonder is that so many sites are available and each offers tons of terms. But for those of us who don't seek a degree in DNA Scientology, there's simply too much data.

Perfect example: facts provided in large mass merely comprise DATA. Those data must be organized, massaged, and reissued as INFORMATION in order to provide clarity on any subject.

Therefore, for the purposes of this column your author has attempted to organize the defined terms in a logical fashion. I have not discovered a Glossary of terms for DNA that is not presented in an alphabetized order. Personally, as a newbie DNA-phobe (define that, Internet!), I need to look at structure and use before I even know which of the hundreds of terms I need to look up. (By the way, MY definition of DNA-phobe just means I yearn to comprehend enough to tackle the DNA Matches Ancestry provides me without constantly needing to stop and GOOGLE. Ok, I confess. I have a greater and broader interest in DNA that encompasses a fascination with its many uses, new discoveries and advances, as well as my use of DNA to try to solve the mysteries encountered in building our family tree.)

The intent of this Glossary is to build our understanding from the core out: a basic overview of how we're made, from the tiniest inner particle to how our DNA Matches occur. Bear with me. This is not an easy task, but my intent is to keep it as simple as possible while including essential information.

DNA GLOSSARY - In Logical Order

A genome is all of a living thing's genetic material. It is the entire set of hereditary instructions for building, running, and maintaining an organism, and passing life on to the next generation. The whole shebang.

In most living things, the genome is made of a chemical called DNA. The genome contains genes, which are packaged in chromosomes and affect specific characteristics of the organism.

Imagine these relationships as a set of Chinese boxes nested one inside the other. The largest box represents the genome. Inside it, a smaller box represents the chromosomes. Inside that is a box representing genes, and inside that, finally, is the smallest box, the DNA.

In short, the genome is divided into chromosomes, chromosomes contain genes, and genes are made of DNA.

The word "genome" was coined in 1920, even though scientists didn't know then what the genome was made of. They only knew that the genome was important enough, whatever it was, to have a name.


GENOME: The genome is the entire set of genetic instructions found in a cell. In humans, the genome consists of 23 pairs of chromosomes, found in the nucleus, as well as a small chromosome found in the cells' mitochondria. Each set of 23 chromosomes contains approximately 3.1 billion bases of DNA sequence.


I've chosen to build from the smallest box up while leaving the complexity of DNA and RNA for last.

A gene is a small piece of the genome. It's the genetic equivalent of the atom: As an atom is the fundamental unit of matter, a gene is the fundamental unit of heredity.
Genes are found on chromosomes and are made of DNA. Different genes determine the different characteristics, or traits, of an organism. In the simplest terms (which are actually too simple in many cases), one gene might determine the color of a bird's feathers, while another gene would determine the shape of its beak.
The number of genes in the genome varies from species to species. More complex organisms tend to have more genes. Bacteria have several hundred to several thousand genes. Estimates of the number of human genes, by contrast, once ranged from 25,000 to 30,000; more recent counts put the number of protein-coding genes at roughly 20,000.

SOURCE: ibid.

A cell is the basic building block of living things. All cells can be sorted into one of two groups: eukaryotes and prokaryotes. A eukaryote has a nucleus and membrane-bound organelles, while a prokaryote does not. Plants and animals are made of numerous eukaryotic cells, while many microbes, such as bacteria, consist of single cells. An adult human body is estimated to contain between 10 and 100 trillion cells.


A nucleus is a membrane-bound organelle that contains the cell's chromosomes. Pores in the nuclear membrane allow for the passage of molecules in and out of the nucleus. (ibid.)

Chromosomes are packages of DNA found in the nucleus of cells. Humans have 46 chromosomes.


MEC NOTE: We inherit 23 chromosomes from our mother and 23 from our father, including one sex chromosome from each parent. The mother always passes on an X; the father passes on either an X or a Y, and that determines sex. See below.

Our parents each inherited half their chromosomes from each of their parents. So did their siblings, and our siblings, but (identical twins aside) NONE of us inherit exactly the SAME combination, thus ensuring our unique characteristics.
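
For readers who like to tinker, here is a rough back-of-the-envelope sketch in Python of why siblings differ. It assumes, purely for simplicity, that each parent hands down one whole chromosome from each of the 23 pairs, chosen at random, and it ignores crossing over, which shuffles things even further. The function name is my own invention for illustration.

```python
def chromosome_combinations(pairs: int = 23) -> int:
    """Count the distinct chromosome sets one parent could pass on,
    assuming one whole chromosome is picked from each pair
    (a simplification: real inheritance also involves crossing over)."""
    return 2 ** pairs


per_parent = chromosome_combinations()
per_couple = per_parent ** 2  # one set from each parent

print(f"{per_parent:,}")  # 8,388,608
print(f"{per_couple:,}")  # 70,368,744,177,664
```

Even under this simplified assumption, one couple could produce trillions of different combinations, which is why two siblings landing on exactly the same set is so unlikely.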

CHROMOSOME: (another source)
A single continuous strand of DNA that can be anywhere from about 50 million bases (chromosome 21) to 250 million bases (chromosome 1). Every person has 23 pairs of chromosomes (46 in total), one of each pair from your mom and one from your dad. Twenty-two of the pairs are numbered 1 through 22, and the remaining pair is called the sex chromosomes, because women have two X chromosomes (XX) and men have one X and one Y (XY).


MEC NOTE: Now we take up DNA definitions

DNA (Deoxyribonucleic acid)
DNA stands for deoxyribonucleic acid. It is the genetic information that every parent passes on to their biological children. DNA plays a role in physical features (height and eye color), in disease (multiple sclerosis, cancer, Alzheimer’s Disease), and even in behavioral traits (risk-taking). DNA is written in four letters (A, C, T, and G), which stand for its bases (see Base). You can think of DNA as the instructions we are born with, found in almost every cell in our body, that tell our bodies how to grow and function.

BASE:
The most basic unit of DNA. There are four different bases (Adenine, Cytosine, Thymine, and Guanine), and they make up all DNA. The same four bases are in your DNA, an elephant’s DNA, even the DNA in corn. The bases are what we read when we sequence your DNA, and the order of these letters conveys information that offers us insight into what makes you, you.

BASE PAIR:
Also sometimes called the ‘rungs’ of the DNA ladder. The Adenine base is always across from Thymine (A base pairs with T) and the Cytosine base is always across from Guanine (C base pairs with G). This pairing system is universal; it is never broken. Therefore, if you know the base on one side of DNA, you always know the base on the other side.
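
The pairing rule can be written out as a minimal Python sketch. The names PAIRS and complement are my own, chosen for illustration; the code simply looks up each letter's partner, position by position, to show how one side of the ladder tells you the other.

```python
# Each base and the partner that sits across from it on the DNA ladder.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the bases found opposite the given strand, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())


print(complement("AACGAT"))  # TTGCTA
```

Feeding the result back in returns the original strand, which is just another way of saying the pairing works in both directions.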


MEC NOTE: Following the historic process of capturing the microscopic photographs of DNA, science has steadfastly worked to understand its structure better and to use that knowledge to unlock the mysteries of the building block of life.
In order to utilize DNA, scientists had to learn to extract it. It was then necessary to determine how to prepare it to examine or "read" it. It has taken decades to advance that science from manual interpretation to computerized methods.
Only the most basic definition of terms can be included here, but in-depth exploration of any technique or aspect of use can be gained by visiting the source websites listed herein.
The inclusion of RNA here is to anticipate the query, "How does RNA differ from DNA?" Suffice it to say RNA acts as the translator for DNA, the project manager if you will, ensuring the DNA coding is activated and carried out according to the "blueprint." It differs in structure as well, usually being a single strand rather than the double helix construct of DNA. As RNA does not form the BASIS of our genetic creation, this shall be the sum total reference to it herein.

RNA: (Ribonucleic acid)
This flexible molecule tells the cell's protein-making factories what DNA wants them to do, stores genetic information and may have helped life get its start. More than just DNA's lesser-known cousin, RNA plays a central role in turning genetic information into your body's proteins.


RNA, in one form or another, touches nearly everything in a cell. RNA carries out a broad range of functions, from translating genetic information into the molecular machines and structures of the cell to regulating the activity of genes during development, cellular differentiation, and changing environments.


DNA Structure - the Double Helix
Double helix is the description of the structure of a DNA molecule. A DNA molecule consists of two strands that wind around each other like a twisted ladder. Each strand has a backbone made of alternating groups of sugar (deoxyribose) and phosphate groups. Attached to each sugar is one of four bases: adenine (A), cytosine (C), guanine (G), or thymine (T). The two strands are held together by bonds between the bases, adenine forming a base pair with thymine, and cytosine forming a base pair with guanine.

"A double helix has become the icon for many, many kinds of discussions about where science has been and where it's going. This really is an amazing structure. You can't stare at the double helix for very long without having a sense of awe about the elegance of this information molecule DNA, with its double helical form basically being the way in which all living forms are connected to each other, because they all use this same structure for conveying that information. Of course, this is Watson and Crick's incredible realization back in 1953, but it will stand in history as probably one of the most significant scientific moments of all time."
Francis S. Collins, M.D., Ph.D.



DNA extraction is a routine procedure used to isolate DNA from the nucleus of cells. When an ice-cold alcohol is added to a solution of DNA, the DNA precipitates out of solution. If there is enough DNA in the solution, you will see a stringy white mass.


MEC NOTE: This site offers step-by-step instructions. More information than is needed here.

Sequencing DNA means determining the order of the four chemical building blocks - called "bases" - that make up the DNA molecule. The sequence tells scientists the kind of genetic information that is carried in a particular DNA segment. For example, scientists can use sequence information to determine which stretches of DNA contain genes and which stretches carry regulatory instructions, turning genes on or off. In addition, and importantly, sequence data can highlight changes in a gene that may cause disease.
In the DNA double helix, the four chemical bases always bond with the same partner to form "base pairs." Adenine (A) always pairs with thymine (T); cytosine (C) always pairs with guanine (G). This pairing is the basis for the mechanism by which DNA molecules are copied when cells divide, and the pairing also underlies the methods by which most DNA sequencing experiments are done. The human genome contains about 3 billion base pairs that spell out the instructions for making and maintaining a human being.


Autosomal, Mitochondrial, and Y-DNA: The Three DNA Tests Used by Genealogists
There are three sources of information in a DNA sample. Y-chromosomal DNA (Y-DNA) is present only in samples from males and gives information on patrilineal descent. Mitochondrial DNA (mtDNA), present in both males and females, gives information on matrilineal descent. Finally, autosomal DNA (atDNA) gives information on ancestors on both the mother's and the father's side.
The signal of shared ancestry seen in autosomal DNA is highest in close relatives, but dilutes quickly so that by 5-7 generations of separation, it is difficult to distinguish exact relationships other than shared ethnic affinities. Thus, autosomal DNA (atDNA) is best to help identify ancestors within the most recent 5–7 generations of a family tree.
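
To see why the autosomal signal dilutes so quickly, here is a small Python sketch. It assumes the simple rule of thumb that, on average, the share of autosomal DNA inherited from any one ancestor halves with each generation. Actual amounts vary from person to person, and the function name is hypothetical.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA inherited from one ancestor
    the given number of generations back (rule of thumb: halves each time)."""
    return 0.5 ** generations_back


for generations in range(1, 8):
    share = expected_share(generations)
    print(f"{generations} generation(s) back: about {share:.2%}")
```

By this rough reckoning, a single ancestor five generations back contributes only about 3 percent of your autosomal DNA, and one seven generations back less than 1 percent.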

mtDNA and Y-DNA tests are limited to relationships along a strict female line and a strict male line, respectively. mtDNA evolves rapidly whereas Y-DNA (and atDNA) changes much more slowly. mtDNA and Y-DNA tests are utilized to identify archeological cultures and migration paths of a person's ancestors along a strict mother's line or a strict father's line. Based on mtDNA and Y-DNA, a person's haplogroup(s) can be identified. (A haplogroup is a group of people who share a common ancestor on the direct maternal or paternal line, identified by DNA markers they have in common.) The mtDNA test can be taken by both males and females, because everyone inherits their mtDNA from their mother, as the mitochondrial DNA is located in the egg cell. However, a Y-DNA test can only be taken by a male, as only males have a Y-chromosome.


A genetic marker is a gene or DNA sequence with a known location on a chromosome that can be used to identify individuals or species. It can be described as a variation that can be observed. (Source: Wikipedia)

A centimorgan is a unit used to measure genetic linkage. One centimorgan equals a one percent chance that a marker on a chromosome will become separated from a second marker on the same chromosome due to crossing over in a single generation. It translates to approximately one million base pairs of DNA sequence in the human genome. The centimorgan is named after the American geneticist Thomas Hunt Morgan.
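
The rough conversion in that definition can be sketched in Python. The figure of one million base pairs per centimorgan is only the approximate average quoted above, and the actual ratio varies along each chromosome. The constant and function names are my own.

```python
# Approximate average from the definition above; not exact for any given region.
BASE_PAIRS_PER_CM = 1_000_000


def approx_base_pairs(centimorgans: float) -> int:
    """Estimate how many base pairs a segment of the given cM length spans."""
    return round(centimorgans * BASE_PAIRS_PER_CM)


print(approx_base_pairs(7))  # 7000000
```

So a 7 cM match, for example, covers something on the order of seven million base pairs.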


Shared DNA segments, also referred to as 'matching segments', are the sections of DNA that are identical between two individuals. These segments were most likely inherited from a common ancestor.
DNA segments can be found on all of the 22 autosomal chromosomes. The segment length is determined by the centiMorgan distance between the first SNP and the last SNP. The longer the shared segment is, the higher the probability that it was inherited from a common ancestor, which means that the two people are genetically related.
All 22 pairs of chromosomes add up to a total of about 7000 centiMorgans. Half is inherited from your mother, and the other half from your father.
You can use the following range of average values as a reference for the length of shared segments between you and your relatives:

Identical twin: 7000 cM
Parents: 3350 - 3600 cM
Full siblings: 2300 - 2900 cM
Grandparents and aunts/uncles: 1300 - 2200 cM
First cousins: 600 - 1200 cM

Keep in mind that the above values represent average ranges. In some cases, matches can have lower or higher centiMorgan values.
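
The reference ranges above can be turned into a simple lookup, sketched here in Python. The numbers are copied straight from the list; the names RANGES and possible_relationships are hypothetical, and a real match falling outside every range is not necessarily unrelated, for the reason just given.

```python
# (relationship, low cM, high cM), taken from the reference list above.
RANGES = [
    ("Identical twin", 7000, 7000),
    ("Parent", 3350, 3600),
    ("Full sibling", 2300, 2900),
    ("Grandparent or aunt/uncle", 1300, 2200),
    ("First cousin", 600, 1200),
]


def possible_relationships(shared_cm: float) -> list[str]:
    """List every relationship whose average range includes the shared cM total."""
    return [name for name, low, high in RANGES if low <= shared_cm <= high]


print(possible_relationships(850))   # ['First cousin']
print(possible_relationships(3000))  # []
```

An empty result, as with 3000 cM, simply means the total falls in a gap between the listed averages and deserves a closer look.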


SNP: What are single nucleotide polymorphisms (SNPs)?
Single nucleotide polymorphisms, frequently called SNPs (pronounced “snips”), are the most common type of genetic variation among people. Each SNP represents a difference in a single DNA building block, called a nucleotide. For example, a SNP may replace the nucleotide cytosine (C) with the nucleotide thymine (T) in a certain stretch of DNA.
SNPs occur normally throughout a person’s DNA. They occur almost once in every 1,000 nucleotides on average, which means there are roughly 4 to 5 million SNPs in a person's genome. These variations may be unique or occur in many individuals; scientists have found more than 100 million SNPs in populations around the world. Most commonly, these variations are found in the DNA between genes.
They can act as biological markers, helping scientists locate genes that are associated with disease. When SNPs occur within a gene or in a regulatory region near a gene, they may play a more direct role in disease by affecting the gene’s function.
Most SNPs have no effect on health or development. Some of these genetic differences, however, have proven to be very important in the study of human health. Researchers have found SNPs that may help predict an individual’s response to certain drugs, susceptibility to environmental factors such as toxins, and risk of developing particular diseases. SNPs can also be used to track the inheritance of disease genes within families. Future studies will work to identify SNPs associated with complex diseases such as heart disease, diabetes, and cancer.


An example of a SNP is the substitution of a C for a G in the nucleotide sequence AACGAT, thereby producing the sequence AACCAT. The DNA of humans may contain many SNPs; by some estimates, these variations occur as often as once in every 100–300 nucleotides in the human genome.
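
That example can be checked with a few lines of Python. This is only a sketch that compares two equal-length strings letter by letter; the function name is my own, and real SNP calling works against a reference genome with far more care.

```python
def snp_positions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference base, sample base) wherever the two differ.
    Positions are counted from 1, as a person reading the sequence would."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


print(snp_positions("AACGAT", "AACCAT"))  # [(4, 'G', 'C')]
```

The single difference shows up at the fourth letter, where the G has been replaced by a C.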


This Glossary barely skims the list of words, techniques, scientific procedures, and other terms associated with DNA. As evidenced by almost daily news of the use of DNA miraculously extracted from stored evidence bags to solve decades-old rape and murder cases, DNA is Big News. In years past, methods used to extract DNA were so elementary they quickly destroyed the source item being tested. Thus, law enforcement hesitated to put the evidence at risk of loss.

Advancements in technology have resulted in a replication method that can utilize a minute sample and create exact copies, much as a photocopier can shoot out identical versions of a document. Now there is a way to, in reality, preserve that DNA evidence forever. The combined use of DNA matchups on testing sites and classic genealogical research with a touch of logical detective work has resulted in hundreds of rape and murder cases being solved. The same methods have also brought about identification of Jane and John Does - those unidentified victims whose bones have long resided in coroner's storage areas awaiting return to their grieving relatives for proper burial.

Most of my readers seek a better understanding of what their own DNA test results mean. They seek to understand the unfamiliar jargon that accompanies their results, thereby helping them to place their cousins in their tree with a clear path back in time to their common ancestor. Some, like your author, have tried for years to solve mysteries ... Who were my first husband's biological parents, my children's grandparents, and how far back can that line be proven?

It is my hope my readers will find this Glossary helpful and informative, whatever their needs might be. Be sure to explore the source websites for amplification and in-depth understanding of terms that excite or intrigue you. Another way to do Armchair Genealogy!


Copyright © 2002 AMEA Publications