For the researchers who spent more than a decade mapping the entirety of our species’ DNA for the Human Genome Project, the results of that $3-billion effort were surprising, to say the least.
Of the three billion letters that make up our genetic code, it turns out that only around 2% code for proteins. Widely considered the building blocks of life, proteins are essential to virtually every function of the human body, including transporting nutrients, providing energy, building and repairing tissues, carrying out chemical reactions, and regulating bodily processes.
Yet, despite having more than 200 types of cells, we have only about 20,000 unique protein-coding genes. That number is comparable to the gene counts of much simpler organisms, such as worms.
More than two decades after the mapping and sequencing effort was declared complete in April 2003, the other 98% of the human genome is still largely a mystery. However, although this so-called dark genome has no obvious purpose, it is not likely to be “junk DNA,” as some geneticists originally believed. Instead, most researchers agree that there must be an evolutionary reason for maintaining a genome of this size, even if the vast majority of our DNA does not code for proteins.
Though much more research is needed to understand the dark matter in our genome, it appears to play an important role in how genes are expressed in response to external factors such as diet, exercise, and sleep. According to molecular biologist Samir Ounzain, if the protein-coding genes are the body’s hardware, then the dark genome serves as the software that helps us adapt to information from our environment.
More about the dark genome:
- Nearly half of our genome (and the genomes of other mammals) is made up of repetitive DNA sequences called transposons. One popular theory for the existence of transposons is that they originally came from viruses that invaded our ancestors’ germline cells and were passed down to their descendants, becoming integrated into the genome with dramatic results. Transposons have the surprising ability to move from one location in the genome to another, causing or undoing mutations that can then be inherited and spread through a population.
- Another role of the dark genome is to produce non-coding RNA molecules that can impact protein production or gene expression. Researchers have theorized that environmental and lifestyle cues like smoking or lack of exercise can result in the production of RNA molecules linked to inflammation, cell death, and even tumor formation.
- In response to these findings, many biotech companies are working on developing drugs that disrupt these non-coding RNA molecules and their impact on gene expression. Because the activity of non-coding RNA is so specific, especially compared with that of the relatively few protein-coding genes, targeting these molecules could result in safer medicines with fewer side effects.