We think of deoxyribonucleic acid (DNA) as being housed within the nucleus of every cell, but in reality, our bloodstream is full of cell-free DNA. Our blood circulates a plethora of mysterious microscopic material, from molecules secreted by our own organs to cell-free DNA from bacteria and viruses present in the body. By detecting this cell-free DNA, cultivating it and analyzing its origins, scientists have identified many of the microbes that have colonized the human body, including thousands of species of bacteria living on our skin and in our guts.
And yet, much of this DNA has remained “dark”. Microbes are typically studied by collecting samples, culturing them in the lab and identifying the species of bacteria based on the characteristics of the colony. This strategy is limited, considering that about 99% of microorganisms have not yet been cultivated. Scientists have been able to overcome this problem through a novel technology called “single cell biotechnology” within the field of metagenomics, where DNA samples are collected directly from the environment (or bloodstream), copied and sequenced. These sequences are then compared with sequences of known microbes and mapped according to their similarities and differences, giving us an incredible insight into the diversity of microorganisms and unveiling the biological dark matter.
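The comparison step described above can be sketched in a few lines of Python. This is only a toy illustration, not the pipeline used in metagenomics research: it assumes that similarity is measured by the share of short overlapping "words" (k-mers) that two sequences have in common, and the function names and the two reference sequences are made up for the example.

```python
def kmers(sequence, k=4):
    """Return the set of all overlapping substrings of length k."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_similarity(seq_a, seq_b, k=4):
    """Shared k-mers divided by all distinct k-mers found in either sequence."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def closest_reference(read, references, k=4):
    """Return (name, score) of the reference most similar to the read."""
    scores = {name: jaccard_similarity(read, seq, k)
              for name, seq in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up sequences for illustration only; these are not real microbial genomes.
references = {
    "microbe_A": "ATGCGTACGTTAGCCGATAG",
    "microbe_B": "TTGACCGGTAACCGGTTAAC",
}
read = "ATGCGTACGTTAGCCGATAC"  # differs from microbe_A only at the last base

name, score = closest_reference(read, references)
print(name, round(score, 2))
```

A read that scores poorly against every known reference would be a candidate for the "dark" category: DNA from something that has not yet been catalogued.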
Dark Matter in Human Cells
DNA contains the blueprints for all of our bodily functions. For example, our cells create proteins based on descriptions from genes, which “code” for proteins. These coding regions in the DNA relay instructions to construct proteins that carry out many functions, including breaking down food in digestion, fighting infection in our immune system and even helping our neurons send signals. The gene is first read and transcribed into a rough draft called ribonucleic acid (RNA), which is then edited and translated into a protein. But only about 1.2% of human DNA is actually coding DNA!
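For readers who like to see the steps spelled out, the read, transcribe and translate sequence described above can be sketched in Python. This is a simplified sketch under stated assumptions: it starts from the coding strand of a made-up gene, skips the RNA editing step entirely, and uses the standard genetic code. The function names and the toy gene are illustrative, not taken from any particular library.

```python
# Standard genetic code, built from the conventional U, C, A, G ordering.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_dna):
    """Simplified transcription: copy the coding strand, swapping T for U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so no protein in this toy model
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_gene = "ATGGCCTGGTAA"  # made-up sequence: start, two codons, stop
mrna = transcribe(toy_gene)
print(mrna)             # AUGGCCUGGUAA
print(translate(mrna))  # MAW
```

Each letter in the output stands for one amino acid, the building blocks that are chained together to form a protein.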

Biological dark matter is what scientists have called segments of genetic material that we can’t quite explain – yet. (Source: Pexels.com)
Scientists once called the remaining noncoding DNA “junk DNA” or “dark matter DNA”, puzzled by these apparently functionless stretches of DNA. Some noncoding DNA has another astounding trait: certain sequences are almost identical in several animals, including humans, mice, rats, chickens and rhesus monkeys. This “ultraconservation” is highly unusual, as DNA sequences commonly differ slightly between species, and harmless minor mutations routinely occur between generations of a species too. It follows, then, that there is likely some evolutionary importance to these DNA sequences being identical. If there are no living organisms with mutations in these sequences, then these specific sequences are likely crucial for survival. But if this DNA does not code for proteins, what could its function be?
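To make the idea of ultraconservation concrete, here is a minimal Python sketch that scans sequences from several species and reports the stretches where every species carries exactly the same letters. It assumes the sequences have already been lined up position by position, and it uses short made-up sequences and a tiny length threshold; real analyses work on genome-scale alignments and far longer stretches.

```python
def conserved_runs(aligned_sequences, min_length=5):
    """Return (start, end) pairs, end exclusive, where all sequences are
    identical for at least min_length consecutive positions."""
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be pre-aligned to the same length")

    runs = []
    run_start = None
    # Iterate one position past the end so a run touching the end is closed.
    for pos in range(length + 1):
        identical = (pos < length
                     and len({seq[pos] for seq in aligned_sequences}) == 1)
        if identical and run_start is None:
            run_start = pos
        elif not identical and run_start is not None:
            if pos - run_start >= min_length:
                runs.append((run_start, pos))
            run_start = None
    return runs


# Made-up aligned sequences for illustration only.
species = {
    "human":   "ACGTTGCAAGGCTA",
    "mouse":   "ACTTTGCAAGGATA",
    "chicken": "AGGTTGCAAGGCTT",
}
runs = conserved_runs(list(species.values()))
print(runs)                                      # [(3, 11)]
print([species["human"][s:e] for s, e in runs])  # ['TTGCAAGG']
```

The shorter matches at the edges are ignored because they fall below the threshold; only the long, perfectly shared stretch in the middle is reported.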
In total, 481 DNA sequences are ultraconserved. A 2007 experiment found that when four random ultraconserved sequences were deleted in mice, no difference in viability or survival resulted. The mice born with deleted ultraconserved sequences exhibited normal external appearance, metabolism, growth and life cycle. Thus, the ultraconservation of certain sequences did not appear to be important for survival.
In 2018, researchers from Stanford University found that some of these ultraconserved DNA regions were located adjacent to genes involved in brain development. These ultraconserved stretches were therefore hypothesized to act as “enhancers”, meaning that they increase the likelihood that these important genes will be expressed. The researchers then chose the longest known ultraconserved enhancer, located on the X chromosome next to the Arx gene, which encodes a protein that drives neuronal development in embryos. When they deleted this enhancer through a genomic editing technique called CRISPR/Cas9, the mouse embryos again developed and appeared okay. However, once they dissected the mice and examined the brain tissue where Arx is normally active, they noted decreased expression of the Arx gene and significant abnormalities in parts of the forebrain, such as the hippocampus and the telencephalon. The neurons in these regions were severely disorganized and showed lower density and diversity, suggesting that the brain tissue had not developed properly.
So why was this not detected by earlier experiments, like the one in 2007? Simply because neurological deficits are harder to notice within an insulated lab environment. In the wild, however, abnormalities in these structures could mean the difference between survival and death. The hippocampus is involved in memory processing and formation, while the telencephalon has several complex patterns and connections of neurons that, if disrupted, can result in epilepsy. In fact, the resulting developmental abnormalities closely resembled those seen in Alzheimer’s and epilepsy.

Experiments on mice have gradually uncovered the true importance of ultraconserved DNA. (Source: Pexels.)
Arx is not the only gene where this phenomenon occurs. In 2013, a research group in Seattle observed a different ultraconserved enhancer that controlled the expression of a gene regulating the development of the entire neocortex. The neocortex is a brain structure that evolved relatively recently, in mammals. It is central to our higher cognitive functions, including memory, emotional processing, language comprehension, speech and a majority of voluntary movement. Its proper development depends on a series of interconnected, complex and regulated signals that control how neurons divide and organize during neurogenesis. Amazingly, this complex process was derailed by the deletion of a single enhancer that controlled a brain signaling gene.
Poison Exons
Of the 481 ultraconserved DNA segments, over 200 are enhancers, and damage to any of these segments can impair the development of body tissues in humans or other species. In addition to these enhancers, some ultraconserved sequences, called “poison exons”, behave in the opposite way. Rather than enhancing gene expression, poison exons repress and constrain a certain gene’s expression. Recall that the gene is first read and a rough draft of RNA is produced. This rough draft undergoes several stages of cutting and editing, called RNA splicing. RNA splicing is heavily controlled by other proteins, whose activity is in turn regulated by, you guessed it, poison exons! These poison exons strongly influence how the RNA is edited and may altogether prevent some proteins from being produced.
Why might it be beneficial for some proteins’ expression to be repressed? Some proteins simply do not need to be expressed at all times or in excess. In fact, overexpressing them can directly harm the organism. One research group based in Seattle deleted some of these ultraconserved poison exons in mice and noticed that this directly caused tumour formation. Because the deletion of the poison exon removed its normal inhibition of RNA splicing and protein expression, the splicing and expression of proteins controlling cell division went unchecked. As a result, the cells began dividing uncontrollably, and tumour formation halted only when researchers reintroduced the deleted sequence to the mice.
Similarly, another study found that ultraconserved regions also control the proper development of cardiac tissue, and any abnormal expression of ultraconserved DNA can lead to a pathological thickening of the heart tissue, a serious heart condition called cardiac hypertrophy. By now, it is hopefully clear how important these stretches of ultraconserved DNA really are to our survival. Biological dark matter within our DNA is actually a caretaker of our neurological development and a protector from cancer.
Future Directions and Gene Therapy
Now that we are unraveling the mystery of biological dark matter, where can we go from here? We have learned that ultraconserved DNA is a major genetic component of several disorders, whether they are neurological disorders, cancers or cardiac disorders. We can now use new and exciting genetic tools to examine these effects further. CRISPR/Cas9 is a novel genome-editing technique that we have borrowed from bacteria. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are stretches of bacterial DNA that store snippets of genetic material from past viral invaders. The CRISPR-associated enzyme (Cas9) uses RNA copies of these snippets as a guide to find matching viral DNA, which it cuts like a pair of scissors. Bacteria use this system to destroy the DNA of infecting viruses, almost like an immune system. Scientists can use this technique to insert or remove genes in animal models and cell lines.
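The "scissors" behaviour can be illustrated with a short Python sketch. It relies on some background that is not covered in this article: the commonly used Cas9 from Streptococcus pyogenes targets a 20-letter stretch of DNA that is immediately followed by a short signal of the form NGG (any letter, then two Gs), and it cuts three letters before that signal. The sketch searches only one strand, the function names are illustrative, and the target sequence is made up.

```python
import re


def find_cas9_sites(dna, guide_length=20):
    """Return a list of (target, signal, cut_index) on the given strand.

    Assumes the NGG signal rule of S. pyogenes Cas9; other Cas enzymes
    follow different rules.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_length)
    sites = []
    for match in pattern.finditer(dna):
        signal_start = match.start() + guide_length
        cut_index = signal_start - 3  # cut falls 3 letters before the signal
        sites.append((match.group(1), match.group(2), cut_index))
    return sites


def cut(dna, cut_index):
    """Split the DNA string into the two pieces left by the cut."""
    return dna[:cut_index], dna[cut_index:]


# Made-up sequence: two flanking letters, a 20-letter target, TGG, two more.
dna = "TTGACCTGAAGCTAGCTAGATCTGGAA"
sites = find_cas9_sites(dna)
print(sites)  # [('GACCTGAAGCTAGCTAGATC', 'TGG', 19)]

left, right = cut(dna, sites[0][2])
print(left, right)  # TTGACCTGAAGCTAGCTAG ATCTGGAA
```

In the lab, the 20-letter target is what researchers design their guide RNA to match, which is how they steer Cas9 toward one specific stretch of DNA such as an ultraconserved enhancer.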

Scientists can use a gene editing tool called CRISPR/Cas9 to insert or remove harmful ultraconserved DNA and potentially prevent or treat diseases through gene therapy. (Source: Pexels.com)
Using CRISPR/Cas9, we can survey the impact of removing multiple segments of ultraconserved DNA, including poison exons and enhancers, on the function of genes and cell processes. We can see how such genetic edits affect an organism’s wellbeing, including embryonic development, early-life development and potential cancer formation. CRISPR/Cas9 research results can also offer us insight into therapies where we insert or remove genetic segments that would otherwise have caused disease. This approach is called “gene therapy”: scientists could produce a drug that delivers the genetic change to a person’s cells using a viral vector.
With further research, who knows where biological dark matter will take us?