Nucleotide Composition of Ultra-Conserved Elements Shows Excess of GpC and Depletion of GG and CC Dinucleotides

Significance 

Ultra-conserved noncoding elements (UCNEs) are stretches of DNA, roughly 300 nucleotides long, that are highly conserved across mammals and birds but do not code for proteins. These elements may play important roles in regulating gene expression and are therefore potentially valuable in medicine. Several UCNEs have been associated with diseases, including cancer, neurological disorders, and developmental disorders, so analyzing UCNEs could help identify disease-causing mutations and support the development of diagnostic tests. UCNEs could also point to new drug targets: drugs that act on UCNEs might modulate gene expression to treat disease. A further possible application is gene therapy, which introduces a functional copy of a gene into a patient’s cells. The introduced gene must be expressed at the right time and in the right cells, and UCNEs could be used to provide this regulation. Beyond these applications, studying UCNEs could offer insights into the evolution of gene regulation and the development of complex organisms.

UCNEs are identified by nucleotide sequence identity between distantly related species and do not contain repetitive DNA elements, apart from small simple repeats. They are unique genomic sequences, or exist in only a few copies, and share no sequence similarity with other ultra-conserved elements; consequently, no sequence motif common to UCNEs has been described. Despite the numerous single nucleotide polymorphisms (SNPs) inside UCNEs, only a limited number of mutations have been associated with human disorders or biological conditions, and it is still unclear how so many mutations inside UCNEs have avoided fixation. UCNEs are too long to be simple protein-binding sites, and they do not fit the modern view of non-coding RNAs, which are conserved primarily in structure rather than in sequence. Hence, while the concept of “fierce purifying selection upon fixation” acting on UCNEs is logical, it remains an open question.

In a new study published in the peer-reviewed journal Genes, Dr. Larisa Fedorova, Oleh Mulyar, Jan Lim, and Professor Alexei Fedorov, from CRI Genetics LLC in Santa Monica and the University of Toledo, propose that UCNE sequences could form a DNA helix with unusual features because of their particular dinucleotide makeup. They hypothesize that these abnormalities in dinucleotide distribution may underlie both the biological roles of UCNEs, through DNA conformation, and the evolutionary conservation of these elements.
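To make the idea of an "abnormal" dinucleotide distribution concrete, the following is a minimal sketch of one common way to measure it: the observed frequency of each dinucleotide divided by the frequency expected from the single-nucleotide composition. This is an illustration only, not the authors' pipeline. The function name `dinucleotide_odds_ratios` and the toy sequence are assumptions made for this example, and the calculation looks at a single strand.

```python
from collections import Counter
from itertools import product


def dinucleotide_odds_ratios(seq):
    """Return observed/expected ratios for all 16 dinucleotides in `seq`.

    Hypothetical helper for illustration; not taken from the study.
    expected(XY) = freq(X) * freq(Y), computed from the same sequence.
    A ratio above 1 suggests excess; below 1 suggests depletion.
    """
    seq = seq.upper()
    bases = "ACGT"

    # Count single nucleotides, ignoring ambiguous symbols such as N.
    mono = Counter(ch for ch in seq if ch in bases)
    mono_total = sum(mono.values())

    # Count overlapping dinucleotides made only of A, C, G, T.
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in bases and seq[i + 1] in bases
    )
    di_total = sum(di.values())
    if mono_total == 0 or di_total == 0:
        raise ValueError("sequence needs at least two adjacent A/C/G/T bases")

    ratios = {}
    for a, b in product(bases, repeat=2):
        expected = (mono[a] / mono_total) * (mono[b] / mono_total)
        observed = di[a + b] / di_total
        # Undefined when one of the bases never occurs in the sequence.
        ratios[a + b] = observed / expected if expected > 0 else float("nan")
    return ratios


if __name__ == "__main__":
    # Toy sequence invented for this example; not a real UCNE.
    toy = "ATGCAGCTTGCATGCCAGCAATGCTAGCAGGCT"
    ratios = dinucleotide_odds_ratios(toy)
    for name in ("GC", "CG", "GG", "CC"):
        print(name, round(ratios[name], 2))
```

In double-stranded DNA, GG on one strand corresponds to CC on the other, so an analysis of real sequences would typically consider both strands together; this sketch omits that step for brevity.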

The three-dimensional shape of the DNA double helix is determined by two types of interactions between nucleotides: Watson-Crick base pairing between nucleotides on opposite strands, and pi-stacking interactions between adjacent nucleotides on the same strand. The GpC dinucleotide occurs more frequently than expected within UCNE sequences, leading the researchers to hypothesize that these sequences may have unique structural properties.

By analyzing mutations within UCNEs, the researchers found that each person carries more than 300 mutations across the full set of 4271 human UCNEs, and that every person is likely to be homozygous for several mutant UCNE sequences as well as heterozygous for at least 300 mutations inside UCNEs. Rare alleles are relatively overrepresented inside UCNEs, whereas common SNPs, those with an alternative allele frequency of 30-50%, are 3.2 times less frequent inside UCNEs than expected from the whole genome.

The researchers propose two conjectures to explain the paradoxical observation that numerous rare UCNE mutations do not become fixed. The first is based on the idea that the effectiveness of natural selection is directly related to the number of offspring per individual, and that mutations inside UCNEs may affect the fitness of gametes. Because males produce millions of sperm, competition among these sperm for survival may result in effective selection against a large number of mutations. The second conjecture is that natural selection may not be the primary force driving UCNE SNP dynamics, and that some unknown molecular process may be at play. Why so many ultra-conserved elements exist remains a mystery that requires further investigation.

In conclusion, the researchers analyzed the nucleotide composition of 4273 human ultra-conserved noncoding elements (UCNEs) that have been preserved for over 300 million years of evolution. They found that UCNEs are highly polymorphic, with over 300 mutations per individual, and that UCNEs show no association with non-coding RNAs or meiotic recombination rates. However, UCNEs have distinctive dinucleotide patterns, with an excess of GpC and a depletion of GG and CC dinucleotides. The authors hypothesize that these dinucleotide abnormalities may create unique 3D structures and protein-binding specificities for UCNEs, which may explain their extraordinary conservation. The findings may inform future efforts to use UCNEs to diagnose diseases, develop new drugs, and regulate gene expression in gene therapy, and they offer insights into the evolution of gene regulation.

Reference

Fedorova L, Mulyar OA, Lim J, Fedorov A. Nucleotide Composition of Ultra-Conserved Elements Shows Excess of GpC and Depletion of GG and CC Dinucleotides. Genes. 2022;13(11):2053.
