For two decades, scientists have had a nearly complete sequence of our genome: the Human Genome Project team succeeded in identifying nearly three billion nucleobases. However, technological limitations prevented the mapping of the entire genome: it was estimated at the time that 5 to 15% remained to be deciphered. In fact, it was 8%, the share that the T2T consortium has announced it has finally sequenced. “This background information will enhance the numerous ongoing efforts to understand all the functional nuances of the human genome, which in turn will advance genetic studies of human diseases,” Eric Green, director of the National Human Genome Research Institute (NHGRI), said in a statement.
Genes “essential” to many cellular functions
Most of the newly decoded genetic information is located near the telomeres and centromeres, which respectively mark the ends of a chromosome and the point of contact between its “arms” (chromatids). As a reminder, each human cell has 23 pairs of chromosomes in which DNA is “packaged”; DNA consists of a sequence of units called “nucleotides” or “nucleobases”, denoted by the letters A, C, G and T.
More precisely, the newly sequenced genetic material is located on the short arms of chromosomes 13, 14, 15, 21 and 22, which are called “acrocentric” (meaning that the centromere is located near one end), and on chromosomes 1, 9, 16 and Y. “More than half of the missing information concerned the short arms of the acrocentric chromosomes, which contain the ribosomal DNA genes necessary for the production of all our proteins,” explains Stylianos Antonarakis, professor of genetic medicine at the University of Geneva Medical School.
While this part of the genome was previously regarded as a kind of “junk DNA” with no precise function, it now appears that the genes concerned are in fact essential. The genes in this non-coding DNA do not make proteins, but they play critical roles in many cellular functions; in particular, they could be responsible for certain conditions in which cell division runs out of control, as in cancer.
The remaining 8%, made up of 151 million base pairs of DNA sequence scattered throughout the genome, was therefore far from trivial. “We are now gaining a whole new understanding of how cells divide, allowing us to study a number of diseases we had no access to before,” said Erich Jarvis, a researcher in neurogenetics at Rockefeller University and co-author of the study published in Science.
Knowledge that could pave the way for new treatments
Our DNA is packaged and compressed into a structure called chromatin, the main component of our chromosomes. There are two types of chromatin: euchromatin, which is less condensed and leaves genes more accessible, favoring their expression and the production of proteins; and heterochromatin, which is denser, so its DNA is more difficult to access.
Sequencing efforts so far had mainly focused on the euchromatic portion of the genome, which was much easier to sequence. The heterochromatic sequences, located at the centromeres, upon which cell division depends, were therefore marked with long runs of “N”, which stands for “unknown base”.
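To make the idea of “N” runs concrete, here is a minimal Python sketch that locates stretches of unknown bases in a DNA string and reports what fraction of the sequence they cover. The function names and the toy sequence are purely illustrative assumptions; they are not taken from the T2T consortium's tools, and a real reference genome would be read from a file rather than typed inline.

```python
import re


def find_unknown_runs(sequence):
    """Return (start, end, length) for each run of 'N' in a DNA string.

    Positions are 0-based and the end index is exclusive.
    """
    return [
        (m.start(), m.end(), m.end() - m.start())
        for m in re.finditer(r"N+", sequence.upper())
    ]


def unknown_fraction(sequence):
    """Return the share of positions marked 'N' (0.0 for an empty string)."""
    if not sequence:
        return 0.0
    return sequence.upper().count("N") / len(sequence)


if __name__ == "__main__":
    # Hypothetical toy sequence with two gaps, for illustration only.
    toy = "ACGTNNNNNNACGTTTGANNNNCG"
    for start, end, length in find_unknown_runs(toy):
        print(f"gap at [{start}, {end}) spanning {length} bases")
    print(f"unknown fraction: {unknown_fraction(toy):.2%}")
```

In a gapless assembly such as the one described in this article, a scan like this would be expected to find no runs at all.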
But thanks to the development of new technologies that provide longer sequencing reads, researchers have gradually been able to fill in the gaps, for both the human genome and the genomes of other species; the Vertebrate Genomes Project, for example, made it possible to create the first near-complete reference genomes of 25 animals. In particular, the T2T group was able to rely on Merfin, a new algorithm that can detect and correct errors in genetic sequencing.
The T2T consortium can now present a complete sequence of 3.055 billion base pairs of DNA, called T2T-CHM13. This reference not only includes gapless (telomere-to-telomere) assemblies of the 22 autosomes and chromosome X, but also corrects errors made in previous references. In total, “approximately 200 million base pairs of sequence contain 1,956 gene predictions, 99 of which are predicted to code for proteins,” the team reports in Science.
The researchers who contributed to this project hope that their work will advance research into diseases linked to the heterochromatic genome, particularly cancer. Cancer cells divide more intensively when heterochromatin genes in the centromere are overexpressed; a complete understanding of the centromere genome could therefore pave the way for new therapies.