Proteins are an essential component of life and have made headlines over the last year because of the coronavirus pandemic – you have probably heard or read about the famous spike protein, which binds the virus to the cells of the human body. Now, a new study from the University of Illinois Urbana-Champaign, USA, maps the evolutionary history of these amino acid-based molecules and the interrelationships of their domains over 3.8 billion years.
Gustavo Caetano-Anollés, responsible for the study of the molecular composition of proteins. Source: Fred Zwicky/Reproduction
The study was published on June 8 in Scientific Reports, a journal published by Nature Portfolio. Gustavo Caetano-Anollés, professor of bioinformatics in the Department of Crop Sciences at the University of Illinois and leader of the research, has studied the evolution of COVID-19 mutations since the early stages of the pandemic and built a timeline of them – a tiny fraction of what he and doctoral student Fayez Aziz have mapped in their research.
The importance of proteins
At the cellular level, proteins are responsible for almost everything. They are so fundamental that DNA – the genetic material that makes each of us unique – is essentially just a long string of blueprints for making them. This holds for animals, plants, fungi, bacteria and even viruses. Just as these groups of organisms evolve and change over time, so do proteins and their components.
“Knowing how and why domains combine in proteins during evolution can help scientists understand and design their activity for medical and bioengineering applications. These insights can guide disease management, such as how to make better vaccines from the spike protein of the COVID-19 virus,” said Caetano-Anollés in an interview with the Phys website.
Studying the Big Bang of Proteins
The researchers compiled millions of protein sequences encoded in hundreds of genomes from all taxonomic groups, from complex organisms to microbes. But they focused on structural domains, not entire proteins. “Most proteins are made up of more than one domain. They are compact structural units, or modules, that house specialized functions,” explained Caetano-Anollés, adding: “More importantly, they are the units of evolution.”
After sorting the proteins into domains to build evolutionary trees, they set to work creating a network. “We built a series of temporal networks that describe how domains accumulated and how proteins reorganized their domains during evolution,” explained Aziz.
This is the first time that the “domain organization” network has been studied as an evolutionary chronology. “Our research revealed that there is a vast evolving network that describes how domains combine in proteins,” said the doctoral candidate. Each link in the network represents a time when a particular domain was recruited into a protein – usually to perform a new function.
“This fact alone strongly suggests that domain recruitment is a powerful force in nature,” stated Fayez Aziz. The chronology also revealed which domains contributed to important protein functions. For example, it traces the origins of the domains responsible for environmental sensing and for secondary metabolites (by-products of an organism’s metabolism), such as the toxins that bacteria and plants use as defenses.
What was discovered by mapping the history of proteins?
The analysis showed that domains began to combine early in protein evolution, but there were also periods of explosive network growth. For example, there was a “Big Bang” of domain combinations 1.5 billion years ago, coinciding with the emergence of multicellular organisms and eukaryotes – organisms with membrane-bound nuclei, which include humans.
The idea of biological Big Bangs is not new: Caetano-Anollés’ team previously reported the massive and early origin of metabolism, and recently found it again when tracing the history of metabolic networks. Now, the historical record of this new Big Bang, which describes the evolutionary patchwork of proteins, will provide new tools for understanding their composition.
“This could help to identify, for example, why structural variations and genomic recombinations frequently occur in SARS-CoV-2,” said Caetano-Anollés. He added that protein mapping could help prevent pandemics by dissecting the mechanism by which viral diseases originate. The study may therefore also help mitigate disease by improving vaccine development when outbreaks such as COVID-19 occur.