Algorithm can predict how genes regulate specific cells

Scientists at the University of Illinois Chicago (UIC), USA, have developed software that can help identify gene regulators efficiently. The results were published June 30 in the open-access scientific journal Genome Research. The program uses a machine learning algorithm to predict which transcription factors are most likely to be active in individual cells.

Schematic overview of the BITFAM machine learning system. Source: Genome Research/International CC 4.0 license

How genes regulate cells

Transcription factors are proteins that bind to DNA and control which genes are turned on or off within a cell. Understanding and manipulating these signals in the cell can be an effective way to discover new treatments for certain diseases. The difficulty is that human cells contain hundreds of transcription factors, and it can take years of trial and error to identify which are the most active in a given tissue.

Once researchers find out which transcription factors are expressed, or “on”, they can use them as drug targets, for example to turn off disease-causing gene expression. “One of the challenges in the field is that the same genes can be turned on in one group of cells, but turned off in a different group within the same organ,” Jalees Rehman, UIC professor in the departments of Medicine and of Pharmacology and Regenerative Medicine, told the Phys website.

“Being able to understand the activity of transcription factors in individual cells would allow researchers to study activity profiles in all major cell types in major organs such as the heart, brain or lungs,” said Rehman.

Machine learning in genetics

Dubbed BITFAM, short for “Bayesian Inference Transcription Factor Activity Model”, the system combines new gene expression profile data collected from single-cell RNA sequencing with existing biological data. With this information, the system runs numerous simulations until it finds the best fit and predicts the activity of each transcription factor in the cell.
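The idea of fitting expression data under the constraint of prior biological knowledge can be illustrated with a rough sketch. The code below is not BITFAM's implementation: it swaps Bayesian inference for plain least-squares gradient descent, and it assumes the prior knowledge takes the form of a binary table of known transcription factor targets. The function name `fit_masked_factorization` and the inputs `expr` and `mask` are hypothetical names chosen for this illustration.

```python
import numpy as np

def fit_masked_factorization(expr, mask, n_iter=500, lr=0.01, seed=0):
    """Approximate expr (cells x genes) as activities @ weights.

    mask (TFs x genes) is 1 where a gene is assumed to be a known target
    of a transcription factor and 0 elsewhere. Weights outside the mask
    are held at zero, so each factor can only explain its own targets.
    This is ordinary gradient descent, NOT Bayesian inference.
    """
    rng = np.random.default_rng(seed)
    n_cells = expr.shape[0]
    n_tfs = mask.shape[0]
    activities = rng.random((n_cells, n_tfs))       # per-cell TF activity
    weights = rng.random(mask.shape) * mask         # TF-to-gene weights
    for _ in range(n_iter):
        residual = activities @ weights - expr      # reconstruction error
        grad_a = residual @ weights.T
        grad_w = (activities.T @ residual) * mask   # keep non-targets at 0
        activities -= lr * grad_a
        weights -= lr * grad_w
    return activities, weights

# Toy example: 3 hypothetical transcription factors, 6 genes, 4 cells.
mask = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 1, 1, 1, 0],
                 [0, 0, 0, 0, 1, 1]], dtype=float)
expr = np.random.default_rng(42).random((4, 6))
activities, weights = fit_masked_factorization(expr, mask)
print(activities.shape)  # one activity value per cell and factor
```

Each row of `activities` then plays the role of a per-cell activity profile. A Bayesian model would additionally report uncertainty around those values, which this sketch does not attempt.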

The university’s researchers, led by Rehman and Yang Dai, a UIC professor in the Department of Bioengineering, tested the system on cells from lung, heart and brain tissue. “Our approach not only identifies significant transcription factor activities, but also provides valuable insight into the underlying transcription factor regulatory mechanisms,” said Shang Gao, the study’s first author and a doctoral student in the Department of Bioengineering.

“For example, if 80% of the targets for a specific transcription factor are activated inside the cell, that tells us that its activity is high. By providing data like this for each transcription factor in the cell, the model can give researchers a good idea of which ones to look at first when exploring new drug targets to work in this type of cell,” described Gao.
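Gao's 80% example can be turned into a small calculation. The sketch below is a simplification for illustration only, not the model's actual inference: it assumes a list of genes detected as expressed in one cell and a lookup table of known targets per transcription factor. The names `TF_A`, `TF_B`, the gene labels and both functions are hypothetical.

```python
def target_activation_fraction(expressed_genes, tf_targets):
    """For each transcription factor, return the fraction of its known
    targets that appear among the genes expressed in the cell."""
    expressed = set(expressed_genes)
    fractions = {}
    for tf, targets in tf_targets.items():
        targets = set(targets)
        if not targets:
            continue  # skip factors with no known targets
        fractions[tf] = len(targets & expressed) / len(targets)
    return fractions

def rank_transcription_factors(fractions):
    """Order transcription factors from highest to lowest fraction."""
    return sorted(fractions, key=fractions.get, reverse=True)

# Hypothetical data: TF_A has 5 known targets, 4 of them expressed.
tf_targets = {
    "TF_A": ["g1", "g2", "g3", "g4", "g5"],
    "TF_B": ["g6", "g7", "g8", "g9"],
}
expressed_genes = ["g1", "g2", "g3", "g4", "g6"]

fractions = target_activation_fraction(expressed_genes, tf_targets)
print(fractions)                              # {'TF_A': 0.8, 'TF_B': 0.25}
print(rank_transcription_factors(fractions))  # ['TF_A', 'TF_B']
```

Here `TF_A` reaches 80% and would be the first candidate to examine, in the spirit of the example Gao describes.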

System checks percentage of transcription factors activated in cells. Source: Freepik

According to the authors, the new software is publicly available and can be applied widely, because users have the flexibility to combine it with additional analytical methods suitable for their studies, such as finding new targets for certain drugs.

Applications of the new algorithm

According to Dai, the new approach can be used to develop important biological hypotheses about regulatory transcription factors in cells across a wide range of scientific topics. “This will allow us to obtain insights about the biological functions of cells in many tissues,” Dai said.

Rehman, whose research focuses on the mechanisms of inflammation in vascular systems, believes that a relevant application for his laboratory is using the new software to focus on transcription factors that drive disease in specific cell types. “For example, we would like to understand if there is transcription factor activity that distinguishes a healthy immune cell response from an unhealthy one, as in the case of conditions like COVID-19,” explained the professor.
