A Look at How AI Can Potentially Help Cure Diseases and Teach Us How Cells Work
A new AI-powered virtual cell modeling system pioneered by the Chan Zuckerberg Initiative (CZI) could teach us how cells work and potentially help cure diseases. It is being hailed as a major breakthrough in medical science's understanding of disease.
Cells, the smallest living units, are key to understanding how diseases work. However, much about them remains unknown. For example, it is still unclear how lipids, proteins, DNA, and billions of other biomolecules assemble to work as a single cell.
We also don't yet understand how the many different cell types interact within our bodies. Similarly, very little is known about what causes organs, tissues, and cells to become diseased and what processes are needed for them to become healthy again.
Thanks to ground-breaking new tools powered by artificial intelligence (AI), researchers who use these tools well may soon answer some of these important questions. If they can harness these cutting-edge technologies, the health and well-being of people across the globe could improve significantly.
Imagine an AI-powered system that represents every cell type and state. A virtual cell could mimic any cell inside the human body in appearance and known characteristics, from the cardiomyocytes that keep our hearts beating to the cones and rods that detect light in our retinas.
Scientists could use such a tool to predict how these cells might respond to certain stimuli in specific conditions: for example, how someone's body may react to a new medicine, what occurs at a cellular level when a child is born with a rare disease, or how an immune cell may respond to an infection.
Important areas like patient treatment decisions, diagnosis, and scientific discovery could become safer, faster, and more efficient than ever. The Chan Zuckerberg Initiative is a pioneering organization helping to develop such a computing infrastructure and to generate the scientific data it requires. The aim is to arm scientists with the tools needed to exploit new advances in artificial intelligence, which could potentially help put an end to diseases.
Scientific data
The structure of almost every known protein has now been predicted, thanks to scientific data gathered over the past five decades combined with advances in artificial intelligence. DeepMind trained AlphaFold on extensive volumes of carefully collected data from the past half-century, which allowed it to crack the problem of protein structure prediction in as little as five years.
Meta developed another state-of-the-art AI-powered tool called ESM. This protein language model was trained on over 60 million protein sequences rather than on words, and it is currently being used for numerous applications, such as predicting protein structures and predicting the effects of mutations from single sequences.
A virtual cell modeling system will need even more data. For over five years, the Chan Zuckerberg Initiative has supported researchers in generating and interpreting data about cells and their components, building the tools required to integrate that data, and making both more accessible for experts in the field to study and build upon. It will continue to do so going forward.
A global group of researchers has been building a reference map of all the cell types in the body, and CZI's Biohub in San Francisco is creating whole-organism cell atlases. Their combined research and tireless efforts have led to a draft of the Human Cell Atlas (HCA), which catalogs and maps human cell types from development to adulthood. CZI and the Biohub in San Francisco are also collaborating on OpenCell, which charts the locations of different proteins in human cells.
Researchers can also now explore huge volumes of data about our body's cells and genes using machine learning (ML) tools like scGPT and Geneformer, which draw on data from CELLxGENE, an open-source software platform. CELLxGENE was developed by science and technology teams at CZI to speed up single-cell research.
Similarly, CZI's science and technology teams and its Imaging Institute are working closely with ML experts to pioneer automated annotation of microscopy data. With this tool, data processing should take just weeks instead of months or years.
The gathered data will be made widely available to ensure that any scientific breakthroughs benefit everyone. Pediatric data will also be incorporated into the Human Cell Atlas (HCA) to further our understanding of the cellular mechanisms behind diseases that develop in childhood.
The Ancestry Networks grants will also support researchers who gather data about cells from people of diverse and less studied ancestral, ethnic, and racial backgrounds, including Latino, Southeast Asian, Black, and Indigenous people.
Researchers have already made some interesting findings using these carefully gathered data sets. One discovery identified the respiratory cells that are most vulnerable to SARS-CoV-2, and another found a previously unknown type of cell linked to the gene that is broken in cystic fibrosis.
Others have used the data to find new ways to splice genes, which could potentially modify disease-causing mutations in specific human cells. These findings are just the beginning of developing treatments for diseases. Artificial intelligence is also expected to considerably accelerate the rate of discovery in this field in the not-too-distant future.
Computing
A powerful computing cluster with 1000+ H100 GPUs is currently being built to support the creation of a virtual cell. It will give researchers the ability to develop cutting-edge new artificial intelligence models, trained on vast amounts of data about biomolecules and cells, including data generated by CZI's scientific institutes.
Hopefully, scientists will soon be able to simulate every human cell in both diseased and healthy states and probe those models to discover how poorly understood biological phenomena arise. These include how cells form during the early stages of development, how they interact within our bodies, and how disease-causing changes might affect them.
This computing cluster will not be anywhere near as big as those currently used for commercial products in the private sector. However, once fully operational, it will be one of the world's foremost artificial intelligence clusters for nonprofit scientific research.
It will be a fundamental tool for academic teams who are searching for ways to use these data sets but are limited by accessibility and cost. These digital cell models, along with the applications and data that come with them, will be freely and readily accessible to researchers anywhere in the world.
Individuals involved
Developing such an advanced computing cluster, gathering huge volumes of data, and implementing pioneering AI-powered software and systems for biology draws on several disciplines. It is a collective endeavor, and collaboration defines CZI's work.
Experts and researchers from various institutions and disciplines have collaborated to tackle some of the riskiest challenges, ones that could never have been addressed in traditional academic settings. In projects such as CELLxGENE, researchers around the world have built a single-cell corpus.
It goes to show how successful collaborations and shared resources can help take this field to exciting new heights. This work can teach us how cells work, which in turn can help treat and potentially cure diseases. Since 2016, the Chan Zuckerberg Initiative has remained committed to its mission: helping the scientific community manage, prevent, or cure all known diseases by the end of this century.
CZI believes this objective is attainable and that progress will go even further if the people involved (e.g., technologists, scientists, and researchers) continue on their collaborative path while using emerging AI-powered tools to their advantage. With the mysteries of human cells solved, their work could one day put an end to many diseases.