More stories

  • Exploring the cellular neighborhood

    Cells rely on complex molecular machines composed of protein assemblies to perform essential functions such as energy production, gene expression, and protein synthesis. To better understand how these machines work, scientists capture snapshots of them by isolating proteins from cells and using various methods to determine their structures. However, isolating proteins from cells also removes them from the context of their native environment, including protein interaction partners and cellular location.

    Recently, cryogenic electron tomography (cryo-ET) has emerged as a way to observe proteins in their native environment by imaging frozen cells at different angles to obtain three-dimensional structural information. This approach is exciting because it allows researchers to directly observe how and where proteins associate with each other, revealing the cellular neighborhood of those interactions within the cell.

    With the technology available to image proteins in their native environment, MIT graduate student Barrett Powell wondered if he could take it one step further: What if molecular machines could be observed in action? In a paper published March 8 in Nature Methods, Powell describes the method he developed, called tomoDRGN, for modeling structural differences of proteins in cryo-ET data that arise from protein motions or proteins binding to different interaction partners. These variations are known as structural heterogeneity. 

    Although Powell had joined the lab of MIT associate professor of biology Joey Davis as an experimental scientist, he recognized the potential impact of computational approaches in understanding structural heterogeneity within a cell. Previously, the Davis Lab developed a related methodology named cryoDRGN to understand structural heterogeneity in purified samples. As Powell and Davis saw cryo-ET rising in prominence in the field, Powell took on the challenge of re-imagining this framework to work in cells.

    When solving structures with purified samples, each particle is imaged only once. By contrast, cryo-ET data is collected by imaging each particle more than 40 times from different angles. That meant tomoDRGN needed to be able to merge the information from more than 40 images, which was where the project hit a roadblock: the amount of data led to an information overload.

    To address this, Powell successfully rebuilt the cryoDRGN model to prioritize only the highest-quality data. When imaging the same particle multiple times, radiation damage occurs. The images acquired earlier, therefore, tend to be of higher quality because the particles are less damaged.

    “By excluding some of the lower-quality data, the results were actually better than using all of the data — and the computational performance was substantially faster,” Powell says.
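    The filtering idea described above can be sketched in a few lines. This is an illustrative toy, not tomoDRGN's actual code: the function name, data layout, and cutoff are invented. The principle is simply that tilt images acquired earlier in a tilt series have received less cumulative electron dose, so keeping only the earliest images retains the highest-quality data.

```python
# Toy sketch of dose-based tilt filtering. All names and the cutoff are
# illustrative assumptions, not tomoDRGN's API.

def filter_tilts(particle_tilts, max_tilts=8):
    """particle_tilts: list of (acquisition_index, image) pairs for one particle.
    Returns the max_tilts images acquired earliest, i.e. at lowest cumulative
    radiation dose, which tend to be the least damaged."""
    ordered = sorted(particle_tilts, key=lambda t: t[0])
    return [img for _, img in ordered[:max_tilts]]

# Tilt images arrive unordered; keep only the two earliest acquisitions.
tilts = [(3, "img3"), (0, "img0"), (7, "img7"), (1, "img1")]
kept = filter_tilts(tilts, max_tilts=2)
```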

    Just as Powell was beginning work on testing his model, he had a stroke of luck: The authors of a groundbreaking new study that visualized, for the first time, ribosomes inside cells at near-atomic resolution, shared their raw data on the Electron Microscopy Public Image Archive (EMPIAR). This dataset was an exemplary test case for Powell, through which he demonstrated that tomoDRGN could uncover structural heterogeneity within cryo-ET data.

    According to Powell, one exciting result is what tomoDRGN found surrounding a subset of ribosomes in the EMPIAR dataset. Some of the ribosomal particles were associated with a bacterial cell membrane and engaged in a process called cotranslational translocation. This occurs when a protein is being simultaneously synthesized and transported across a membrane. Researchers can use this result to make new hypotheses about how the ribosome functions with other protein machinery integral to transporting proteins outside of the cell, now guided by a structure of the complex in its native environment. 

    After seeing that tomoDRGN could resolve structural heterogeneity from a structurally diverse dataset, Powell was curious: How small of a population could tomoDRGN identify? For that test, he chose a protein named apoferritin, which is a commonly used benchmark for cryo-ET and is often treated as structurally homogeneous. Ferritin is a protein used for iron storage and is referred to as apoferritin when it lacks iron.

    Surprisingly, in addition to the expected particles, tomoDRGN revealed a minor population of iron-bound ferritin particles, making up just 2 percent of the dataset, that had not been previously reported. This result further demonstrated tomoDRGN’s ability to identify structural states that occur so infrequently that they would be averaged out of a 3D reconstruction.

    Powell and other members of the Davis Lab are excited to see how tomoDRGN can be applied to further ribosomal studies and to other systems. Davis works on understanding how cells assemble, regulate, and degrade molecular machines, so the next steps include exploring ribosome biogenesis within cells in greater detail using this new tool.

    “What are the possible states that we may be losing during purification?” Davis asks. “Perhaps more excitingly, we can look at how they localize within the cell and what partners and protein complexes they may be interacting with.”

  • A more effective experimental design for engineering a cell into a new state

    A strategy for cellular reprogramming involves using targeted genetic interventions to engineer a cell into a new state. The technique holds great promise in immunotherapy, for instance, where researchers could reprogram a patient’s T-cells so they are more potent cancer killers. Someday, the approach could also help identify life-saving cancer treatments or regenerative therapies that repair disease-ravaged organs.

    But the human body has about 20,000 genes, and a genetic perturbation could be on a combination of genes or on any of the over 1,000 transcription factors that regulate the genes. Because the search space is vast and genetic experiments are costly, scientists often struggle to find the ideal perturbation for their particular application.   

    Researchers from MIT and Harvard University developed a new computational approach that can efficiently identify optimal genetic perturbations based on a much smaller number of experiments than traditional methods.

    Their algorithmic technique leverages the cause-and-effect relationship between factors in a complex system, such as genome regulation, to prioritize the best intervention in each round of sequential experiments.

    The researchers conducted a rigorous theoretical analysis to determine that their technique did, indeed, identify optimal interventions. With that theoretical framework in place, they applied the algorithms to real biological data designed to mimic a cellular reprogramming experiment. Their algorithms proved the most efficient and effective of the methods tested.

    “Too often, large-scale experiments are designed empirically. A careful causal framework for sequential experimentation may allow identifying optimal interventions with fewer trials, thereby reducing experimental costs,” says co-senior author Caroline Uhler, a professor in the Department of Electrical Engineering and Computer Science (EECS) who is also co-director of the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard, and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS) and Institute for Data, Systems and Society (IDSS).

    Joining Uhler on the paper, which appears today in Nature Machine Intelligence, are lead author Jiaqi Zhang, a graduate student and Eric and Wendy Schmidt Center Fellow; co-senior author Themistoklis P. Sapsis, professor of mechanical and ocean engineering at MIT and a member of IDSS; and others at Harvard and MIT.

    Active learning

    When scientists try to design an effective intervention for a complex system, like in cellular reprogramming, they often perform experiments sequentially. Such settings are ideally suited for the use of a machine-learning approach called active learning. Data samples are collected and used to learn a model of the system that incorporates the knowledge gathered so far. From this model, an acquisition function is designed — an equation that evaluates all potential interventions and picks the best one to test in the next trial.

    This process is repeated until an optimal intervention is identified (or resources to fund subsequent experiments run out).
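    The loop described above can be sketched with toy stand-ins. Everything here is an illustrative assumption, not the authors' code: the "model" just records past observations, the acquisition function naively picks the first untried candidate (the paper's function instead scores candidates using a causal model), and the experiment is a cheap function with a hidden optimum.

```python
# Minimal active-learning loop with invented placeholder components.

def fit_model(history):
    # Toy "model": remember every (intervention, outcome) pair seen so far.
    return dict(history)

def acquisition(model, candidates):
    # Toy acquisition function: propose the first candidate not yet tried.
    untried = [c for c in candidates if c not in model]
    return untried[0] if untried else None

def run_experiment(intervention):
    # Stand-in for a costly wet-lab experiment; hidden optimum at 7.
    return -abs(intervention - 7)

candidates = list(range(10))
history = []
for _ in range(5):  # fixed experimental budget
    model = fit_model(history)
    choice = acquisition(model, candidates)
    if choice is None:
        break  # nothing left to try
    history.append((choice, run_experiment(choice)))

best_intervention, best_outcome = max(history, key=lambda h: h[1])
```

    With this naive acquisition function the loop exhausts its budget without reaching the optimum; the point of a better-designed acquisition function is to converge in far fewer rounds.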

    “While there are several generic acquisition functions to sequentially design experiments, these are not effective for problems of such complexity, leading to very slow convergence,” Sapsis explains.

    Acquisition functions typically consider correlation between factors, such as which genes are co-expressed. But focusing only on correlation ignores the regulatory relationships or causal structure of the system. For instance, a genetic intervention can only affect the expression of downstream genes, but a correlation-based approach would not be able to distinguish between genes that are upstream or downstream.

    “You can learn some of this causal knowledge from the data and use that to design an intervention more efficiently,” Zhang explains.

    The MIT and Harvard researchers leveraged this underlying causal structure for their technique. First, they carefully constructed an algorithm so it can only learn models of the system that account for causal relationships.

    Then the researchers designed the acquisition function so it automatically evaluates interventions using information on these causal relationships. They crafted this function so it prioritizes the most informative interventions, meaning those most likely to lead to the optimal intervention in subsequent experiments.

    “By considering causal models instead of correlation-based models, we can already rule out certain interventions. Then, whenever you get new data, you can learn a more accurate causal model and thereby further shrink the space of interventions,” Uhler explains.

    This smaller search space, coupled with the acquisition function’s special focus on the most informative interventions, is what makes their approach so efficient.

    The researchers further improved their acquisition function using a technique known as output weighting, inspired by the study of extreme events in complex systems. This method carefully emphasizes interventions that are likely to be closer to the optimal intervention.

    “Essentially, we view an optimal intervention as an ‘extreme event’ within the space of all possible, suboptimal interventions and use some of the ideas we have developed for these problems,” Sapsis says.    
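    One way to picture output weighting is as a reweighting of the acquisition scores. The exponential weighting form below and all numbers are illustrative assumptions, not the paper's formula; the idea shown is only that candidates whose predicted outcome lies near the current best prediction get amplified, treating the optimum as an "extreme event."

```python
import math

# Toy sketch of output-weighted acquisition scoring (illustrative only).

def output_weighted_scores(predicted_means, predicted_stds, temperature=1.0):
    best = max(predicted_means)
    # Weight decays as the predicted outcome falls away from the current best.
    weights = [math.exp((m - best) / temperature) for m in predicted_means]
    # Informativeness proxy: predictive uncertainty, emphasized near the optimum.
    return [w * s for w, s in zip(weights, predicted_stds)]

means = [0.2, 0.9, 0.5]  # model's predicted outcome per candidate intervention
stds = [0.3, 0.3, 0.3]   # model's uncertainty about each prediction
scores = output_weighted_scores(means, stds)
```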

    Enhanced efficiency

    They tested their algorithms using real biological data in a simulated cellular reprogramming experiment. For this test, they sought a genetic perturbation that would result in a desired shift in average gene expression. Their acquisition functions consistently identified better interventions than baseline methods through every step in the multi-stage experiment.

    “If you cut the experiment off at any stage, ours would still be more efficient than the baselines. This means you could run fewer experiments and get the same or better results,” Zhang says.

    The researchers are currently working with experimentalists to apply their technique toward cellular reprogramming in the lab.

    Their approach could also be applied to problems outside genomics, such as identifying optimal prices for consumer products or enabling optimal feedback control in fluid mechanics applications.

    In the future, they plan to enhance their technique for optimizations beyond those that seek to match a desired mean. In addition, their method assumes that scientists already understand the causal relationships in their system, but future work could explore how to use AI to learn that information, as well.

    This work was funded, in part, by the Office of Naval Research, the MIT-IBM Watson AI Lab, the MIT J-Clinic for Machine Learning and Health, the Eric and Wendy Schmidt Center at the Broad Institute, a Simons Investigator Award, the Air Force Office of Scientific Research, and a National Science Foundation Graduate Fellowship.

  • Making sense of cell fate

    Despite the proliferation of novel therapies such as immunotherapy and targeted therapies, radiation and chemotherapy remain the frontline treatment for cancer patients. About half of all patients still receive radiation, and 60 to 80 percent receive chemotherapy.

    Both radiation and chemotherapy work by damaging DNA, taking advantage of a vulnerability specific to cancer cells. Healthy cells are more likely to survive radiation and chemotherapy since their mechanisms for identifying and repairing DNA damage are intact. In cancer cells, these repair mechanisms are compromised by mutations. When cancer cells cannot adequately respond to the DNA damage caused by radiation and chemotherapy, ideally, they undergo apoptosis or die by other means.

    However, there is another fate for cells after DNA damage: senescence — a state where cells survive, but stop dividing. Senescent cells’ DNA has not been damaged enough to induce apoptosis but is too damaged to support cell division. While senescent cancer cells themselves are unable to proliferate and spread, they are bad actors in the fight against cancer because they seem to enable other cancer cells to develop more aggressively. Although a cancer cell’s fate is not apparent until a few days after treatment, the decision to survive, die, or enter senescence is made much earlier. But, precisely when and how that decision is made has not been well understood.

    In an open-access study of ovarian and osteosarcoma cancer cells appearing July 19 in Cell Systems, MIT researchers show that cell signaling proteins commonly associated with cell proliferation and apoptosis instead commit cancer cells to senescence within 12 hours of treatment with low doses of certain kinds of chemotherapy.

    “When it comes to treating cancer, this study underscores that it’s important not to think too linearly about cell signaling,” says Michael Yaffe, who is a David H. Koch Professor of Science at MIT, the director of the MIT Center for Precision Cancer Medicine, a member of MIT’s Koch Institute for Integrative Cancer Research, and the senior author of the study. “If you assume that a particular treatment will always affect cancer cell signaling in the same way — you may be setting yourself up for many surprises, and treating cancers with the wrong combination of drugs.”

    Using a combination of experiments with cancer cells and computational modeling, the team investigated the cell signaling mechanisms that prompt cancer cells to enter senescence after treatment with a commonly used anti-cancer agent. Their efforts singled out two protein kinases and a component of the AP-1 transcription factor complex as highly associated with the induction of senescence after DNA damage, despite the well-established roles for all of these molecules in promoting cell proliferation in cancer.

    The researchers treated cancer cells with low and high doses of doxorubicin, a chemotherapy that interferes with the function of topoisomerase II, an enzyme that breaks and then repairs DNA strands during replication to fix tangles and other topological problems.

    By measuring the effects of DNA damage on single cells at several time points ranging from six hours to four days after the initial exposure, the team created two datasets. In one dataset, the researchers tracked cell fate over time. For the second set, researchers measured relative cell signaling activity levels across a variety of proteins associated with responses to DNA damage or cellular stress, determination of cell fate, and progress through cell growth and division.

    The two datasets were used to build a computational model that identifies correlations between time, dosage, signal, and cell fate. The model identified the activities of the MAP kinases Erk and JNK, along with the transcription factor c-Jun, a key component of the AP-1 complex, as likewise involved in the induction of senescence. The researchers then validated these computational findings by showing that inhibition of JNK and Erk after DNA damage successfully prevented cells from entering senescence.
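    At its simplest, identifying which signals track with a fate outcome is a correlation-ranking exercise. The sketch below is a toy version under invented data, not the study's model: it ranks hypothetical signaling measurements by their Pearson correlation with a binary senescence call across a handful of cells.

```python
# Toy sketch: rank signaling activities by correlation with cell fate.
# Data values and the signal names' activity profiles are invented.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

signals = {
    "Erk": [0.9, 0.8, 0.7, 0.2, 0.1],  # per-cell activity measurements
    "JNK": [0.8, 0.9, 0.6, 0.3, 0.2],
    "p38": [0.5, 0.4, 0.6, 0.5, 0.4],
}
fate = [1, 1, 1, 0, 0]  # 1 = cell entered senescence

ranked = sorted(signals, key=lambda k: pearson(signals[k], fate), reverse=True)
```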

    The researchers leveraged JNK and Erk inhibition to pinpoint exactly when cells made the decision to enter senescence. Surprisingly, they found that the decision to enter senescence was made within 12 hours of DNA damage, even though it took days to actually see the senescent cells accumulate. The team also found that with the passage of more time, these MAP kinases took on a different function: promoting the secretion of proinflammatory proteins called cytokines that are responsible for making other cancer cells proliferate and develop resistance to chemotherapy.

    “Proteins like cytokines encourage ‘bad behavior’ in neighboring tumor cells that lead to more aggressive cancer progression,” says Tatiana Netterfield, a graduate student in the Yaffe lab and the lead author of the study. “Because of this, it is thought that senescent cells that stay near the tumor for long periods of time are detrimental to treating cancer.”

    This study’s findings apply to cancer cells treated with a commonly used type of chemotherapy that stalls DNA replication after repair. But more broadly, the study emphasizes that “when treating cancer, it’s extremely important to understand the molecular characteristics of cancer cells and the contextual factors such as time and dosing that determine cell fate,” explains Netterfield.

    The study, however, has more immediate implications for treatments that are already in use. One class of Erk inhibitors, MEK inhibitors, are used in the clinic with the expectation that they will curb cancer growth.

    “We must be cautious about administering MEK inhibitors together with chemotherapies,” says Yaffe. “The combination may have the unintended effect of driving cells into proliferation, rather than senescence.”

    In future work, the team will perform studies to understand how and why individual cells choose to proliferate instead of enter senescence. Additionally, the team is employing next-generation sequencing to understand which genes c-Jun is regulating in order to push cells toward senescence.

    This study was funded, in part, by the Charles and Marjorie Holloway Foundation and the MIT Center for Precision Cancer Medicine.

  • New leadership at MIT’s Center for Biomedical Innovation

    As it continues in its mission to improve global health through the development and implementation of biomedical innovation, the MIT Center for Biomedical Innovation (CBI) today announced changes to its leadership team: Stacy Springs has been named executive director, and Professor Richard Braatz has joined as the center’s new associate faculty director.

    The change in leadership comes at a time of rapid development in new therapeutic modalities, growing concern over global access to biologic medicines and healthy food, and widespread interest in applying computational tools and multi-disciplinary approaches to address long-standing biomedical challenges.

    “This marks an exciting new chapter for the CBI,” says faculty director Anthony J. Sinskey, professor of biology, who cofounded CBI in 2005. “As I look back at almost 20 years of CBI history, I see an exponential growth in our activities, educational offerings, and impact.”

    The center’s collaborative research model accelerates innovation in biotechnology and biomedical research, drawing on the expertise of faculty and researchers in MIT’s schools of Engineering and Science, the MIT Schwarzman College of Computing, and the MIT Sloan School of Management.

    Springs steps into the role of executive director having previously served as senior director of programs for CBI and as executive director of CBI’s Biomanufacturing Program and its Consortium on Adventitious Agent Contamination in Biomanufacturing (CAACB). She succeeds Gigi Hirsch, who founded the NEW Drug Development ParadIGmS (NEWDIGS) Initiative at CBI in 2009. Hirsch and NEWDIGS have now moved to Tufts Medical Center, establishing a headquarters at the new Center for Biomedical System Design within the Institute for Clinical Research and Health Policy Studies there.

    Braatz, a chemical engineer whose work is informed by mathematical modeling and computational techniques, conducts research in process data analytics, design, and control of advanced manufacturing systems.

    “It’s been great to interact with faculty from across the Institute who have complementary expertise,” says Braatz, the Edwin R. Gilliland Professor in the Department of Chemical Engineering. “Participating in CBI’s workshops has led to fruitful partnerships with companies in tackling industry-wide challenges.”

    CBI is housed under the Institute for Data, Systems and Society and, specifically, the Sociotechnical Systems Research Center in the MIT Schwarzman College of Computing. CBI is home to two biomanufacturing consortia: the CAACB and the Biomanufacturing Consortium (BioMAN). Through these precompetitive collaborations, CBI researchers work with biomanufacturers and regulators to advance shared interests in biomanufacturing.

    In addition, CBI researchers are engaged in several sponsored research programs focused on integrated continuous biomanufacturing capabilities for monoclonal antibodies and vaccines, analytical technologies to measure quality and safety attributes of a variety of biologics, including gene and cell therapies, and rapid-cycle development of virus-like particle vaccines for SARS-CoV-2.

    In another significant initiative, CBI researchers are applying data analytics strategies to biomanufacturing problems. “In our smart data analytics project, we are creating new decision support tools and algorithms for biomanufacturing process control and plant-level decision-making. Further, we are leveraging machine learning and natural language processing to improve post-market surveillance studies,” says Springs.

    CBI is also working on advanced manufacturing for cell and gene therapies, among other new modalities, and is a part of the Singapore-MIT Alliance for Research and Technology – Critical Analytics for Manufacturing Personalized-Medicine (SMART CAMP). SMART CAMP is an international research effort focused on developing the analytical tools and biological understanding of critical quality attributes that will enable the manufacture and delivery of improved cell therapies to patients.

    “This is a crucial time for biomanufacturing and for innovation across the health-care value chain. The collaborative efforts of MIT researchers and consortia members will drive fundamental discovery and inform much-needed progress in industry,” says MIT Vice President for Research Maria Zuber.

    “CBI has a track record of engaging with health-care ecosystem challenges. I am confident that under the new leadership, it will continue to inspire MIT, the United States, and the entire world to improve the health of all people,” adds Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing.

  • New CRISPR-based map ties every human gene to its function

    The Human Genome Project was an ambitious initiative to sequence every piece of human DNA. The project drew together collaborators from research institutions around the world, including MIT’s Whitehead Institute for Biomedical Research, and was finally completed in 2003. Now, over two decades later, MIT Professor Jonathan Weissman and colleagues have gone beyond the sequence to present the first comprehensive functional map of genes that are expressed in human cells. The data from this project, published online June 9 in Cell, ties each gene to its job in the cell, and is the culmination of years of collaboration on the single-cell sequencing method Perturb-seq.

    The data are available for other scientists to use. “It’s a big resource in the way the human genome is a big resource, in that you can go in and do discovery-based research,” says Weissman, who is also a member of the Whitehead Institute and an investigator with the Howard Hughes Medical Institute. “Rather than defining ahead of time what biology you’re going to be looking at, you have this map of the genotype-phenotype relationships and you can go in and screen the database without having to do any experiments.”

    The screen allowed the researchers to delve into diverse biological questions. They used it to explore the cellular effects of genes with unknown functions, to investigate the response of mitochondria to stress, and to screen for genes that cause chromosomes to be lost or gained, a phenotype that has proved difficult to study in the past. “I think this dataset is going to enable all sorts of analyses that we haven’t even thought up yet by people who come from other parts of biology, and suddenly they just have this available to draw on,” says former Weissman Lab postdoc Tom Norman, a co-senior author of the paper.

    Pioneering Perturb-seq

    The project takes advantage of the Perturb-seq approach that makes it possible to follow the impact of turning on or off genes with unprecedented depth. This method was first published in 2016 by a group of researchers including Weissman and fellow MIT professor Aviv Regev, but could only be used on small sets of genes and at great expense.

    The massive Perturb-seq map was made possible by foundational work from Joseph Replogle, an MD-PhD student in Weissman’s lab and co-first author of the present paper. Replogle, in collaboration with Norman, who now leads a lab at Memorial Sloan Kettering Cancer Center; Britt Adamson, an assistant professor in the Department of Molecular Biology at Princeton University; and a group at 10x Genomics, set out to create a new version of Perturb-seq that could be scaled up. The researchers published a proof-of-concept paper in Nature Biotechnology in 2020. 

    The Perturb-seq method uses CRISPR-Cas9 genome editing to introduce genetic changes into cells, and then uses single-cell RNA sequencing to capture information about the RNAs that are expressed as a result of a given genetic change. Because RNAs control all aspects of how cells behave, this method can help decode the many cellular effects of genetic changes.

    Since their initial proof-of-concept paper, Weissman, Regev, and others have used this sequencing method on smaller scales. For example, the researchers used Perturb-seq in 2021 to explore how human and viral genes interact over the course of an infection with HCMV, a common herpesvirus.

    In the new study, Replogle and collaborators including Reuben Saunders, a graduate student in Weissman’s lab and co-first author of the paper, scaled up the method to the entire genome. Using human blood cancer cell lines as well as noncancerous cells derived from the retina, the researchers performed Perturb-seq across more than 2.5 million cells and used the data to build a comprehensive map tying genotypes to phenotypes.

    Delving into the data

    Upon completing the screen, the researchers decided to put their new dataset to use and examine a few biological questions. “The advantage of Perturb-seq is it lets you get a big dataset in an unbiased way,” says Tom Norman. “No one knows entirely what the limits are of what you can get out of that kind of dataset. Now, the question is, what do you actually do with it?”

    The first, most obvious application was to look into genes with unknown functions. Because the screen also read out phenotypes of many known genes, the researchers could use the data to compare unknown genes to known ones and look for similar transcriptional outcomes, which could suggest the gene products worked together as part of a larger complex.
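    The matching step described above boils down to comparing perturbation profiles. The sketch below is a toy under invented data, not the study's pipeline: each gene's knockdown is represented by a short vector of expression changes, and an uncharacterized gene is matched to the known gene whose profile it most resembles by cosine similarity.

```python
# Toy sketch: infer an unknown gene's function from profile similarity.
# The gene names and expression vectors are invented for illustration.

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

known_profiles = {
    "INTS1": [1.0, -0.5, 0.2],  # hypothetical Integrator-subunit knockdown
    "RPL3":  [-0.8, 0.1, 0.9],  # hypothetical ribosomal-protein knockdown
}
unknown_profile = [0.9, -0.4, 0.3]  # profile of an uncharacterized gene

best_match = max(known_profiles,
                 key=lambda g: cosine(known_profiles[g], unknown_profile))
```

    A high similarity to a known complex member suggests, as with C7orf26, that the unknown gene may work within that same complex.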

    One gene, C7orf26, stood out in particular. The researchers noticed that genes whose removal led to a similar phenotype were part of a protein complex called Integrator, which plays a role in creating small nuclear RNAs. The Integrator complex is made up of many smaller subunits — previous studies had suggested 14 individual proteins — and the researchers were able to confirm that C7orf26 made up a 15th component of the complex.

    They also discovered that the 15 subunits worked together in smaller modules to perform specific functions within the Integrator complex. “Absent this thousand-foot-high view of the situation, it was not so clear that these different modules were so functionally distinct,” says Saunders.

    Another perk of Perturb-seq is that because the assay focuses on single cells, the researchers could use the data to look at more complex phenotypes that become muddied when they are studied together with data from other cells. “We often take all the cells where ‘gene X’ is knocked down and average them together to look at how they changed,” Weissman says. “But sometimes when you knock down a gene, different cells that are losing that same gene behave differently, and that behavior may be missed by the average.”

    The researchers found that a subset of genes whose removal led to different outcomes from cell to cell were responsible for chromosome segregation. Their removal was causing cells to lose a chromosome or pick up an extra one, a condition known as aneuploidy. “You couldn’t predict what the transcriptional response to losing this gene was because it depended on the secondary effect of what chromosome you gained or lost,” Weissman says. “We realized we could then turn this around and create this composite phenotype looking for signatures of chromosomes being gained and lost. In this way, we’ve done the first genome-wide screen for factors that are required for the correct segregation of DNA.”
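    The "composite phenotype" idea above can be sketched as a per-chromosome summary statistic. This toy assumes everything it shows: the gene-to-chromosome map, log-fold-change values, and threshold are all invented. The point is only that averaging a single cell's expression changes over each chromosome can flag a whole chromosome as gained or lost.

```python
# Toy sketch: call chromosome gain/loss from per-gene expression changes.
# Mapping, values, and threshold are invented for illustration.

gene_chrom = {"A1": "chr1", "A2": "chr1", "B1": "chr2", "B2": "chr2"}
log_fc = {"A1": 0.55, "A2": 0.62, "B1": -0.05, "B2": 0.03}

def chromosome_calls(log_fc, gene_chrom, threshold=0.4):
    """Average log-fold changes per chromosome; a coherent shift across a whole
    chromosome suggests that chromosome was gained (up) or lost (down)."""
    sums, counts = {}, {}
    for gene, chrom in gene_chrom.items():
        sums[chrom] = sums.get(chrom, 0.0) + log_fc[gene]
        counts[chrom] = counts.get(chrom, 0) + 1
    calls = {}
    for chrom in sums:
        mean = sums[chrom] / counts[chrom]
        if mean > threshold:
            calls[chrom] = "gained"
        elif mean < -threshold:
            calls[chrom] = "lost"
        else:
            calls[chrom] = "normal"
    return calls

calls = chromosome_calls(log_fc, gene_chrom)
```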

    “I think the aneuploidy study is the most interesting application of this data so far,” Norman says. “It captures a phenotype that you can only get using a single-cell readout. You can’t go after it any other way.”

    The researchers also used their dataset to study how mitochondria responded to stress. Mitochondria, which evolved from free-living bacteria, carry 13 genes in their genomes. Within the nuclear DNA, around 1,000 genes are somehow related to mitochondrial function. “People have been interested for a long time in how nuclear and mitochondrial DNA are coordinated and regulated in different cellular conditions, especially when a cell is stressed,” Replogle says.

    The researchers found that when they perturbed different mitochondria-related genes, the nuclear genome responded similarly to many different genetic changes. However, the mitochondrial genome responses were much more variable. 

    “There’s still an open question of why mitochondria still have their own DNA,” says Replogle. “A big-picture takeaway from our work is that one benefit of having a separate mitochondrial genome might be having localized or very specific genetic regulation in response to different stressors.”

    “If you have one mitochondria that’s broken, and another one that is broken in a different way, those mitochondria could be responding differentially,” Weissman says.

    In the future, the researchers hope to use Perturb-seq on different types of cells besides the cancer cell line they started in. They also hope to continue to explore their map of gene functions, and hope others will do the same. “This really is the culmination of many years of work by the authors and other collaborators, and I’m really pleased to see it continue to succeed and expand,” says Norman.


    Probing how proteins pair up inside cells

    Despite its minute size, a single cell contains billions of molecules that bustle around and bind to one another, carrying out vital functions. The human genome encodes about 20,000 proteins, most of which interact with partner proteins to mediate upwards of 400,000 distinct interactions. These partners don’t just latch onto one another haphazardly; they only bind to very specific companions that they must recognize inside the crowded cell. If they create the wrong pairings — or even the right pairings at the wrong place or wrong time — cancer or other diseases can ensue. Scientists are hard at work investigating these protein-protein relationships to understand how they work, and potentially to create drugs that disrupt or mimic them to treat disease.

    The average human protein is composed of approximately 400 building blocks called amino acids, which are strung together and folded into a complex 3D structure. Within this long string of building blocks, some proteins contain stretches of four to six amino acids called short linear motifs (SLiMs), which mediate protein-protein interactions. Despite their simplicity and small size, SLiMs and their binding partners facilitate key cellular processes. However, it’s been historically difficult to devise experiments to probe how SLiMs recognize their specific binding partners.
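In computational terms, a SLiM is often represented as a short sequence pattern that can be scanned against a protein sequence. The sketch below is purely illustrative: the motif pattern `[FL]P.P` and the function name `find_slims` are made-up examples, not an actual ENAH-binding motif or a method from these studies.

```python
import re

# Illustrative only: a SLiM expressed as a regular expression over
# one-letter amino acid codes. The pattern is a hypothetical 4-residue
# motif (F or L, then P, then any residue, then P).
MOTIF = re.compile(r"[FL]P.P")

def find_slims(seq: str) -> list[tuple[int, str]]:
    """Return (position, matched motif) pairs for every hit in seq."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(seq)]
```

A scan like this only finds candidate motifs; whether a given SLiM actually binds a partner protein in the cell is exactly the kind of question the screening experiments described here address.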

    To address this problem, a group led by Theresa Hwang PhD ’21 designed a screening method to understand how SLiMs selectively bind to certain proteins, and even distinguish between those with similar structures. Using the detailed information they gleaned from studying these interactions, the researchers created their own synthetic molecule capable of binding extremely tightly to a protein called ENAH, which is implicated in cancer metastasis. The team shared their findings in a pair of eLife studies, one published on Dec. 2, 2021, and the other published Jan. 25.

    “The ability to test hundreds of thousands of potential SLiMs for binding provides a powerful tool to explore why proteins prefer specific SLiM partners over others,” says Amy Keating, professor of biology and biological engineering and the senior author on both studies. “As we gain an understanding of the tricks that a protein uses to select its partners, we can apply these in protein design to make our own binders to modulate protein function for research or therapeutic purposes.”

    Most existing screens for SLiMs simply select for short, tight binders, while neglecting SLiMs that don’t grip their partner proteins quite as strongly. To survey SLiMs with a wide range of binding affinities, Keating, Hwang, and their colleagues developed their own screen called MassTitr.

    The researchers also suspected that the amino acids on either side of the SLiM’s core four-to-six amino acid sequence might play an underappreciated role in binding. To test their theory, they used MassTitr to screen the human proteome in longer chunks of 36 amino acids, in order to see which “extended” SLiMs would associate with the protein ENAH.
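The idea of screening a proteome in fixed-length chunks can be sketched as tiling each protein sequence into overlapping windows. This is a minimal, hypothetical sketch of that windowing step only; the function name and step size are assumptions for illustration, not details of the MassTitr protocol.

```python
def tile_sequence(seq: str, window: int = 36, step: int = 1) -> list[str]:
    """Return all overlapping windows of `window` residues from seq.

    Proteins shorter than the window yield a single fragment.
    """
    if len(seq) < window:
        return [seq]
    return [seq[i:i + window] for i in range(0, len(seq) - window + 1, step)]
```

For a toy 41-residue sequence, this produces six overlapping 36-mers; applied across the proteome, each fragment becomes one candidate “extended” SLiM region to test for binding.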

    ENAH, sometimes referred to as Mena, helps cells to move. This ability to migrate is critical for healthy cells, but cancer cells can co-opt it to spread. Scientists have found that reducing the amount of ENAH decreases the cancer cell’s ability to invade other tissues — suggesting that formulating drugs to disrupt this protein and its interactions could treat cancer.

    Thanks to MassTitr, the team identified 33 SLiM-containing proteins that bound to ENAH — 19 of which are potentially novel binding partners. They also discovered three distinct patterns of amino acids flanking core SLiM sequences that helped the SLiMs bind even more tightly to ENAH. Of these extended SLiMs, one found in a protein called PCARE bound to ENAH with the highest known affinity of any SLiM to date.

    Next, the researchers combined a computer program called dTERMen with X-ray crystallography in order to understand how and why PCARE binds to ENAH over ENAH’s two nearly identical sister proteins (VASP and EVL). Hwang and her colleagues saw that the amino acids flanking PCARE’s core SLiM caused ENAH to change shape slightly when the two made contact, allowing the binding sites to latch onto one another. VASP and EVL, by contrast, could not undergo this structural change, so the PCARE SLiM did not bind to either of them as tightly.

    Inspired by this unique interaction, Hwang designed her own protein that bound to ENAH with unprecedented affinity and specificity. “It was exciting that we were able to come up with such a specific binder,” she says. “This work lays the foundation for designing synthetic molecules with the potential to disrupt protein-protein interactions that cause disease — or to help scientists learn more about ENAH and other SLiM-binding proteins.”  

    Ylva Ivarsson, a professor of biochemistry at Uppsala University who was not involved with the study, says that understanding how proteins find their binding partners is a question of fundamental importance to cell function and regulation. The two eLife studies, she explains, show that extended SLiMs play an underappreciated role in determining the affinity and specificity of these binding interactions.

    “The studies shed light on the idea that context matters, and provide a screening strategy for a variety of context-dependent binding interactions,” she says. “Hwang and co-authors have created valuable tools for dissecting the cellular function of proteins and their binding partners. Their approach could even inspire ENAH-specific inhibitors for therapeutic purposes.”

    Hwang’s biggest takeaway from the project is that things are not always as they seem: even short, simple protein segments can play complex roles in the cell. As she puts it: “We should really appreciate SLiMs more.”