    Enabling AI-driven health advances without sacrificing patient privacy

    There’s a lot of excitement at the intersection of artificial intelligence and health care. AI has already been used to improve disease treatment and detection, discover promising new drugs, identify links between genes and diseases, and more.

    By analyzing large datasets and finding patterns, virtually any new algorithm has the potential to help patients — AI researchers just need access to the right data to train and test those algorithms. Hospitals, understandably, are hesitant to share sensitive patient information with research teams. When they do share data, it’s difficult to verify that researchers are only using the data they need and deleting it after they’re done.

    Secure AI Labs (SAIL) is addressing those problems with a technology that lets AI algorithms run on encrypted datasets that never leave the data owner’s system. Health care organizations can control how their datasets are used, while researchers can protect the confidentiality of their models and search queries. Neither party needs to see the data or the model to collaborate.

    SAIL’s platform can also combine data from multiple sources, creating rich insights that fuel more effective algorithms.

    “You shouldn’t have to schmooze with hospital executives for five years before you can run your machine learning algorithm,” says SAIL co-founder and MIT Professor Manolis Kellis, who co-founded the company with CEO Anne Kim ’16, SM ’17. “Our goal is to help patients, to help machine learning scientists, and to create new therapeutics. We want new algorithms — the best algorithms — to be applied to the biggest possible data set.”

    SAIL has already partnered with hospitals and life science companies to unlock anonymized data for researchers. In the next year, the company hopes to be working with about half of the top 50 academic medical centers in the country.

    Unleashing AI’s full potential

    As an undergraduate at MIT studying computer science and molecular biology, Kim worked with researchers in the Computer Science and Artificial Intelligence Laboratory (CSAIL) to analyze data from clinical trials, gene association studies, hospital intensive care units, and more.

    “I realized there is something severely broken in data sharing, whether it was hospitals using hard drives, ancient file transfer protocol, or even sending stuff in the mail,” Kim says. “It was all just not well-tracked.”

    Kellis, who is also a member of the Broad Institute of MIT and Harvard, has spent years establishing partnerships with hospitals and consortia across a range of diseases including cancers, heart disease, schizophrenia, and obesity. He knew that smaller research teams would struggle to get access to the same data his lab was working with.

    In 2017, Kellis and Kim decided to commercialize technology they were developing to allow AI algorithms to run on encrypted data.

    In the summer of 2018, Kim participated in the delta v startup accelerator run by the Martin Trust Center for MIT Entrepreneurship. The founders also received support from the Sandbox Innovation Fund and the Venture Mentoring Service, and made various early connections through their MIT network.

    To participate in SAIL’s program, hospitals and other health care organizations make parts of their data available to researchers by setting up a node behind their firewall. SAIL then sends encrypted algorithms to the servers where the datasets reside in a process called federated learning. The algorithms crunch the data locally in each server and transmit the results back to a central model, which updates itself. No one (not the researchers, the data owners, or even SAIL) has access to the models or the datasets.
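
    The federated learning loop described above can be sketched in a few lines. This is a minimal illustration of federated averaging on a toy linear model, not SAIL’s implementation: the encryption layer is left out entirely, and the names `local_update`, `federated_round`, and `make_node`, along with the synthetic data, are assumptions made for the example.

```python
import random

def local_update(weights, data, lr=0.05, epochs=20):
    """Run a few gradient-descent steps on one node's private data.

    Only the updated weights leave the node; the raw (x, y) rows do not.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_round(global_weights, nodes):
    """One round: every node trains locally, then the server averages the results."""
    updates = [local_update(global_weights, data) for data in nodes]
    total = sum(len(data) for data in nodes)
    # Weight each node's update by how many rows it trained on.
    w = sum(u[0] * len(d) for u, d in zip(updates, nodes)) / total
    b = sum(u[1] * len(d) for u, d in zip(updates, nodes)) / total
    return (w, b)

def make_node(n, seed):
    """Synthetic stand-in for one hospital's dataset: y = 3x + 1 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.05)) for x in xs]

# Three hypothetical nodes of different sizes.
nodes = [make_node(40, seed=1), make_node(60, seed=2), make_node(25, seed=3)]

weights = (0.0, 0.0)
for _ in range(30):
    weights = federated_round(weights, nodes)

print(f"learned slope={weights[0]:.2f}, intercept={weights[1]:.2f}")
```

    The central model ends up close to the slope and intercept used to generate the data, even though the server only ever handles weight pairs and never the rows held at each node.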

    The approach allows a much broader set of researchers to apply their models to large datasets. To further engage the research community, Kellis’ lab at MIT has begun holding competitions in which it gives access to datasets in areas like protein function and gene expression, and challenges researchers to predict results.

    “We invite machine learning researchers to come and train on last year’s data and predict this year’s data,” says Kellis. “If we see there’s a new type of algorithm that is performing best in these community-level assessments, people can adopt it locally at many different institutions and level the playing field. So, the only thing that matters is the quality of your algorithm rather than the power of your connections.”

    By enabling a large number of datasets to be anonymized into aggregate insights, SAIL’s technology also allows researchers to study rare diseases, in which small pools of relevant patient data are often spread out among many institutions. That has historically made the data difficult to apply AI models to.

    “We’re hoping that all of these datasets will eventually be open,” Kellis says. “We can cut across all the silos and enable a new era where every patient with every rare disorder across the entire world can come together in a single keystroke to analyze data.”

    Enabling the medicine of the future

    To work with large amounts of data around specific diseases, SAIL has increasingly sought to partner with patient associations and consortia of health care groups, including an international health care consulting company and the Kidney Cancer Association. The partnerships also align SAIL with patients, the group they’re most trying to help.

    Overall, the founders are happy to see SAIL solving problems they faced in their labs for researchers around the world.

    “The right place to solve this is not an academic project. The right place to solve this is in industry, where we can provide a platform not just for my lab but for any researcher,” Kellis says. “It’s about creating an ecosystem of academia, researchers, pharma, biotech, and hospital partners. I think it’s the blending of all of these different areas that will make that vision of medicine of the future become a reality.”


    3 Questions: Kalyan Veeramachaneni on hurdles preventing fully automated machine learning

    The proliferation of big data across domains, from banking to health care to environmental monitoring, has spurred increasing demand for machine learning tools that help organizations make decisions based on the data they gather.

    That growing industry demand has driven researchers to explore the possibilities of automated machine learning (AutoML), which seeks to automate the development of machine learning solutions in order to make them accessible for nonexperts, improve their efficiency, and accelerate machine learning research. For example, an AutoML system might enable doctors to use their expertise interpreting electroencephalography (EEG) results to build a model that can predict which patients are at higher risk for epilepsy — without requiring the doctors to have a background in data science.

    Yet, despite more than a decade of work, researchers have been unable to fully automate all steps in the machine learning development process. Even the most efficient commercial AutoML systems still require a prolonged back-and-forth between a domain expert, like a marketing manager or mechanical engineer, and a data scientist, making the process inefficient.

    Kalyan Veeramachaneni, a principal research scientist in the MIT Laboratory for Information and Decision Systems who has been studying AutoML since 2010, has co-authored a paper in the journal ACM Computing Surveys that details a seven-tiered schematic to evaluate AutoML tools based on their level of autonomy.

    A system at level zero has no automation and requires a data scientist to start from scratch and build models by hand, while a tool at level six is completely automated and can be easily and effectively used by a nonexpert. Most commercial systems fall somewhere in the middle.

    Veeramachaneni spoke with MIT News about the current state of AutoML, the hurdles that prevent truly automatic machine learning systems, and the road ahead for AutoML researchers.

    Q: How has automatic machine learning evolved over the past decade, and what is the current state of AutoML systems?

    A: In 2010, we started to see a shift, with enterprises wanting to invest in getting value out of their data beyond just business intelligence. So then came the question, maybe there are certain things in the development of machine learning-based solutions that we can automate? The first iteration of AutoML was to make our own jobs as data scientists more efficient. Can we take away the grunt work that we do on a day-to-day basis and automate that by using a software system? That area of research ran its course until about 2015, when we realized we still weren’t able to speed up this development process.

    Then another thread emerged. There are a lot of problems that could be solved with data, and they come from experts who know those problems, who live with them on a daily basis. These individuals have very little to do with machine learning or software engineering. How do we bring them into the fold? That is really the next frontier.

    There are three areas where these domain experts have strong input in a machine learning system. The first is defining the problem itself and then helping to formulate it as a prediction task to be solved by a machine learning model. Second, they know how the data have been collected, so they also know intuitively how to process that data. And then third, at the end, machine learning models only give you a very tiny part of a solution — they just give you a prediction. The output of a machine learning model is just one input to help a domain expert get to a decision or action.

    Q: What steps of the machine learning pipeline are the most difficult to automate, and why has automating them been so challenging?

    A: The problem-formulation part is extremely difficult to automate. For example, if I am a researcher who wants to get more government funding, and I have a lot of data about the content of the research proposals that I write and whether or not I receive funding, can machine learning help there? We don’t know yet. In problem formulation, I use my domain expertise to translate the problem into something that is more tangible to predict, and that requires somebody who knows the domain very well. And he or she also knows how to use that information post-prediction. That problem is refusing to be automated.

    There is one part of problem-formulation that could be automated. It turns out that we can look at the data and mathematically express several possible prediction tasks automatically. Then we can share those prediction tasks with the domain expert to see if any of them would help in the larger problem they are trying to tackle. Then once you pick the prediction task, there are a lot of intermediate steps you do, including feature engineering, modeling, etc., that are very mechanical steps and easy to automate.
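
    As a rough illustration of what “expressing prediction tasks automatically” could look like, the sketch below enumerates candidate tasks from nothing more than column types. It is a toy under stated assumptions, not the method Veeramachaneni’s group uses: the schema, the look-ahead windows, and the function `candidate_tasks` are all hypothetical.

```python
from itertools import product

# Hypothetical schema: column name -> kind of value. Not taken from any real system.
schema = {
    "patient_id": "id",
    "visit_date": "time",
    "eeg_spike_count": "numeric",
    "had_seizure": "boolean",
}

# Hypothetical look-ahead windows, in days.
horizons = [30, 90]

def candidate_tasks(schema, horizons):
    """Enumerate simple prediction tasks that the column types make expressible."""
    entity = next(col for col, kind in schema.items() if kind == "id")
    tasks = []
    for (col, kind), days in product(schema.items(), horizons):
        if kind == "numeric":
            tasks.append(
                f"For each {entity}, predict the mean of {col} over the next {days} days"
            )
        elif kind == "boolean":
            tasks.append(
                f"For each {entity}, predict whether {col} occurs in the next {days} days"
            )
    return tasks

tasks = candidate_tasks(schema, horizons)
for task in tasks:
    print(task)
```

    The resulting list is what would be handed to the domain expert, who still has to judge which task, if any, bears on the larger problem.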

    But defining the prediction tasks has typically been a collaborative effort between data scientists and domain experts because, unless you know the domain, you can’t translate the domain problem into a prediction task. And then sometimes domain experts don’t know what is meant by “prediction.” That leads to the major, significant back and forth in the process. If you automate that step, then machine learning penetration and the use of data to create meaningful predictions will increase tremendously.

    Then what happens after the machine learning model gives a prediction? We can automate the software and technology part of it, but at the end of the day, it is root cause analysis and human intuition and decision making. We can augment them with a lot of tools, but we can’t fully automate that.

    Q: What do you hope to achieve with the seven-tiered framework for evaluating AutoML systems that you outlined in your paper?

    A: My hope is that people start to recognize that some levels of automation have already been achieved and some still need to be tackled. In the research community, we tend to focus on what we are comfortable with. We have gotten used to automating certain steps, and then we just stick to it. Automating these other parts of the machine learning solution development is very important, and that is where the biggest bottlenecks remain.

    My second hope is that researchers will very clearly understand what domain expertise means. A lot of this AutoML work is still being conducted by academics, and the problem is that we often don’t do applied work. There is not a crystal-clear definition of what a domain expert is, and “domain expert” is in itself a very nebulous phrase. What we mean by domain expert is the expert in the problem you are trying to solve with machine learning. And I am hoping that everyone unifies around that because that would make things so much clearer.

    I still believe that we are not able to build that many models for that many problems, but even for the ones that we are building, the majority of them are not getting deployed and used in day-to-day life. The output of machine learning is just going to be another data point, an augmented data point, in someone’s decision making. How they make those decisions, based on that input, how that will change their behavior, and how they will adapt their style of working, that is still a big, open question. Once we automate everything, that is what’s next.

    We have to determine what has to fundamentally change in the day-to-day workflow of someone giving loans at a bank, or an educator trying to decide whether he or she should change the assignments in an online class. How are they going to use machine learning’s outputs? We need to focus on the fundamental things we have to build out to make machine learning more usable.


    Study: Global cancer risk from burning organic matter comes from unregulated chemicals

    Whenever organic matter is burned, such as in a wildfire, a power plant, a car’s exhaust, or in daily cooking, the combustion releases polycyclic aromatic hydrocarbons (PAHs) — a class of pollutants that is known to cause lung cancer.

    There are more than 100 known types of PAH compounds emitted daily into the atmosphere. Regulators, however, have historically relied on measurements of a single compound, benzo(a)pyrene, to gauge a community’s risk of developing cancer from PAH exposure. Now MIT scientists have found that benzo(a)pyrene may be a poor indicator of this type of cancer risk.

    In a modeling study appearing today in the journal GeoHealth, the team reports that benzo(a)pyrene plays a small part — about 11 percent — in the global risk of developing PAH-associated cancer. Instead, 89 percent of that cancer risk comes from other PAH compounds, many of which are not directly regulated.

    Interestingly, about 17 percent of PAH-associated cancer risk comes from “degradation products” — chemicals that are formed when emitted PAHs react in the atmosphere. Many of these degradation products can in fact be more toxic than the emitted PAH from which they formed.

    The team hopes the results will encourage scientists and regulators to look beyond benzo(a)pyrene, to consider a broader class of PAHs when assessing a community’s cancer risk.

    “Most of the regulatory science and standards for PAHs are based on benzo(a)pyrene levels. But that is a big blind spot that could lead you down a very wrong path in terms of assessing whether cancer risk is improving or not, and whether it’s relatively worse in one place than another,” says study author Noelle Selin, a professor in MIT’s Institute for Data, Systems and Society, and the Department of Earth, Atmospheric and Planetary Sciences.

    Selin’s MIT co-authors include Jesse Kroll, Amy Hrdina, Ishwar Kohale, Forest White, Bevin Engelward, and Jamie Kelly (who is now at University College London). Peter Ivatt and Mathew Evans at the University of York are also co-authors.

    Chemical pixels

    Benzo(a)pyrene has historically been the poster chemical for PAH exposure. The compound’s indicator status is largely based on early toxicology studies. But recent research suggests the chemical may not be the PAH representative that regulators have long relied upon.   

    “There has been a bit of evidence suggesting benzo(a)pyrene may not be very important, but this was from just a few field studies,” says Kelly, a former postdoc in Selin’s group and the study’s lead author.

    Kelly and his colleagues instead took a systematic approach to evaluate benzo(a)pyrene’s suitability as a PAH indicator. The team began by using GEOS-Chem, a global, three-dimensional chemical transport model that breaks the world into individual grid boxes and simulates within each box the reactions and concentrations of chemicals in the atmosphere.

    They extended this model to include chemical descriptions of how various PAH compounds, including benzo(a)pyrene, would react in the atmosphere. The team then plugged in recent data from emissions inventories and meteorological observations, and ran the model forward to simulate the concentrations of various PAH chemicals around the world over time.

    Risky reactions

    In their simulations, the researchers started with 16 relatively well-studied PAH chemicals, including benzo(a)pyrene, and traced the concentrations of these chemicals, plus the concentration of their degradation products over two generations, or chemical transformations. In total, the team evaluated 48 PAH species.

    They then compared these concentrations with actual concentrations of the same chemicals, recorded by monitoring stations around the world. This comparison was close enough to show that the model’s concentration predictions were realistic.

    Then, within each of the model’s grid boxes, the researchers related the concentration of each PAH chemical to its associated cancer risk; to do this, they had to develop a new method based on previous studies in the literature to avoid double-counting risk from the different chemicals. Finally, they overlaid population density maps to predict the number of cancer cases globally, based on the concentration and toxicity of a specific PAH chemical in each location.

    Dividing the cancer cases by population produced the cancer risk associated with that chemical. In this way, the team calculated the cancer risk for each of the 48 compounds, then determined each chemical’s individual contribution to the total risk.
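
    The arithmetic in this step (cases divided by population, then each chemical’s share of the total) can be sketched as follows. All inputs are invented for illustration and were simply chosen so that the shares land near the percentages reported in the study; the three grid boxes, the chemical groupings, and the function `population_risk` are assumptions, and the study’s method for avoiding double-counting is not reproduced here.

```python
# Hypothetical population in three grid boxes; none of these numbers come from the study.
population = [1_000_000, 250_000, 4_000_000]

# Illustrative predicted cancer cases per grid box, attributed to each chemical group.
cases = {
    "benzo(a)pyrene": [1.0, 0.2, 3.0],
    "other emitted PAHs": [5.0, 1.5, 20.8],
    "degradation products": [1.2, 0.3, 5.0],
}

def population_risk(cases_by_box, population):
    """Cancer cases divided by population: the risk associated with one chemical."""
    return sum(cases_by_box) / sum(population)

# Risk per chemical, then each chemical's contribution to the total risk.
risk = {name: population_risk(c, population) for name, c in cases.items()}
total_risk = sum(risk.values())
share = {name: r / total_risk for name, r in risk.items()}

for name, fraction in share.items():
    print(f"{name}: {fraction:.1%} of total PAH-associated risk")
```

    Because every chemical is divided by the same population, the shares depend only on the predicted case counts, which is why a compound present at low concentration can still dominate if it is toxic enough.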

    This analysis revealed that benzo(a)pyrene had a surprisingly small contribution, of about 11 percent, to the overall risk of developing cancer from PAH exposure globally. Eighty-nine percent of cancer risk came from other chemicals. And 17 percent of this risk arose from degradation products.

    “We see places where you can find concentrations of benzo(a)pyrene are lower, but the risk is higher because of these degradation products,” Selin says. “These products can be orders of magnitude more toxic, so the fact that they’re at tiny concentrations doesn’t mean you can write them off.”

    When the researchers compared calculated PAH-associated cancer risks around the world, they found significant differences depending on whether that risk calculation was based solely on concentrations of benzo(a)pyrene or on a region’s broader mix of PAH compounds.

    “If you use the old method, you would find the lifetime cancer risk is 3.5 times higher in Hong Kong versus southern India, but taking into account the differences in PAH mixtures, you get a difference of 12 times,” Kelly says. “So, there’s a big difference in the relative cancer risk between the two places. And we think it’s important to expand the group of compounds that regulators are thinking about, beyond just a single chemical.”

    The team’s study “provides an excellent contribution to better understanding these ubiquitous pollutants,” says Elisabeth Galarneau, an air quality expert and PhD research scientist in Canada’s Department of the Environment. “It will be interesting to see how these results compare to work being done elsewhere … to pin down which (compounds) need to be tracked and considered for the protection of human and environmental health.”

    This research was conducted in MIT’s Superfund Research Center and is supported in part by the National Institute of Environmental Health Sciences Superfund Basic Research Program, and the National Institutes of Health.


    Data flow’s decisive role on the global stage

    In 2016, Meicen Sun came to a profound realization: “The control of digital information will lie at the heart of all the big questions and big contentions in politics.” A graduate student in her final year of study who is specializing in international security and the political economy of technology, Sun vividly recalls the emergence of the internet “as a democratizing force, an opener, an equalizer,” helping give rise to the Arab Spring. But she was also profoundly struck when nations in the Middle East and elsewhere curbed internet access to throttle citizens’ efforts to speak and mobilize freely.

    During her undergraduate and graduate studies, which came to focus on China and its expanding global role, Sun became convinced that digital constraints initially intended to prevent the free flow of ideas were also having enormous and growing economic impacts.

    “With an exceptionally high mobile internet adoption rate and the explosion of indigenous digital apps, China’s digital economy was surging, helping to drive the nation’s broader economic growth and international competitiveness,” Sun says. “Yet at the same time, the country maintained the most tightly controlled internet ecosystem in the world.”

    Sun set out to explore this apparent paradox in her dissertation. Her research to date has yielded both novel findings and troubling questions.  

    “Through its control of the internet, China has in effect provided protectionist benefits to its own data-intensive domestic sectors,” she says. “If there is a benefit to imposing internet control, given the absence of effective international regulations, does this give authoritarian states an advantage in trade and national competitiveness?” Following this thread, Sun asks, “What might this mean for the future of democracy as the world grows increasingly dependent on digital technology?”

    Protect or innovate

    Early in her graduate program, classes in capitalism and technology and public policy, says Sun, “cemented for me the idea of data as a factor of production, and the importance of cross-border information flow in making a country innovative.” This central premise serves as a springboard for Sun’s doctoral studies.

    In a series of interconnected research papers using China as her primary case, she is examining the double-edged nature of internet limits. “They accord protectionist benefits to domestic data-internet-intensive sectors, on the one hand, but on the other, act as a potential longer-term deterrent to the country’s capacity to innovate.”

    To pursue her doctoral project, advised by professor of political science Kenneth Oye, Sun is extracting data from a multitude of sources, including a website that has been routinely testing web domain accessibility from within China since 2011. This allows her to pin down when and to what degree internet control occurs. She can then compare this information to publicly available records on the expansion or contraction of data-intensive industrial sectors, enabling her to correlate internet control to a sector’s performance.
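
    One simple way to relate two such series is a Pearson correlation, sketched below. This is only an illustration of the general idea, not Sun’s actual method; the yearly figures are invented, and the variable names `blocked_share` and `sector_growth` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented yearly data, for illustration only.
blocked_share = [0.10, 0.12, 0.18, 0.25, 0.27, 0.33]  # fraction of tested domains unreachable
sector_growth = [0.04, 0.05, 0.07, 0.09, 0.08, 0.11]  # growth of one data-intensive sector

r = pearson(blocked_share, sector_growth)
print(f"correlation between internet control and sector growth: {r:.2f}")
```

    A high coefficient on its own shows only that the two series move together; establishing that control drives sector performance requires the more careful quantitative and qualitative methods a dissertation would apply.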

    Sun has also compiled datasets for firm-level revenue, scientific citations, and patents that permit her to measure aspects of China’s innovation culture. In analyzing her data she leverages both quantitative and qualitative methods, including one co-developed by her dissertation co-advisor, associate professor of political science In Song Kim. Her initial analysis suggests internet control prevents scholars from accessing knowledge available on foreign websites, and that if sustained, such control could take a toll on the Chinese economy over time.

    Of particular concern is the possibility that the economic success that flows from strict internet controls, as exemplified by the Chinese model, may encourage the rise of similar practices among emerging states or those in political flux.

    “The grim implication of my research is that without international regulation on information flow restrictions, democracies will be at a disadvantage against autocracies,” she says. “No matter how short-term or narrow these curbs are, they confer concrete benefits on certain economic sectors.”

    Data, politics, and economy

    Sun got a quick start as a student of China and its role in the world. She was born in Xiamen, a coastal Chinese city across from Taiwan, to academic parents who cultivated her interest in international politics. “My dad would constantly talk to me about global affairs, and he was passionate about foreign policy,” says Sun.

    Eager for education and a broader view of the world, Sun took a scholarship at 15 to attend school in Singapore. “While this experience exposed me to a variety of new ideas and social customs, I felt the itch to travel even farther away, and to meet people with different backgrounds and viewpoints from mine,” she says.

    Sun attended Princeton University where, after two years sticking to her “comfort zone” — writing and directing plays and composing music for them — she underwent a process of intellectual transition. Political science classes opened a window onto a larger landscape to which she had long been connected: China’s behavior as a rising power and the shifting global landscape.

    She completed her undergraduate degree in politics, and followed up with a master’s degree in international relations at the University of Pennsylvania, where she focused on China-U.S. relations and China’s participation in international institutions. She was on the path to completing a PhD at Penn when, Sun says, “I became confident in my perception that digital technology, and especially information sharing, were becoming critically important factors in international politics, and I felt a strong desire to devote my graduate studies, and even my career, to studying these topics.”

    Certain that the questions she hoped to pursue could best be addressed through an interdisciplinary approach with those working on similar issues, Sun began her doctoral program anew at MIT.

    “Doer mindset”

    Sun is hopeful that her doctoral research will prove useful to governments, policymakers, and business leaders. “There are a lot of developing states actively shopping between data governance and development models for their own countries,” she says. “My findings around the pros and cons of information flow restrictions should be of interest to leaders in these places, and to trade negotiators and others dealing with the global governance of data and what a fair playing field for digital trade would be.”

    Sun has engaged directly with policy and industry experts through her fellowships with the World Economic Forum and the Pacific Forum. And she has embraced questions that touch on policy outside of her immediate research: Sun is collaborating with her dissertation co-advisor, MIT Sloan Professor Yasheng Huang, on a study of the political economy of artificial intelligence in China for the MIT Task Force on the Work of the Future.

    This year, as she writes her dissertation papers, Sun will be based at Georgetown University, where she has a Mortara Center Global Political Economy Project Predoctoral Fellowship. In Washington, she will continue her journey to becoming a “policy-minded scholar, a thinker with a doer mindset, whose findings have bearing on things that happen in the world.”


    How quickly do algorithms improve?

    Algorithms are sort of like a parent to a computer. They tell the computer how to make sense of information so it can, in turn, make something useful out of it.

    The more efficient the algorithm, the less work the computer has to do. For all of the technological progress in computing hardware, and the much debated lifespan of Moore’s Law, computer performance is only one side of the picture.

    Behind the scenes a second trend is happening: Algorithms are being improved, so in turn less computing power is needed. While algorithmic efficiency may have less of a spotlight, you’d definitely notice if your trusty search engine suddenly became one-tenth as fast, or if moving through big datasets felt like wading through sludge.

    This led scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) to ask: How quickly do algorithms improve?  

    Existing data on this question were largely anecdotal, consisting of case studies of particular algorithms that were assumed to be representative of the broader scope. Faced with this dearth of evidence, the team set off to crunch data from 57 textbooks and more than 1,110 research papers, to trace the history of when algorithms got better. Some of the research papers directly reported how good new algorithms were, and others needed to be reconstructed by the authors using “pseudocode,” shorthand versions of the algorithm that describe the basic details.

    In total, the team looked at 113 “algorithm families,” sets of algorithms solving the same problem that had been highlighted as most important by computer science textbooks. For each of the 113, the team reconstructed its history, tracking each time a new algorithm was proposed for the problem and making special note of those that were more efficient. The algorithms ranged in performance and were separated by decades, from the 1940s to now; the team found an average of eight algorithms per family, of which a couple improved the family’s efficiency. To share this assembled database of knowledge, the team also created Algorithm-Wiki.org.

    The scientists charted how quickly these families had improved, focusing on the most-analyzed feature of the algorithms — how fast they could guarantee to solve the problem (in computer speak: “worst-case time complexity”). What emerged was enormous variability, but also important insights on how transformative algorithmic improvement has been for computer science.

    For large computing problems, 43 percent of algorithm families had year-on-year improvements that were equal to or larger than the much-touted gains from Moore’s Law. In 14 percent of problems, the improvement to performance from algorithms vastly outpaced those that have come from improved hardware. The gains from algorithm improvement were particularly large for big-data problems, so the importance of those advancements has grown in recent decades.
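
    To see how an algorithmic gain can be put on the same yearly scale as a hardware gain, and why it grows with problem size, consider the sketch below. It is an illustration under stated assumptions rather than the paper’s methodology: Moore’s Law is taken as a doubling every two years, and the algorithm family (a quadratic method replaced 20 years later by an n log n method) is hypothetical.

```python
import math

def annualized_rate(speedup, years):
    """Constant yearly improvement that compounds to `speedup` over `years`."""
    return speedup ** (1 / years) - 1

# Assumption: Moore's Law modeled as a doubling every two years.
moore_rate = annualized_rate(2, 2)

def steps_quadratic(n):
    """Step count of a hypothetical older algorithm, on the order of n^2."""
    return n ** 2

def steps_nlogn(n):
    """Step count of a hypothetical newer algorithm, on the order of n log n."""
    return n * math.log2(n)

years_between = 20  # hypothetical gap between the two algorithms
rates = {}
for n in (10 ** 3, 10 ** 6, 10 ** 9):
    speedup = steps_quadratic(n) / steps_nlogn(n)
    rates[n] = annualized_rate(speedup, years_between)
    print(f"n={n:>13,}: speedup {speedup:,.0f}x, about {rates[n]:.0%} per year")

print(f"hardware baseline: about {moore_rate:.0%} per year")
```

    Under these assumptions the same algorithmic change trails the hardware baseline for a thousand data points but outpaces it for a million or a billion, which matches the observation that the gains are particularly large for big-data problems.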

    The single biggest change that the authors observed came when an algorithm family transitioned from exponential to polynomial complexity. The amount of effort it takes to solve an exponential problem is like a person trying to guess a combination on a lock. If you only have a single 10-digit dial, the task is easy. With four dials like a bicycle lock, it’s hard enough that no one steals your bike, but still conceivable that you could try every combination. With 50, it’s almost impossible — it would take too many steps. Problems that have exponential complexity are like that for computers: As they get bigger they quickly outpace the ability of the computer to handle them. Finding a polynomial algorithm often solves that, making it possible to tackle problems in a way that no amount of hardware improvement can.
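
    The lock analogy can be made concrete with a few lines of arithmetic. The sketch below counts the combinations for 1, 4, and 50 dials and compares them with a polynomial step count; the guessing speed of one billion tries per second and the quadratic comparison are assumptions chosen for illustration.

```python
def combinations(dials, digits=10):
    """Number of settings on a lock with `dials` dials: exponential in `dials`."""
    return digits ** dials

def polynomial_steps(size, degree=2):
    """Step count of a hypothetical polynomial algorithm on the same input size."""
    return size ** degree

GUESSES_PER_SECOND = 10 ** 9  # assumed speed of a fast machine

for dials in (1, 4, 50):
    total = combinations(dials)
    seconds = total / GUESSES_PER_SECOND
    print(
        f"{dials:>2} dials: {total:.3e} combinations, "
        f"{seconds:.3e} s to try them all, "
        f"versus {polynomial_steps(dials)} steps if the cost were quadratic"
    )

# Hardware that is 1,000 times faster only covers three extra dials.
extra_dials_from_1000x = 0
while combinations(extra_dials_from_1000x + 1) <= 1000:
    extra_dials_from_1000x += 1
print(f"a 1,000x faster machine handles {extra_dials_from_1000x} more dials")
```

    Each extra dial multiplies the work by ten, so hardware speedups buy only a fixed number of additional dials, whereas moving to a polynomial algorithm changes how the work grows in the first place.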

    As rumblings of Moore’s Law coming to an end rapidly permeate global conversations, the researchers say that computing users will increasingly need to turn to areas like algorithms for performance improvements. The team says the findings confirm that historically, the gains from algorithms have been enormous, so the potential is there. But if gains come from algorithms instead of hardware, they’ll look different. Hardware improvement from Moore’s Law happens smoothly over time, whereas for algorithms the gains come in steps that are usually large but infrequent.

    “This is the first paper to show how fast algorithms are improving across a broad range of examples,” says Neil Thompson, an MIT research scientist at CSAIL and the Sloan School of Management and senior author on the new paper. “Through our analysis, we were able to say how many more tasks could be done using the same amount of computing power after an algorithm improved. As problems increase to billions or trillions of data points, algorithmic improvement becomes substantially more important than hardware improvement. In an era where the environmental footprint of computing is increasingly worrisome, this is a way to improve businesses and other organizations without the downside.”

    Thompson wrote the paper alongside MIT visiting student Yash Sherry. The paper is published in the Proceedings of the IEEE. The work was funded by the Tides Foundation and the MIT Initiative on the Digital Economy.

  • in

    Research collaboration puts climate-resilient crops in sight

    Any houseplant owner knows that changes in the amount of water or sunlight a plant receives can put it under immense stress. A dying plant brings certain disappointment to anyone with a green thumb. 

    But for farmers who make their living by successfully growing plants, and whose crops may nourish hundreds or thousands of people, the devastation of failing flora is that much greater. As climate change is poised to cause increasingly unpredictable weather patterns globally, crops may be subject to more extreme environmental conditions like droughts, fluctuating temperatures, floods, and wildfire. 

    Climate scientists and food systems researchers worry about the stress climate change may put on crops, and on global food security. In an ambitious interdisciplinary project funded by the Abdul Latif Jameel Water and Food Systems Lab (J-WAFS), David Des Marais, the Gale Assistant Professor in the Department of Civil and Environmental Engineering at MIT, and Caroline Uhler, an associate professor in the MIT Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society, are investigating how plant genes communicate with one another under stress. Their research results can be used to breed plants more resilient to climate change.

    Crops in trouble

    Governing plants’ responses to environmental stress are gene regulatory networks, or GRNs, which guide the development and behaviors of living things. A GRN may comprise thousands of genes and proteins that all communicate with one another. GRNs help a particular cell, tissue, or organism respond to environmental changes by signaling certain genes to turn their expression on or off.

    Even seemingly minor or short-term changes in weather patterns can have large effects on crop yield and food security. An environmental trigger, like a lack of water during a crucial phase of plant development, can turn a gene on or off, and is likely to affect many others in the GRN. For example, without water, a gene enabling photosynthesis may switch off. This can create a domino effect, where the genes that rely on those regulating photosynthesis are silenced, and the cycle continues. As a result, when photosynthesis is halted, the plant may experience other detrimental side effects, like no longer being able to reproduce or defend against pathogens. The chain reaction could even kill a plant before it has the chance to be revived by a big rain.
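
    The domino effect described above can be pictured as a walk through a directed graph. The following is a toy sketch, not the researchers' method: the gene names are placeholders, and it makes the strong simplifying assumption that a gene goes silent as soon as any gene it depends on is silent.

```python
from collections import deque

# Hypothetical toy network: each gene maps to the genes that depend on it.
# Names are placeholders, not real Brachypodium genes.
DEPENDENTS = {
    "water_sensor": ["photosynthesis"],
    "photosynthesis": ["reproduction", "pathogen_defense"],
    "reproduction": [],
    "pathogen_defense": [],
    "root_growth": [],
}


def silenced_by(trigger: str, dependents: dict[str, list[str]]) -> set[str]:
    """Breadth-first walk: every gene switched off once `trigger` goes silent.

    Assumes (for illustration) that losing any upstream gene silences a dependent.
    """
    off = {trigger}
    queue = deque([trigger])
    while queue:
        gene = queue.popleft()
        for downstream in dependents.get(gene, []):
            if downstream not in off:
                off.add(downstream)
                queue.append(downstream)
    return off


print(sorted(silenced_by("water_sensor", DEPENDENTS)))
```

    In this toy network, silencing the water-sensing gene takes photosynthesis, reproduction, and pathogen defense down with it, while an unconnected gene such as `root_growth` is unaffected. Real regulatory networks involve graded, combinatorial effects that this sketch ignores.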

    Des Marais says he wishes there was a way to stop those genes from completely shutting off in such a situation. To do that, scientists would need to better understand how exactly gene networks respond to different environmental triggers. Bringing light to this molecular process is exactly what he aims to do in this collaborative research effort.

    Solving complex problems across disciplines

    Despite their crucial importance, GRNs are difficult to study because of how complex and interconnected they are. Usually, to understand how a particular gene is affecting others, biologists must silence one gene and see how the others in the network respond. 

    For years, scientists have aspired to an algorithm that could synthesize the massive amount of information contained in GRNs to “identify correct regulatory relationships among genes,” according to a 2019 article in the Encyclopedia of Bioinformatics and Computational Biology. 

    “A GRN can be seen as a large causal network, and understanding the effects that silencing one gene has on all other genes requires understanding the causal relationships among the genes,” says Uhler. “These are exactly the kinds of algorithms my group develops.”

    Des Marais and Uhler’s project aims to unravel these complex communication networks and discover how to breed crops that are more resilient to the increased droughts, flooding, and erratic weather patterns that climate change is already causing globally.

    In addition to climate change, by 2050, the world will demand 70 percent more food to feed a booming population. “Food systems challenges cannot be addressed individually in disciplinary or topic area silos,” says Greg Sixt, J-WAFS’ research manager for climate and food systems. “They must be addressed in a systems context that reflects the interconnected nature of the food system.”

    Des Marais’ background is in biology, and Uhler’s in statistics. “Dave’s project with Caroline was essentially experimental,” says Renee J. Robins, J-WAFS’ executive director. “This kind of exploratory research is exactly what the J-WAFS seed grant program is for.”

    Getting inside gene regulatory networks

    Des Marais and Uhler’s work begins in a windowless basement on MIT’s campus, where 300 genetically identical Brachypodium distachyon plants grow in large, temperature-controlled chambers. The plant, which contains more than 30,000 genes, is a good model for studying important cereal crops like wheat, barley, maize, and millet. For three weeks, all plants receive the same temperature, humidity, light, and water. Then, half are slowly tapered off water, simulating drought-like conditions.

    Six days into the forced drought, the plants are clearly suffering. Des Marais’ PhD student Jie Yun takes tissues from 50 hydrated and 50 dry plants, freezes them in liquid nitrogen to immediately halt metabolic activity, grinds them up into a fine powder, and chemically separates the genetic material. The genes from all 100 samples are then sequenced at a lab across the street.

    The team is left with a spreadsheet listing the 30,000 genes found in each of the 100 plants at the moment they were frozen, and how many copies there were. Uhler’s PhD student Anastasiya Belyaeva inputs the massive spreadsheet into the computer program she developed and runs her novel algorithm. Within a few hours, the group can see which genes were most active in one condition over another, how the genes were communicating, and which were causing changes in others. 
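
    To give a feel for the simplest question such a spreadsheet can answer, which genes are more active in one condition than the other, here is a minimal sketch using a log2 fold change on made-up counts. It is not Belyaeva's algorithm, which goes further and infers how genes influence one another; the gene names, counts, and pseudocount are all assumptions for illustration.

```python
import math
from statistics import mean

# Made-up counts for illustration: gene -> (hydrated sample counts, dry sample counts).
COUNTS = {
    "gene_A": ([120, 130, 110], [30, 25, 35]),
    "gene_B": ([40, 42, 38], [41, 39, 40]),
    "gene_C": ([5, 7, 6], [60, 55, 65]),
}


def log2_fold_change(hydrated, dry, pseudocount=1.0):
    """log2 of mean dry expression over mean hydrated expression.

    Positive means more active under drought, negative means less active.
    The pseudocount avoids division by zero for genes with no reads.
    """
    return math.log2((mean(dry) + pseudocount) / (mean(hydrated) + pseudocount))


# Rank genes by how strongly their activity differs between the two conditions.
ranked = sorted(COUNTS, key=lambda g: abs(log2_fold_change(*COUNTS[g])), reverse=True)
for gene in ranked:
    print(gene, round(log2_fold_change(*COUNTS[gene]), 2))
```

    A ranking like this only flags which genes changed. Working out which genes caused changes in others requires causal inference of the kind Uhler's group develops.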

    The methodology captures important subtleties that could allow researchers to eventually alter gene pathways and breed more resilient crops. “When you expose a plant to drought stress, it’s not like there’s some canonical response,” Des Marais says. “There’s lots of things going on. It’s turning this physiologic process up, this one down, this one didn’t exist before, and now suddenly is turned on.” 

    In addition to Des Marais and Uhler’s research, J-WAFS has funded projects in food and water from researchers in 29 departments across all five MIT schools as well as the MIT Schwarzman College of Computing. J-WAFS seed grants typically fund seven to eight new projects every year.

    “The grants are really aimed at catalyzing new ideas, providing the sort of support [for MIT researchers] to be pushing boundaries, and also bringing in faculty who may have some interesting ideas that they haven’t yet applied to water or food concerns,” Robins says. “It’s an avenue for researchers all over the Institute to apply their ideas to water and food.”

    Alison Gold is a student in MIT’s Graduate Program in Science Writing.

  • in

    MIT appoints members of new faculty committee to drive climate action plan

    In May, responding to the world’s accelerating climate crisis, MIT issued an ambitious new plan, “Fast Forward: MIT’s Climate Action Plan for the Decade.” The plan outlines a broad array of new and expanded initiatives across campus to build on the Institute’s longstanding climate work.

    Now, to unite these varied climate efforts, maximize their impact, and identify new ways for MIT to contribute climate solutions, the Institute has appointed more than a dozen faculty members to a new committee established by the Fast Forward plan, named the Climate Nucleus.

    The committee includes leaders of a number of climate- and energy-focused departments, labs, and centers that have significant responsibilities under the plan. Its membership spans all five schools and the MIT Schwarzman College of Computing. Professors Noelle Selin and Anne White have agreed to co-chair the Climate Nucleus for a term of three years.

    “I am thrilled and grateful that Noelle and Anne have agreed to step up to this important task,” says Maria T. Zuber, MIT’s vice president for research. “Under their leadership, I’m confident that the Climate Nucleus will bring new ideas and new energy to making the strategy laid out in the climate action plan a reality.”

    The Climate Nucleus has broad responsibility for the management and implementation of the Fast Forward plan across its five areas of action: sparking innovation, educating future generations, informing and leveraging government action, reducing MIT’s own climate impact, and uniting and coordinating all of MIT’s climate efforts.

    Over the next few years, the nucleus will aim to advance MIT’s contribution to a two-track approach to decarbonizing the global economy, an approach described in the Fast Forward plan. First, humanity must go as far and as fast as it can to reduce greenhouse gas emissions using existing tools and methods. Second, societies need to invest in, invent, and deploy new tools — and promote new institutions and policies — to get the global economy to net-zero emissions by mid-century.

    The co-chairs of the nucleus bring significant climate and energy expertise, along with deep knowledge of the MIT community, to their task.

    Selin is a professor with joint appointments in the Institute for Data, Systems, and Society and the Department of Earth, Atmospheric and Planetary Sciences. She is also the director of the Technology and Policy Program. She began at MIT in 2007 as a postdoc with the Center for Global Change Science and the Joint Program on the Science and Policy of Global Change. Her research uses modeling to inform decision-making on air pollution, climate change, and hazardous substances.

    “Climate change affects everything we do at MIT. For the new climate action plan to be effective, the Climate Nucleus will need to engage the entire MIT community and beyond, including policymakers as well as people and communities most affected by climate change,” says Selin. “I look forward to helping to guide this effort.”

    White is the School of Engineering’s Distinguished Professor of Engineering and the head of the Department of Nuclear Science and Engineering. She joined the MIT faculty in 2009 and has also served as the associate director of MIT’s Plasma Science and Fusion Center. Her research focuses on assessing and refining the mathematical models used in the design of fusion energy devices, such as tokamaks, which hold promise for delivering limitless zero-carbon energy.

    “The latest IPCC report underscores the fact that we have no time to lose in decarbonizing the global economy quickly. This is a problem that demands we use every tool in our toolbox — and develop new ones — and we’re committed to doing that,” says White, referring to an August 2021 report from the Intergovernmental Panel on Climate Change, a UN climate science body, that found that climate change has already affected every region on Earth and is intensifying. “We must train future technical and policy leaders, expand opportunities for students to work on climate problems, and weave sustainability into every one of MIT’s activities. I am honored to be a part of helping foster this Institute-wide collaboration.”

    A first order of business for the Climate Nucleus will be standing up three working groups to address specific aspects of climate action at MIT: climate education, climate policy, and MIT’s own carbon footprint. The working groups will be responsible for making progress on their particular areas of focus under the plan and will make recommendations to the nucleus on ways of increasing MIT’s effectiveness and impact. The working groups will also include student, staff, and alumni members, so that the entire MIT community has the opportunity to contribute to the plan’s implementation.  

    The nucleus, in turn, will report and make regular recommendations to the Climate Steering Committee, a senior-level team consisting of Zuber; Richard Lester, the associate provost for international activities; Glen Shor, the executive vice president and treasurer; and the deans of the five schools and the MIT Schwarzman College of Computing. The new plan created the Climate Steering Committee to ensure that climate efforts will receive both the high-level attention and the resources needed to succeed.

    Together the new committees and working groups are meant to form a robust new infrastructure for uniting and coordinating MIT’s climate action efforts in order to maximize their impact. They replace the Climate Action Advisory Committee, which was created in 2016 following the release of MIT’s first climate action plan.

    In addition to Selin and White, the members of the Climate Nucleus are:

    Bob Armstrong, professor in the Department of Chemical Engineering and director of the MIT Energy Initiative;
    Dara Entekhabi, professor in the departments of Civil and Environmental Engineering and Earth, Atmospheric and Planetary Sciences;
    John Fernández, professor in the Department of Architecture and director of the Environmental Solutions Initiative;
    Stefan Helmreich, professor in the Department of Anthropology;
    Christopher Knittel, professor in the MIT Sloan School of Management and director of the Center for Energy and Environmental Policy Research;
    John Lienhard, professor in the Department of Mechanical Engineering and director of the Abdul Latif Jameel Water and Food Systems Lab;
    Julie Newman, director of the Office of Sustainability and lecturer in the Department of Urban Studies and Planning;
    Elsa Olivetti, professor in the Department of Materials Science and Engineering and co-director of the Climate and Sustainability Consortium;
    Christoph Reinhart, professor in the Department of Architecture and director of the Building Technology Program;
    John Sterman, professor in the MIT Sloan School of Management and director of the Sloan Sustainability Initiative;
    Rob van der Hilst, professor and head of the Department of Earth, Atmospheric and Planetary Sciences; and
    Chris Zegras, professor and head of the Department of Urban Studies and Planning.

  • in

    End-to-end supply chain transparency

    For years, companies have managed their extended supply chains with intermittent audits and certifications while attempting to persuade their suppliers to adhere to certain standards and codes of conduct. But they’ve lacked the concrete data necessary to prove their supply chains were working as they should. They most likely had baseline data about their suppliers — what they bought and who they bought it from — but knew little else about the rest of the supply chain.

    With Sourcemap, companies can now trace their supply chains from raw material to finished good with certainty, keeping track of the mines and farms that produce the commodities they rely on to take their goods to market. This unprecedented level of transparency provides Sourcemap’s customers with the assurance that the entire end-to-end supply chain operates within their standards while living up to social and environmental targets.

    And they’re doing it at scale for large multinationals across the food, agricultural, automotive, tech, and apparel industries. Thanks to Sourcemap founder and CEO Leonardo Bonanni MA ’03, SM ’05, PhD ’10, companies like VF Corporation (owner of brands such as Timberland and The North Face), Mars, Hershey, and Ferrero now have enough data to confidently tell the story of how they’re sourcing their raw materials.

    “Coming from the Media Lab, we recognized early on the power of the cloud, the power of social networking-type databases and smartphone diffusion around the world,” says Bonanni of his company’s MIT roots. Rather than providing intermittent glances at the supply chain via an auditor, Sourcemap collects data continuously, in real time, every step of the way, flagging anything that could indicate counterfeiting, adulteration, fraud, waste, or abuse.

    “We’ve taken our customers from a situation where they had very little control to a world where they have direct visibility over their entire global operations, even allowing them to see ahead of time — before a container reaches the port — whether there is any indication that there might be something wrong with it,” says Bonanni.

    The key problem Sourcemap addresses is a lack of data in companies’ supply chain management databases. According to Bonanni, most Sourcemap customers have invested millions of dollars in enterprise resource planning (ERP) databases, which provide information about internal operations and direct suppliers, but fall short when it comes to global operations, where their secondary and tertiary suppliers operate. Built on relational databases, ERP systems have been around for more than 40 years and work well for simple, static data structures. But they aren’t agile enough to handle big data and rapidly evolving, complex data structures.

    Sourcemap, on the other hand, uses NoSQL (non-relational) database technology, which is more flexible, cost-efficient, and scalable. “Our platform is like a LinkedIn for the supply chain,” explains Bonanni. Customers provide information about where they buy their raw materials. Their suppliers are then invited to the network and provide information to validate those relationships, right down to the farms and the mines where the raw materials are extracted, which is often where the biggest risks lie.
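
    One way to picture this kind of supplier network is as loosely structured records that link each company to the suppliers it buys from. The sketch below is purely hypothetical: the record layout, field names, and companies are invented for illustration and do not reflect Sourcemap's actual schema or software.

```python
# Hypothetical document-style records; all names and fields are invented.
# Records need not share the same fields, which is the flexibility
# non-relational stores are known for.
SUPPLIERS = {
    "chocolate_brand": {"type": "brand", "buys_from": ["processor_1"]},
    "processor_1": {"type": "factory", "buys_from": ["warehouse_1"]},
    "warehouse_1": {"type": "warehouse", "buys_from": ["farm_1", "farm_2"]},
    "farm_1": {"type": "farm", "buys_from": [], "certified": True},
    "farm_2": {"type": "farm", "buys_from": []},
}


def raw_material_sources(company, records, seen=None):
    """Follow buys_from links until reaching suppliers that buy from no one."""
    seen = set() if seen is None else seen
    if company in seen:  # guard against circular supplier relationships
        return []
    seen.add(company)
    upstream = records[company]["buys_from"]
    if not upstream:
        return [company]
    sources = []
    for supplier in upstream:
        sources.extend(raw_material_sources(supplier, records, seen))
    return sources


print(raw_material_sources("chocolate_brand", SUPPLIERS))
```

    Starting from the brand, the walk passes through the factory and warehouse and ends at the two farms, mirroring the idea of tracing a product back to where its raw materials originate.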

    Initially, the entire supply chain database of a Sourcemap customer might amount to a few megabytes of spreadsheets listing their purchase orders and the names of their suppliers. Sourcemap delivers terabytes of data that paint a detailed picture of the supply chain, capturing everything, right down to the moment a farmer in West Africa delivers cocoa beans to a warehouse, onto a truck heading to a port, to a factory, all the way to the finished goods.

    “We’ve seen the amount of data collected grow by a factor of 1 million, which tells us that the world is finally ready for full visibility of supply chains,” says Bonanni. “The fact is that we’ve seen supply chain transparency go from a fringe concern to a broad-based requirement as a license to operate in most of Europe and North America.”

    These days, disruptions in supply chains, combined with price volatility and new laws requiring companies to prove that the goods they import were not made illegally (such as by causing deforestation or involving forced or child labor), mean that companies are often required to know where they source their raw materials from, even if they only import the materials through an intermediary.

    Sourcemap uses its full suite of tools to walk customers through a step-by-step process that maps their suppliers while measuring performance, ultimately verifying the entire supply chain and providing them with the confidence to import goods while being customs-compliant. At the end of the day, Sourcemap customers can communicate to their stakeholders and the end consumer exactly where their commodities come from while ensuring that social, environmental, and compliance standards are met.

    The company was recently named to the newest cohort of firms honored by the MIT Startup Exchange (STEX) as STEX25 startups. Bonanni is quick to point out the benefits of STEX and of MIT’s Industrial Liaison Program (ILP): “Our best feedback and our most constructive relationships have been with companies that sponsored our research early on at the Media Lab and ILP,” he says. “The innovative exchange of ideas inherent in the MIT startup ecosystem has helped to build up Sourcemap as a company and to grow supply chain transparency as a future-facing technology that more and more companies are now scrambling to adopt.”