More stories

  • Q&A: More-sustainable concrete with machine learning

    As a building material, concrete withstands the test of time. Its use dates back to early civilizations, and today it is the most popular composite choice in the world. However, it’s not without its faults. Production of its key ingredient, cement, contributes 8-9 percent of global anthropogenic CO2 emissions and 2-3 percent of global energy consumption, shares that are only projected to increase in the coming years. With United States infrastructure aging, the federal government recently passed a milestone bill to revitalize and upgrade it, alongside a push to reduce greenhouse gas emissions wherever possible, putting concrete in the crosshairs for modernization, too.

    Elsa Olivetti, the Esther and Harold E. Edgerton Associate Professor in the MIT Department of Materials Science and Engineering, and Jie Chen, MIT-IBM Watson AI Lab research scientist and manager, think artificial intelligence can help meet this need by designing and formulating new, more sustainable concrete mixtures with lower costs and carbon dioxide emissions, improved material performance, and reuse of manufacturing byproducts in the material itself. Olivetti’s research focuses on improving the environmental and economic sustainability of materials, and Chen develops and optimizes machine learning and computational techniques, which he can apply to materials reformulation. Olivetti and Chen, along with their collaborators, have recently teamed up on an MIT-IBM Watson AI Lab project to make concrete more sustainable for the benefit of society, the climate, and the economy.

    Q: What applications does concrete have, and what properties make it a preferred building material?

    Olivetti: Concrete is the dominant building material globally with an annual consumption of 30 billion metric tons. That is over 20 times the next most produced material, steel, and the scale of its use leads to considerable environmental impact, approximately 5-8 percent of global greenhouse gas (GHG) emissions. It can be made locally, has a broad range of structural applications, and is cost-effective. Concrete is a mixture of fine and coarse aggregate, water, cement binder (the glue), and other additives.

    Q: Why isn’t it sustainable, and what research problems are you trying to tackle with this project?

    Olivetti: The community is working on several ways to reduce the impact of this material, including the use of alternative fuels to heat the cement mixture, improved energy and materials efficiency, and carbon sequestration at production facilities. But one important opportunity is to develop an alternative to the cement binder.

    While cement is 10 percent of the concrete mass, it accounts for 80 percent of the GHG footprint. This impact comes not only from the fuel burned to heat and drive the chemical reaction required in manufacturing, but also from the reaction itself, which releases CO2 through the calcination of limestone. Therefore, partially replacing the input ingredients to cement (traditionally ordinary Portland cement, or OPC) with alternative materials from wastes and byproducts can reduce the GHG footprint. But the use of these alternatives is not inherently more sustainable, because wastes might have to travel long distances, which adds fuel emissions and cost, or might require pretreatment processes. The optimal way to make use of these alternate materials will be situation-dependent. And because of the vast scale, we also need solutions that account for the huge volumes of concrete required. This project is trying to develop novel concrete mixtures that decrease the GHG impact of cement and concrete, moving away from trial-and-error processes toward ones that are more predictive.

    Chen: If we want to fight climate change and make our environment better, are there alternative ingredients or a reformulation we could use so that less greenhouse gas is emitted? We hope that through this project using machine learning we’ll be able to find a good answer.

    Q: Why is this problem important to address now, at this point in history?

    Olivetti: There is an urgent need to address greenhouse gas emissions as aggressively as possible, and the road to doing so isn’t necessarily straightforward for all areas of industry. For transportation and electricity generation, paths to decarbonize those sectors have been identified. We need to move much more aggressively to achieve them in the time required, but at least the technological approaches are relatively clear. However, for tough-to-decarbonize sectors, such as industrial materials production, the pathways to decarbonization are not as well mapped out.

    Q: How are you planning to address this problem to produce better concrete?

    Olivetti: The goal is to predict mixtures that meet performance criteria, such as strength and durability, while also balancing economic and environmental impact. A key to this is using industrial wastes in blended cements and concretes. To do this, we need to understand the glass and mineral reactivity of the constituent materials. This reactivity not only determines the limits of their possible use in cement systems but also controls concrete processing and the development of strength and pore structure, which ultimately govern concrete durability and life-cycle CO2 emissions.

    Chen: We investigate using waste materials to replace part of the cement component. This is something we’ve hypothesized would be more sustainable and economical, since waste materials are abundant and cost less. Because of the reduction in the use of cement, the final concrete product would be responsible for much less carbon dioxide production. Figuring out the right mixture proportions that make durable concretes while achieving other goals is a very challenging problem. Machine learning gives us an opportunity to apply advances in predictive modeling, uncertainty quantification, and optimization to solve it. What we are doing is exploring options using deep learning as well as multi-objective optimization techniques to find an answer. These efforts are now more feasible to carry out, and they will produce results with the reliability estimates we need to understand what makes a good concrete.

    Q: What kinds of AI and computational techniques are you employing for this?

    Olivetti: We use AI techniques to collect data on individual concrete ingredients, mix proportions, and concrete performance from the literature through natural language processing. We also add data obtained from industry and/or high-throughput atomistic modeling and experiments to optimize the design of concrete mixtures. We then use this information to develop insight into the reactivity of possible waste and byproduct materials as alternatives to cement materials for low-CO2 concrete. By incorporating generic information on concrete ingredients, the resulting concrete performance predictors are expected to be more reliable and transformative than existing AI models.

    Chen: The final objective is to figure out what constituents, and how much of each, to put into the recipe for producing the concrete that optimizes the various factors: strength, cost, environmental impact, performance, etc. For each of the objectives, we need certain models: We need a model to predict the performance of the concrete (like, how long does it last and how much weight does it sustain?), a model to estimate the cost, and a model to estimate how much carbon dioxide is generated. We will need to build these models by using data from literature, from industry, and from lab experiments.

    We are exploring Gaussian process models to predict the concrete strength, going forward into days and weeks. This model can give us an uncertainty estimate of the prediction as well. Such a model requires the specification of parameters, which we will calculate using another model. At the same time, we also explore neural network models because we can inject domain knowledge from human experience into them. Some models are as simple as multi-layer perceptrons, while some are more complex, like graph neural networks. The goal here is that we want to have a model that is not only accurate but also robust — the input data is noisy, and the model must embrace the noise, so that its prediction is still accurate and reliable for the multi-objective optimization.
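
    As a rough illustration of the Gaussian process idea Chen describes, here is a minimal sketch in Python; the mixture features, synthetic data, and kernel choices are assumptions for illustration, not the lab’s actual model:

    ```python
    # Illustrative sketch: predict concrete compressive strength from mixture
    # proportions with a Gaussian process, which also returns an uncertainty
    # estimate for each prediction. Features, data, and kernel are assumptions.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)

    # Hypothetical training data: [cement, water, fly_ash, aggregate] in kg/m^3
    # plus curing age in days; target is compressive strength in MPa.
    X_train = rng.uniform([200, 140, 0, 1600, 1], [400, 200, 150, 2000, 90], size=(200, 5))
    y_train = (0.12 * X_train[:, 0] - 0.15 * X_train[:, 1]
               + 0.05 * X_train[:, 2] + 4.0 * np.log(X_train[:, 4])
               + rng.normal(0, 2.0, size=200))          # noisy synthetic strengths

    # RBF kernel for smooth trends plus a white-noise term for measurement noise.
    kernel = 1.0 * RBF(length_scale=[100, 30, 50, 200, 30]) + WhiteKernel(1.0)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_train, y_train)

    # Predict strength for a candidate mixture at 28 days, with a standard
    # deviation that downstream optimization can treat as uncertainty.
    candidate = np.array([[300, 165, 80, 1850, 28]])
    mean, std = gp.predict(candidate, return_std=True)
    print(f"predicted strength: {mean[0]:.1f} MPa +/- {std[0]:.1f}")
    ```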

    Once we have built models that we are confident with, we will inject their predictions and uncertainty estimates into the optimization of multiple objectives, under constraints and under uncertainties.

    Q: How do you balance cost-benefit trade-offs?

    Chen: The multiple objectives we consider are not necessarily consistent, and sometimes they are at odds with each other. The goal is to identify scenarios where the values for our objectives cannot all be pushed further without compromising one or a few. For example, if you want to further reduce the cost, you probably have to sacrifice performance or accept a greater environmental impact. Eventually, we will give the results to policymakers, who will look into them and weigh the options. For example, they may be able to tolerate a slightly higher cost in exchange for a significant reduction in greenhouse gas emissions. Alternatively, if the cost varies little but the concrete performance changes drastically, say, doubles or triples, then this is definitely a favorable outcome.
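
    As a rough illustration of the trade-off analysis Chen describes, the sketch below filters a set of hypothetical candidate mixtures down to the non-dominated (Pareto-optimal) ones; the objectives and data are invented for illustration:

    ```python
    # Illustrative Pareto filter over three objectives we want to minimize:
    # cost, CO2 emissions, and negative strength (so higher strength is better).
    import numpy as np

    def pareto_front(objectives: np.ndarray) -> np.ndarray:
        """Return a boolean mask of non-dominated rows (all objectives minimized)."""
        n = objectives.shape[0]
        keep = np.ones(n, dtype=bool)
        for i in range(n):
            if not keep[i]:
                continue
            # A point is dominated if some other point is <= on every objective
            # and strictly < on at least one.
            dominated = np.all(objectives <= objectives[i], axis=1) & \
                        np.any(objectives < objectives[i], axis=1)
            if dominated.any():
                keep[i] = False
        return keep

    rng = np.random.default_rng(1)
    # Columns: [cost, CO2, -strength], all hypothetical.
    candidates = rng.uniform([50, 100, -80], [120, 400, -20], size=(500, 3))
    mask = pareto_front(candidates)
    print(f"{mask.sum()} non-dominated mixtures out of {len(candidates)}")
    ```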

    Q: What kinds of challenges do you face in this work?

    Chen: The data we get, either from industry or from the literature, are very noisy; concrete measurements can vary a lot depending on where and when they are taken. There is also a substantial amount of missing data when we integrate records from different sources, so we need to spend a lot of effort organizing the data and making them usable for building and training machine learning models. We also explore imputation techniques that substitute missing features, as well as models that tolerate missing features, in our predictive modeling and uncertainty estimation.
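
    A hedged sketch of the kind of imputation step mentioned above follows; the columns, values, and missingness pattern are invented, and the imputer choice is only one of many options:

    ```python
    # Illustrative handling of missing features before model training:
    # impute missing mixture fields from the observed ones.
    import numpy as np
    import pandas as pd
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    # Hypothetical records merged from several literature sources, with gaps.
    records = pd.DataFrame({
        "cement_kg_m3": [320, 280, np.nan, 350],
        "water_kg_m3": [160, np.nan, 175, 150],
        "fly_ash_kg_m3": [60, 90, 120, np.nan],
        "strength_28d_mpa": [42.0, 38.5, np.nan, 47.1],
    })

    # Model each column as a function of the others and fill in the gaps.
    imputer = IterativeImputer(random_state=0, max_iter=20)
    completed = pd.DataFrame(imputer.fit_transform(records), columns=records.columns)
    print(completed.round(1))
    ```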

    Q: What do you hope to achieve through this work?

    Chen: In the end, we are suggesting either one or a few concrete recipes, or a continuum of recipes, to manufacturers and policymakers. We hope that this will provide invaluable information both for the construction industry and for the effort to protect our beloved Earth.

    Olivetti: We’d like to develop a robust way to design cements that make use of waste materials to lower their CO2 footprint. Nobody is trying to make waste, so we can’t rely on any one stream as a feedstock if we want this to be massively scalable. We have to be flexible and robust enough to shift with changing feedstocks, and for that we need improved understanding. Our approach to developing local, dynamic, and flexible alternatives is to learn what makes these wastes reactive, so we know how to optimize their use and do so as broadly as possible. We do that through predictive model development, using software we have developed in my group to automatically extract data from more than 5 million texts and patents on various topics. We link this to the creative capabilities of our IBM collaborators to design methods that predict the final impact of new cements. If we are successful, we can lower the emissions of this ubiquitous material and play our part in achieving carbon emissions mitigation goals.

    Other researchers involved with this project include Stefanie Jegelka, the X-Window Consortium Career Development Associate Professor in the MIT Department of Electrical Engineering and Computer Science; Richard Goodwin, IBM principal researcher; Soumya Ghosh, MIT-IBM Watson AI Lab research staff member; and Kristen Severson, former research staff member. Collaborators included Nghia Hoang, former research staff member with MIT-IBM Watson AI Lab and IBM Research; and Jeremy Gregory, research scientist in the MIT Department of Civil and Environmental Engineering and executive director of the MIT Concrete Sustainability Hub.

    This research is supported by the MIT-IBM Watson AI Lab.

  • Community policing in the Global South

    Community policing is meant to combat citizen mistrust of the police force. The concept was developed in the mid-20th century to help officers become part of the communities they are responsible for. The hope was that such presence would create a partnership between citizens and the police force, leading to reduced crime and increased trust. Studies in the 1990s from the United States, United Kingdom, and Australia showed that these goals can be achieved in certain circumstances. Many metropolitan areas in the Global North have since incorporated community policing into their practices.

    But a recently published study of six different sites in the Global South showed no significant positive effect associated with community policing across a range of countries.

    “We found no reduction in crime or insecurity in these communities, and no increase in trust in the police,” says Fotini Christia, an author of the paper, which was published in Science. Christia is the Ford International Professor in the Social Sciences at MIT and the director of the Sociotechnical Systems Research Center (SSRC) within the Institute for Data, Systems, and Society (IDSS). She was one of three on the steering committee for the research, which also included lead author Graeme Blair at the University of California at Los Angeles and Jeremy Weinstein at Stanford University. Fellow MIT political scientist Lily Tsai was also a co-author on the paper.

    In this study, randomized-control trials of community policing initiatives were implemented at sites in Santa Catarina State, Brazil; Medellín, Colombia; Monrovia, Liberia; Sorsogon Province, Philippines; Ugandan rural areas; and two Punjab Province districts in Pakistan. Each suite of interventions was developed based on the needs of the area but consisted of core elements of community policing such as officer recruitment and training, foot patrols, town hall meetings, and problem-oriented policing. The work was done by a collaboration of several social scientists in the United States and abroad. Major funding for this project was provided by the UK Foreign, Commonwealth and Development Office, awarded through the Evidence in Governance and Politics network.

    The null results were determined after interviewing 18,382 citizens and 874 police officers involved in the experiment over six years.

    The strength of these results lies in the size of the collaboration and the care taken in the research design. Input from researchers representing 22 different departments from universities around the world allowed for a broad diversity of study sites across the Global South. And the study was preregistered to establish a common approach to measurement and indicate exactly which effects the researchers were tracking, to avoid any chance of mining the data to find positive effects.

    “This is a pathbreaking study across a diverse set of sites that provides a new understanding about community policing outside of the Western world,” says Christopher Winship, the Diker-Tishman Professor of Sociology at Harvard University, who was not an author on the paper.

    Structural overhaul

    The reasons for the failure of community policing to elicit positive results were as varied as the sites themselves, but an important commonality was difficulties in implementation.

    “We saw three common problems: limited resources, a lack of prioritization of the reform, and rapid rotation of officers,” says Blair. “These challenges lead to weaker implementation of community policing than we’ve seen in ‘success stories’ in the U.S. and may explain why community policing didn’t deliver the same results in these Global South contexts.”

    Citizen attendance at community meetings was variable. And resources dedicated to following up on problems identified by citizens were scarce. Police officers in the countries represented in the study are often overstretched, leaving them unable to adequately follow up on their community policing duties.

    For example, Ugandan police stations averaged just one motorbike per station, and outposts averaged less than one. At the study sites in Pakistan, fewer than 25 percent of issues that arose in community meetings were followed up on. The police officers tried to push the problems through to other agencies that could assist, but those agencies were also underresourced.

    There was also significant officer turnover. “In many places, we started with and trained one group of officers and ended with a completely different set of folks,” says Christia.

    In the Philippines, only 25 percent of officers were still in the same post 11 months after the start of the study. Not only does that rate of turnover make it difficult to train new recruits in the methods of community policing, it also makes it extremely difficult to build community respect for and familiarity with officers.

    Even in the Global North, the success of community policing can vary. As part of their study, the researchers conducted a review of 43 existing randomized trials conducted since the 1970s to determine the success rate of community policing endeavors already in place.

    They found that in these initiatives, problem-oriented policing reduces crime and likely improves perceptions of safety in a community, but there is mixed-to-negative evidence on the benefits of police presence on crime and perceptions of police. 

    That these initiatives struggle to achieve consistently positive results in countries with better resources indicates there is significant work to be done before success can be achieved in the Global South. Improvements in policing in the Global South may require major structural overhauls of the systems to ensure resource availability, encourage community engagement, and enhance officers’ abilities to follow up on issues of concern.

    “Issues of crime and violence are at the top of the policy agenda in the Global South, and this research demonstrates how universities and government partners can work together to identify the most effective strategies for improving people’s sense of safety,” says Weinstein. “While community policing strategies didn’t deliver the anticipated results on their own, the challenges in implementation point to the need for more systemic reforms that provide the necessary resources and align incentives for police to respond to citizens’ primary concerns.”

  • The reasons behind lithium-ion batteries’ rapid cost decline

    Lithium-ion batteries, those marvels of lightweight power that have made possible today’s age of handheld electronics and electric vehicles, have plunged in cost since their introduction three decades ago at a rate similar to the drop in solar panel prices, as documented by a study published last March. But what brought about such an astonishing cost decline, of about 97 percent?

    Some of the researchers behind that earlier study have now analyzed what accounted for the extraordinary savings. They found that by far the biggest factor was work on research and development, particularly in chemistry and materials science. This outweighed the gains achieved through economies of scale, though that turned out to be the second-largest category of reductions.

    The new findings are being published today in the journal Energy and Environmental Science, in a paper by MIT postdoc Micah Ziegler, recent graduate student Juhyun Song PhD ’19, and Jessika Trancik, a professor in MIT’s Institute for Data, Systems, and Society.

    The findings could be useful for policymakers and planners to help guide spending priorities in order to continue the pathway toward ever-lower costs for this and other crucial energy storage technologies, according to Trancik. Their work suggests that there is still considerable room for further improvement in electrochemical battery technologies, she says.

    The analysis required digging through a variety of sources, since much of the relevant information consists of closely held proprietary business data. “The data collection effort was extensive,” Ziegler says. “We looked at academic articles, industry and government reports, press releases, and specification sheets. We even looked at some legal filings that came out. We had to piece together data from many different sources to get a sense of what was happening.” He says they collected “about 15,000 qualitative and quantitative data points, across 1,000 individual records from approximately 280 references.”

    Data from the earliest times are hardest to access and can have the greatest uncertainties, Trancik says, but by comparing different data sources from the same period they have attempted to account for these uncertainties.

    Overall, she says, “we estimate that the majority of the cost decline, more than 50 percent, came from research-and-development-related activities.” That included both private sector and government-funded research and development, and “the vast majority” of that cost decline within that R&D category came from chemistry and materials research.

    That was an interesting finding, she says, because “there were so many variables that people were working on through very different kinds of efforts,” including the design of the battery cells themselves, their manufacturing systems, supply chains, and so on. “The cost improvement emerged from a diverse set of efforts and many people, and not from the work of only a few individuals.”

    The findings about the importance of investment in R&D were especially significant, Ziegler says, because much of this investment happened after lithium-ion battery technology was commercialized, a stage at which some analysts thought the research contribution would become less significant. Over roughly a 20-year period starting five years after the batteries’ introduction in the early 1990s, he says, “most of the cost reduction still came from R&D. The R&D contribution didn’t end when commercialization began. In fact, it was still the biggest contributor to cost reduction.”

    The study took advantage of an analytical approach that Trancik and her team initially developed to analyze the similarly precipitous drop in costs of silicon solar panels over the last few decades. They also applied the approach to understand the rising costs of nuclear energy. “This is really getting at the fundamental mechanisms of technological change,” she says. “And we can also develop these models looking forward in time, which allows us to uncover the levers that people could use to improve the technology in the future.”

    One advantage of the methodology Trancik and her colleagues have developed, she says, is that it helps to sort out the relative importance of different factors when many variables are changing all at once, which typically happens as a technology improves. “It’s not simply adding up the cost effects of these variables,” she says, “because many of these variables affect many different cost components. There’s this kind of intricate web of dependencies.” But the team’s methodology makes it possible to “look at how that overall cost change can be attributed to those variables, by essentially mapping out that network of dependencies,” she says.

    This can help provide guidance on public spending, private investments, and other incentives. “What are all the things that different decision makers could do?” she asks. “What decisions do they have agency over so that they could improve the technology, which is important in the case of low-carbon technologies, where we’re looking for solutions to climate change and we have limited time and limited resources? The new approach allows us to potentially be a bit more intentional about where we make those investments of time and money.”

    “This paper collects data available in a systematic way to determine changes in the cost components of lithium-ion batteries between 1990-1995 and 2010-2015,” says Laura Diaz Anadon, a professor of climate change policy at Cambridge University, who was not connected to this research. “This period was an important one in the history of the technology, and understanding the evolution of cost components lays the groundwork for future work on mechanisms and could help inform research efforts in other types of batteries.”

    The research was supported by the Alfred P. Sloan Foundation, the Environmental Defense Fund, and the MIT Technology and Policy Program.

  • Design’s new frontier

    In the 1960s, the advent of computer-aided design (CAD) sparked a revolution in design. For his PhD thesis in 1963, MIT Professor Ivan Sutherland developed Sketchpad, a game-changing software program that enabled users to draw, move, and resize shapes on a computer. Over the next few decades, CAD software reshaped how everything from consumer products to buildings and airplanes was designed.

    “CAD was part of the first wave in computing in design. The ability of researchers and practitioners to represent and model designs using computers was a major breakthrough and still is one of the biggest outcomes of design research, in my opinion,” says Maria Yang, Gail E. Kendall Professor and director of MIT’s Ideation Lab.

    Innovations in 3D printing during the 1980s and 1990s expanded CAD’s capabilities beyond traditional injection molding and casting methods, providing designers even more flexibility. Designers could sketch, ideate, and develop prototypes or models faster and more efficiently. Meanwhile, with the push of a button, software like that developed by Professor Emeritus David Gossard of MIT’s CAD Lab could solve equations simultaneously to produce a new geometry on the fly.

    In recent years, mechanical engineers have expanded the computing tools they use to ideate, design, and prototype. More sophisticated algorithms and the explosion of machine learning and artificial intelligence technologies have sparked a second revolution in design engineering.

    Researchers and faculty at MIT’s Department of Mechanical Engineering are utilizing these technologies to re-imagine how the products, systems, and infrastructures we use are designed. These researchers are at the forefront of the new frontier in design.

    Computational design

    Faez Ahmed wants to reinvent the wheel, or at least the bicycle wheel. He and his team at MIT’s Design Computation & Digital Engineering Lab (DeCoDE) use an artificial intelligence-driven design method that can generate entirely novel and improved designs for a range of products — including the traditional bicycle. They create advanced computational methods to blend human-driven design with simulation-based design.

    “The focus of our DeCoDE lab is computational design. We are looking at how we can create machine learning and AI algorithms to help us discover new designs that are optimized based on specific performance parameters,” says Ahmed, an assistant professor of mechanical engineering at MIT.

    For their work using AI-driven design for bicycles, Ahmed and his collaborator Professor Daniel Frey wanted to make it easier to design customizable bicycles, and by extension, encourage more people to use bicycles over transportation methods that emit greenhouse gases.

    To start, the group gathered a dataset of 4,500 bicycle designs. Using this massive dataset, they tested the limits of what machine learning could do. First, they developed algorithms to group bicycles that looked similar together and explore the design space. They then created machine learning models that could successfully predict what components are key in identifying a bicycle style, such as a road bike versus a mountain bike.
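
    A loose sketch of those two steps, clustering similar designs and then predicting a bicycle’s style from its parameters, is shown below; the features, labels, and models are invented stand-ins, not the DeCoDE lab’s dataset or code:

    ```python
    # Illustrative pipeline: group bicycle designs by geometry, then train a
    # classifier to predict style (e.g., road vs. mountain) from components.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)

    # Hypothetical parametric features: [wheel_diameter_mm, tire_width_mm,
    # head_tube_angle_deg, handlebar_drop_mm]
    X = np.vstack([
        rng.normal([622, 25, 73, 120], [5, 3, 1, 15], size=(300, 4)),   # road-like
        rng.normal([584, 58, 66, 0], [5, 6, 1.5, 10], size=(300, 4)),   # mountain-like
    ])
    y = np.array(["road"] * 300 + ["mountain"] * 300)

    # Step 1: unsupervised grouping to explore the design space.
    clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print("cluster sizes:", np.bincount(clusters))

    # Step 2: supervised model that identifies which components signal a style.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print("style accuracy:", clf.score(X_te, y_te))
    print("feature importances:", clf.feature_importances_.round(2))
    ```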

    Once the algorithms were good enough at identifying bicycle designs and parts, the team proposed novel machine learning tools that could use this data to create a unique and creative design for a bicycle based on certain performance parameters and rider dimensions.

    Ahmed used a generative adversarial network — or GAN — as the basis of this model. GAN models utilize neural networks that can create new designs based on vast amounts of data. However, using GAN models alone would result in homogeneous designs that lack novelty and can’t be assessed in terms of performance. To address these issues in design problems, Ahmed has developed a new method he calls “PaDGAN,” short for performance-augmented diverse GAN.

    “When we apply this type of model, what we see is that we can get large improvements in the diversity, quality, as well as novelty of the designs,” Ahmed explains.
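
    The sketch below illustrates the general idea in PyTorch: a standard generator/discriminator pair with an extra term that rewards diverse, high-performing samples. It is a generic illustration of the concept, with an invented performance score, and is not the published PaDGAN implementation:

    ```python
    # Minimal GAN-style sketch: a generator proposes design parameter vectors
    # and a discriminator judges realism; an extra term nudges the generator
    # toward diverse, high-scoring designs.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    DIM = 8          # number of design parameters per bicycle (assumed)
    NOISE = 16

    gen = nn.Sequential(nn.Linear(NOISE, 64), nn.ReLU(), nn.Linear(64, DIM))
    disc = nn.Sequential(nn.Linear(DIM, 64), nn.ReLU(), nn.Linear(64, 1))

    def performance(designs):
        # Stand-in for a learned performance predictor (higher is better).
        return -((designs - 0.5) ** 2).sum(dim=1)

    def diversity(designs):
        # Mean pairwise distance within the batch; larger means more diverse.
        return torch.cdist(designs, designs).mean()

    real_data = torch.rand(1024, DIM)          # placeholder for the design dataset
    opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()

    for step in range(2000):
        real = real_data[torch.randint(0, 1024, (64,))]
        fake = gen(torch.randn(64, NOISE))

        # Discriminator: distinguish real designs from generated ones.
        d_loss = bce(disc(real), torch.ones(64, 1)) + \
                 bce(disc(fake.detach()), torch.zeros(64, 1))
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()

        # Generator: fool the discriminator, plus reward performance and diversity.
        g_loss = bce(disc(fake), torch.ones(64, 1)) \
                 - 0.1 * performance(fake).mean() - 0.1 * diversity(fake)
        opt_g.zero_grad()
        g_loss.backward()
        opt_g.step()
    ```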

    Using this approach, Ahmed’s team developed an open-source computational design tool for bicycles freely available on their lab website. They hope to further develop a set of generalizable tools that can be used across industries and products.

    Longer term, Ahmed has his sights set on loftier goals. He hopes the computational design tools he develops could lead to “design democratization,” putting more power in the hands of the end user.

    “With these algorithms, you can have more individualization where the algorithm assists a customer in understanding their needs and helps them create a product that satisfies their exact requirements,” he adds.

    Using algorithms to democratize the design process is a goal shared by Stefanie Mueller, an associate professor in electrical engineering and computer science and mechanical engineering.

    Personal fabrication

    Platforms like Instagram give users the freedom to instantly edit their photographs or videos using filters. In one click, users can alter the palette, tone, and brightness of their content by applying filters that range from bold colors to sepia-toned or black-and-white. Mueller, X-Window Consortium Career Development Professor, wants to bring this concept of the Instagram filter to the physical world.

    “We want to explore how digital capabilities can be applied to tangible objects. Our goal is to bring reprogrammable appearance to the physical world,” explains Mueller, director of the HCI Engineering Group based out of MIT’s Computer Science and Artificial Intelligence Laboratory.

    Mueller’s team utilizes a combination of smart materials, optics, and computation to advance personal fabrication technologies that would allow end users to alter the design and appearance of the products they own. They tested this concept in a project they dubbed “PhotoChromeleon.”

    First, a mix of photochromic cyan, magenta, and yellow dyes is airbrushed onto an object — in this instance, a 3D sculpture of a chameleon. Using software they developed, the team sketches the exact color pattern they want to achieve on the object itself. An ultraviolet light shines on the object to activate the dyes.

    To actually create the physical pattern on the object, Mueller has developed an optimization algorithm to use alongside a normal office projector outfitted with red, green, and blue LED lights. These lights shine on specific pixels on the object for a given period of time to physically change the makeup of the photochromic pigments.

    “This fancy algorithm tells us exactly how long we have to shine the red, green, and blue light on every single pixel of an object to get the exact pattern we’ve programmed in our software,” says Mueller.
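
    A toy version of that calculation is sketched below, under the strong simplifying assumption that each LED channel bleaches each dye at a constant, known exponential rate; the rate constants and target color are made up, and the real system solves a far more detailed optical model:

    ```python
    # Toy per-pixel calculation: find how long to shine the red, green, and blue
    # LEDs so the remaining cyan/magenta/yellow dye fractions match a target
    # color. Assumes exponential bleaching with made-up rate constants.
    import numpy as np

    # k[d][c]: assumed bleaching rate of dye d (C, M, Y) under LED channel c (R, G, B).
    k = np.array([
        [0.9, 0.1, 0.05],   # cyan bleaches mostly under red light
        [0.1, 0.8, 0.1],    # magenta mostly under green
        [0.05, 0.1, 0.85],  # yellow mostly under blue
    ])

    def exposure_times(target_cmy):
        """Solve k @ t = -ln(target) for nonnegative exposure times t (seconds)."""
        target = np.clip(np.asarray(target_cmy, dtype=float), 1e-3, 1.0)
        t, *_ = np.linalg.lstsq(k, -np.log(target), rcond=None)
        return np.clip(t, 0.0, None)

    # Desired remaining dye fractions for one pixel (1.0 = fully saturated).
    print(exposure_times([0.2, 0.7, 0.4]).round(2))  # seconds of R, G, B exposure
    ```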

    Giving this freedom to the end user enables limitless possibilities. Mueller’s team has applied this technology to iPhone cases, shoes, and even cars. In the case of shoes, Mueller envisions a shoebox embedded with UV and LED light projectors. Users could put their shoes in the box overnight and the next day have a pair of shoes in a completely new pattern.

    Mueller wants to expand her personal fabrication methods to the clothes we wear. Rather than utilize the light projection technique developed in the PhotoChromeleon project, her team is exploring the possibility of weaving LEDs directly into clothing fibers, allowing people to change their shirt’s appearance as they wear it. These personal fabrication technologies could completely alter consumer habits.

    “It’s very interesting for me to think about how these computational techniques will change product design on a high level,” adds Mueller. “In the future, a consumer could buy a blank iPhone case and update the design on a weekly or daily basis.”

    Computational fluid dynamics and participatory design

    Another team of mechanical engineers, including Sili Deng, the Brit (1961) & Alex (1949) d’Arbeloff Career Development Professor, are developing a different kind of design tool that could have a large impact on individuals in low- and middle-income countries across the world.

    As Deng walked down the hallway of Building 1 on MIT’s campus, a monitor playing a video caught her eye. The video featured work done by mechanical engineers and MIT D-Lab on developing cleaner burning briquettes for cookstoves in Uganda. Deng immediately knew she wanted to get involved.

    “As a combustion scientist, I’ve always wanted to work on such a tangible real-world problem, but the field of combustion tends to focus more heavily on the academic side of things,” explains Deng.

    After reaching out to colleagues in MIT D-Lab, Deng joined a collaborative effort to develop a new cookstove design tool for the 3 billion people across the world who burn solid fuels to cook and heat their homes. These stoves often emit soot and carbon monoxide, leading not only to millions of deaths each year but also to a worsening of the world’s greenhouse gas emissions problem.

    The team is taking a three-pronged approach to developing this solution, using a combination of participatory design, physical modeling, and experimental validation to create a tool that will lead to the production of high-performing, low-cost energy products.

    Deng and her team in the Deng Energy and Nanotechnology Group use physics-based modeling for the combustion and emission process in cookstoves.

    “My team is focused on computational fluid dynamics. We use computational and numerical studies to understand the flow field where the fuel is burned and releases heat,” says Deng.

    These flow mechanics are crucial to understanding how to minimize heat loss and make cookstoves more efficient, as well as learning how dangerous pollutants are formed and released in the process.

    Using computational methods, Deng’s team performs three-dimensional simulations of the complex chemistry and transport coupling at play in the combustion and emission processes. They then use these simulations to build a combustion model for how fuel is burned and a pollution model that predicts carbon monoxide emissions.

    Deng’s models are used by a group led by Daniel Sweeney in MIT D-Lab to carry out experimental validation on prototype stoves. Finally, Professor Maria Yang uses participatory design methods to integrate user feedback, ensuring the design tool can actually be used by people across the world.

    The end goal for this collaborative team is to not only provide local manufacturers with a prototype they could produce themselves, but to also provide them with a tool that can tweak the design based on local needs and available materials.

    Deng sees wide-ranging applications for the computational fluid dynamics her team is developing.

    “We see an opportunity to use physics-based modeling, augmented with a machine learning approach, to come up with chemical models for practical fuels that help us better understand combustion. Therefore, we can design new methods to minimize carbon emissions,” she adds.

    While Deng is utilizing simulations and machine learning at the molecular level to improve designs, others are taking a more macro approach.

    Designing intelligent systems

    When it comes to intelligent design, Navid Azizan thinks big. He hopes to help create future intelligent systems that are capable of making decisions autonomously by using the enormous amounts of data emerging from the physical world. From smart robots and autonomous vehicles to smart power grids and smart cities, Azizan focuses on the analysis, design, and control of intelligent systems.

    Achieving such massive feats takes a truly interdisciplinary approach that draws upon various fields such as machine learning, dynamical systems, control, optimization, statistics, and network science, among others.

    “Developing intelligent systems is a multifaceted problem, and it really requires a confluence of disciplines,” says Azizan, assistant professor of mechanical engineering with a dual appointment in MIT’s Institute for Data, Systems, and Society (IDSS). “To create such systems, we need to go beyond standard approaches to machine learning, such as those commonly used in computer vision, and devise algorithms that can enable safe, efficient, real-time decision-making for physical systems.”

    For robot control to work in the complex dynamic environments that arise in the real world, real-time adaptation is key. If, for example, an autonomous vehicle is going to drive in icy conditions or a drone is operating in windy conditions, they need to be able to adapt to their new environment quickly.

    To address this challenge, Azizan and his collaborators at MIT and Stanford University have developed a new algorithm that combines adaptive control, a powerful methodology from control theory, with meta learning, a new machine learning paradigm.

    “This ‘control-oriented’ learning approach outperforms the existing ‘regression-oriented’ methods, which are mostly focused on just fitting the data, by a wide margin,” says Azizan.
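
    The sketch below shows the general pattern in a stripped-down form: a feature map assumed to come from meta-learning is frozen, and a small set of coefficients is adapted online from the observed error. It is an illustration of the combination described above, not the published algorithm:

    ```python
    # Illustrative online adaptation: a meta-learned feature map (stand-in here)
    # is frozen, and a last layer of coefficients is adapted in real time so the
    # model tracks the disturbance (e.g., wind or ice) acting on the system.
    import numpy as np

    rng = np.random.default_rng(3)
    W = rng.normal(size=(16, 3))          # stand-in for meta-learned feature weights

    def features(state):
        """Frozen nonlinear features phi(x), assumed to come from meta-learning."""
        return np.tanh(W @ state)

    a = np.zeros(16)                       # online coefficients, adapted on the fly
    lr = 0.1

    for step in range(500):
        state = rng.normal(size=3)                     # observed state
        true_disturbance = 2.0 * np.sin(state[0])      # unknown environment effect
        predicted = features(state) @ a
        error = predicted - true_disturbance
        a -= lr * error * features(state)              # gradient step on squared error
        # A controller would subtract `predicted` from its command to cancel
        # the disturbance; that part is omitted here.

    print("final prediction error magnitude:", abs(error).round(3))
    ```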

    Another critical aspect of deploying machine learning algorithms in physical systems that Azizan and his team hope to address is safety. Deep neural networks are a crucial part of autonomous systems. They are used for interpreting complex visual inputs and making data-driven predictions of future behavior in real time. However, Azizan urges caution.

    “These deep neural networks are only as good as their training data, and their predictions can often be untrustworthy in scenarios not covered by their training data,” he says. Making decisions based on such untrustworthy predictions could lead to fatal accidents in autonomous vehicles or other safety-critical systems.

    To avoid these potentially catastrophic events, Azizan proposes that it is imperative to equip neural networks with a measure of their uncertainty. When the uncertainty is high, they can then be switched to a “safe policy.”

    In pursuit of this goal, Azizan and his collaborators have developed a new algorithm known as SCOD — Sketching Curvature of Out-of-Distribution Detection. This framework could be embedded within any deep neural network to equip it with a measure of its uncertainty.

    “This algorithm is model-agnostic and can be applied to neural networks used in various kinds of autonomous systems, whether it’s drones, vehicles, or robots,” says Azizan.
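
    A minimal sketch of that gating pattern appears below; the uncertainty score is a simple ensemble-disagreement stand-in rather than the SCOD estimator, and the policies are placeholders:

    ```python
    # Illustrative "switch to a safe policy when uncertainty is high" wrapper.
    # The uncertainty measure below is an ensemble-disagreement stand-in.
    import numpy as np

    rng = np.random.default_rng(4)
    ENSEMBLE = [rng.normal(size=(4, 2)) for _ in range(5)]   # stand-in predictors

    def predict_with_uncertainty(x):
        preds = np.array([x @ m for m in ENSEMBLE])          # each member's output
        return preds.mean(axis=0), preds.std(axis=0).max()   # prediction, uncertainty

    def nominal_policy(prediction):
        return prediction                                    # data-driven action

    def safe_policy(x):
        return np.zeros(2)                                   # conservative fallback (e.g., slow down)

    def act(x, threshold=1.5):
        prediction, uncertainty = predict_with_uncertainty(x)
        # Fall back to the conservative action when the model is unsure.
        return safe_policy(x) if uncertainty > threshold else nominal_policy(prediction)

    print(act(rng.normal(size=4)))
    ```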

    Azizan hopes to continue working on algorithms for even larger-scale systems. He and his team are designing efficient algorithms to better control supply and demand in smart energy grids. According to Azizan, even if we create the most efficient solar panels and batteries, we can never achieve a sustainable grid powered by renewable resources without the right control mechanisms.

    Mechanical engineers like Ahmed, Mueller, Deng, and Azizan serve as the key to realizing the next revolution of computing in design.

    “MechE is in a unique position at the intersection of the computational and physical worlds,” Azizan says. “Mechanical engineers build a bridge between theoretical, algorithmic tools and real, physical world applications.”

    Sophisticated computational tools, coupled with the ground truth mechanical engineers have in the physical world, could unlock limitless possibilities for design engineering, well beyond what could have been imagined in those early days of CAD.

  • At UN climate change conference, trying to “keep 1.5 alive”

    After a one-year delay caused by the Covid-19 pandemic, negotiators from nearly 200 countries met this month in Glasgow, Scotland, at COP26, the United Nations climate change conference, to hammer out a new global agreement to reduce greenhouse gas emissions and prepare for climate impacts. A delegation of approximately 20 faculty, staff, and students from MIT was on hand to observe the negotiations, share and conduct research, and launch new initiatives.

    On Saturday, Nov. 13, following two weeks of negotiations in the cavernous Scottish Events Campus, countries’ representatives agreed to the Glasgow Climate Pact. The pact reaffirms the goal of the 2015 Paris Agreement “to pursue efforts” to limit the global average temperature increase to 1.5 degrees Celsius above preindustrial levels, and recognizes that achieving this goal requires “reducing global carbon dioxide emissions by 45 percent by 2030 relative to the 2010 level and to net zero around mid-century.”

    “On issues like the need to reach net-zero emissions, reduce methane pollution, move beyond coal power, and tighten carbon accounting rules, the Glasgow pact represents some meaningful progress, but we still have so much work to do,” says Maria Zuber, MIT’s vice president for research, who led the Institute’s delegation to COP26. “Glasgow showed, once again, what a wicked complex problem climate change is, technically, economically, and politically. But it also underscored the determination of a global community of people committed to addressing it.”

    An “ambition gap”

    Both within the conference venue and at protests that spilled through the streets of Glasgow, one rallying cry was “keep 1.5 alive.” Alok Sharma, who was appointed by the UK government to preside over COP26, said in announcing the Glasgow pact: “We can now say with credibility that we have kept 1.5 degrees alive. But, its pulse is weak and it will only survive if we keep our promises and translate commitments into rapid action.”

    In remarks delivered during the first week of the conference, Sergey Paltsev, deputy director of MIT’s Joint Program on the Science and Policy of Global Change, presented findings from the latest MIT Global Change Outlook, which showed a wide gap between countries’ nationally determined contributions (NDCs) — the UN’s term for greenhouse gas emissions reduction pledges — and the reductions needed to put the world on track to meet the goals of the Paris Agreement and, now, the Glasgow pact.

    Pointing to this ambition gap, Paltsev called on all countries to do more, faster, to cut emissions. “We could dramatically reduce overall climate risk through more ambitious policy measures and investments,” says Paltsev. “We need to employ an integrated approach of moving to zero emissions in energy and industry, together with sustainable development and nature-based solutions, simultaneously improving human well-being and providing biodiversity benefits.”

    Finalizing the Paris rulebook

    A key outcome of COP26 (COP stands for “conference of the parties” to the UN Framework Convention on Climate Change, held for the 26th time) was the development of a set of rules to implement Article 6 of the Paris Agreement, which provides a mechanism for countries to receive credit for emissions reductions that they finance outside their borders, and to cooperate by buying and selling emissions reductions on international carbon markets.

    An agreement on this part of the Paris “rulebook” had eluded negotiators in the years since the Paris climate conference, in part because negotiators were concerned about how to prevent double-counting, wherein both buyers and sellers would claim credit for the emissions reductions.

    Michael Mehling, the deputy director of MIT’s Center for Energy and Environmental Policy Research (CEEPR) and an expert on international carbon markets, drew on a recent CEEPR working paper to describe critical negotiation issues under Article 6 during an event at the conference on Nov. 10 with climate negotiators and private sector representatives.

    He cited research that finds that Article 6, by leveraging the cost-efficiency of global carbon markets, could cut in half the cost that countries would incur to achieve their nationally determined contributions. “Which, seen from another angle, means you could double the ambition of these NDCs at no additional cost,” Mehling noted in his talk, adding that, given the persistent ambition gap, “any such opportunity is bitterly needed.”

    Andreas Haupt, a graduate student in the Institute for Data, Systems, and Society, joined MIT’s COP26 delegation to follow Article 6 negotiations. Haupt described the final days of negotiations over Article 6 as a “roller coaster.” Once negotiators reached an agreement, he says, “I felt relieved, but also unsure how strong of an effect the new rules, with all their weaknesses, will have. I am curious and hopeful regarding what will happen in the next year until the next large-scale negotiations in 2022.”

    Nature-based climate solutions

    World leaders also announced new agreements on the sidelines of the formal UN negotiations. One such agreement, a declaration on forests signed by more than 100 countries, commits to “working collectively to halt and reverse forest loss and land degradation by 2030.”

    A team from MIT’s Environmental Solutions Initiative (ESI), which has been working with policymakers and other stakeholders on strategies to protect tropical forests and advance other nature-based climate solutions in Latin America, was at COP26 to discuss their work and make plans for expanding it.

    Marcela Angel, a research associate at ESI, moderated a panel discussion featuring John Fernández, professor of architecture and ESI’s director, focused on protecting and enhancing natural carbon sinks, particularly tropical forests such as the Amazon that are at risk of deforestation, forest degradation, and biodiversity loss.

    “Deforestation and associated land use change remain one of the main sources of greenhouse gas emissions in most Amazonian countries, such as Brazil, Peru, and Colombia,” says Angel. “Our aim is to support these countries, whose nationally determined contributions depend on the effectiveness of policies to prevent deforestation and promote conservation, with an approach based on the integration of targeted technology breakthroughs, deep community engagement, and innovative bioeconomic opportunities for local communities that depend on forests for their livelihoods.”

    Energy access and renewable energy

    Worldwide, an estimated 800 million people lack access to electricity, and billions more have only limited or erratic electrical service. Providing universal access to energy is one of the UN’s sustainable development goals, creating a dual challenge: how to boost energy access without driving up greenhouse gas emissions.

    Rob Stoner, deputy director for science and technology of the MIT Energy Initiative (MITEI), and Ignacio Pérez-Arriaga, a visiting professor at the Sloan School of Management, attended COP26 to share their work as members of the Global Commission to End Energy Poverty, a collaboration between MITEI and the Rockefeller Foundation. It brings together global energy leaders from industry, the development finance community, academia, and civil society to identify ways to overcome barriers to investment in the energy sectors of countries with low energy access.

    The commission’s work helped to motivate the formation, announced at COP26 on Nov. 2, of the Global Energy Alliance for People and Planet, a multibillion-dollar commitment by the Rockefeller and IKEA foundations and Bezos Earth Fund to support access to renewable energy around the world.

    Another MITEI member of the COP26 delegation, Martha Broad, the initiative’s executive director, spoke about MIT research to inform the U.S. goal of scaling offshore wind energy capacity from approximately 30 megawatts today to 30 gigawatts by 2030, including significant new capacity off the coast of New England.

    Broad described research, funded by MITEI member companies, on a coating that can be applied to the blades of wind turbines to prevent icing that would require the turbines’ shutdown; the use of machine learning to inform preventative turbine maintenance; and methodologies for incorporating the effects of climate change into projections of future wind conditions to guide wind farm siting decisions today. She also spoke broadly about the need for public and private support to scale promising innovations.

    “Clearly, both the public sector and the private sector have a role to play in getting these technologies to the point where we can use them in New England, and also where we can deploy them affordably for the developing world,” Broad said at an event sponsored by America Is All In, a coalition of nonprofit and business organizations.

    Food and climate alliance

    Food systems around the world are increasingly at risk from the impacts of climate change. At the same time, these systems, which include all activities from food production to consumption and food waste, are responsible for about one-third of the human-caused greenhouse gas emissions warming the planet.

    At COP26, MIT’s Abdul Latif Jameel Water and Food Systems Lab announced the launch of a new alliance to drive research-based innovation that will make food systems more resilient and sustainable, called the Food and Climate Systems Transformation (FACT) Alliance. With 16 member institutions, the FACT Alliance will better connect researchers to farmers, food businesses, policymakers, and other food systems stakeholders around the world.

    Looking ahead

    By the end of 2022, the Glasgow pact asks countries to revisit their nationally determined contributions and strengthen them to bring them in line with the temperature goals of the Paris Agreement. The pact also “notes with deep regret” the failure of wealthier countries to collectively provide poorer countries $100 billion per year in climate financing that they pledged in 2009 to begin in 2020.

    These and other issues will be on the agenda for COP27, to be held in Sharm El-Sheikh, Egypt, next year.

    “Limiting warming to 1.5 degrees is broadly accepted as a critical goal to avoiding worsening climate consequences, but it’s clear that current national commitments will not get us there,” says ESI’s Fernández. “We will need stronger emissions reductions pledges, especially from the largest greenhouse gas emitters. At the same time, expanding creativity, innovation, and determination from every sector of society, including research universities, to get on with real-world solutions is essential. At Glasgow, MIT was front and center in energy systems, cities, nature-based solutions, and more. The year 2030 is right around the corner so we can’t afford to let up for one minute.”

  • Theoretical breakthrough could boost data storage

    A trio of researchers that includes William Kuszmaul — a computer science PhD student at MIT — has made a discovery that could lead to more efficient data storage and retrieval in computers.

    The team’s findings relate to so-called “linear-probing hash tables,” which were introduced in 1954 and are among the oldest, simplest, and fastest data structures available today. Data structures provide ways of organizing and storing data in computers, with hash tables being one of the most commonly utilized approaches. In a linear-probing hash table, the positions in which information can be stored lie along a linear array.

    Suppose, for instance, that a database is designed to store the Social Security numbers of 10,000 people, Kuszmaul suggests. “We take your Social Security number, x, and we’ll then compute the hash function of x, h(x), which gives you a random number between one and 10,000.” The next step is to take that random number, h(x), go to that position in the array, and put x, the Social Security number, into that spot.

    If there’s already something occupying that spot, Kuszmaul says, “you just move forward to the next free position and put it there. This is where the term ‘linear probing’ comes from, as you keep moving forward linearly until you find an open spot.” In order to later retrieve that Social Security number, x, you just go to the designated spot, h(x), and if it’s not there, you move forward until you either find x or come to a free position and conclude that x is not in your database.

    There’s a somewhat different protocol for deleting an item, such as a Social Security number. If you just left an empty spot in the hash table after deleting the information, that could cause confusion when you later tried to find something else, as the vacant spot might erroneously suggest that the item you’re looking for is nowhere to be found in the database. To avoid that problem, Kuszmaul explains, “you can go to the spot where the element was removed and put a little marker there called a ‘tombstone,’ which indicates there used to be an element here, but it’s gone now.”
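
    The whole procedure can be sketched in a few lines; the code below is an illustration of the textbook scheme described above, not code from the paper:

    ```python
    # Minimal linear-probing hash table with tombstone deletion, as described above.
    EMPTY, TOMBSTONE = object(), object()

    class LinearProbingTable:
        def __init__(self, capacity=10_000):
            self.slots = [EMPTY] * capacity
            self.capacity = capacity

        def _h(self, x):
            return hash(x) % self.capacity            # h(x): the starting position

        def insert(self, x):
            i = self._h(x)
            while self.slots[i] not in (EMPTY, TOMBSTONE):   # probe forward linearly
                i = (i + 1) % self.capacity
            self.slots[i] = x

        def contains(self, x):
            i = self._h(x)
            while self.slots[i] is not EMPTY:          # stop only at a truly free slot
                if self.slots[i] == x:
                    return True
                i = (i + 1) % self.capacity            # skip occupied slots and tombstones
            return False

        def delete(self, x):
            i = self._h(x)
            while self.slots[i] is not EMPTY:
                if self.slots[i] == x:
                    self.slots[i] = TOMBSTONE          # mark the spot, don't leave a hole
                    return
                i = (i + 1) % self.capacity

    table = LinearProbingTable()
    table.insert("078-05-1120")
    table.delete("078-05-1120")
    print(table.contains("078-05-1120"))   # False: the tombstone preserves probe chains
    ```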

    This general procedure has been followed for more than half-a-century. But in all that time, almost everyone using linear-probing hash tables has assumed that if you allow them to get too full, long stretches of occupied spots would run together to form “clusters.” As a result, the time it takes to find a free spot would go up dramatically — quadratically, in fact — taking so long as to be impractical. Consequently, people have been trained to operate hash tables at low capacity — a practice that can exact an economic toll by affecting the amount of hardware a company has to purchase and maintain.

    But this time-honored principle, which has long militated against high load factors, has been totally upended by the work of Kuszmaul and his colleagues, Michael Bender of Stony Brook University and Bradley Kuszmaul of Google. They found that for applications where the number of insertions and deletions stays about the same — and the amount of data added is roughly equal to that removed — linear-probing hash tables can operate at high storage capacities without sacrificing speed.

    In addition, the team has devised a new strategy, called “graveyard hashing,” which involves artificially increasing the number of tombstones placed in an array until they occupy about half the free spots. These tombstones then reserve spaces that can be used for future insertions.

    This approach, which runs contrary to what people have customarily been instructed to do, Kuszmaul says, “can lead to optimal performance in linear-probing hash tables.” Or, as he and his coauthors maintain in their paper, the “well-designed use of tombstones can completely change the … landscape of how linear probing behaves.”
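
    One loose way to picture the graveyard idea is a periodic rebuild that scatters tombstones over roughly half of the free slots, reserving landing spots for future insertions. The sketch below reuses the table from the previous snippet and is only an interpretation of the description above, not the paper’s algorithm:

    ```python
    # Illustrative "graveyard" rebuild for the table sketched earlier: rehash the
    # live elements, then scatter tombstones across about half of the remaining
    # free slots, reserving space for future insertions.
    def graveyard_rebuild(table):
        live = [x for x in table.slots if x is not EMPTY and x is not TOMBSTONE]
        table.slots = [EMPTY] * table.capacity
        for x in live:
            table.insert(x)
        free = [i for i, s in enumerate(table.slots) if s is EMPTY]
        for j, i in enumerate(free):
            if j % 2 == 0:                       # roughly every other free slot
                table.slots[i] = TOMBSTONE       # becomes a reserved "graveyard" slot

    graveyard_rebuild(table)
    ```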

    Kuszmaul wrote up these findings with Bender and Kuszmaul in a paper posted earlier this year that will be presented in February at the Foundations of Computer Science (FOCS) Symposium in Boulder, Colorado.

    Kuszmaul’s PhD thesis advisor, MIT computer science professor Charles E. Leiserson (who did not participate in this research), agrees with that assessment. “These new and surprising results overturn one of the oldest conventional wisdoms about hash table behavior,” Leiserson says. “The lessons will reverberate for years among theoreticians and practitioners alike.”

    As for translating their results into practice, Kuszmaul notes, “there are many considerations that go into building a hash table. Although we’ve advanced the story considerably from a theoretical standpoint, we’re just starting to explore the experimental side of things.”

  • MIT collaborates with Biogen on three-year, $7 million initiative to address climate, health, and equity

    MIT and Biogen have announced that they will collaborate with the goal of accelerating science and action on climate change to improve human health. This collaboration is supported by a three-year, $7 million commitment from the company and the Biogen Foundation. The biotechnology company, headquartered in Cambridge, Massachusetts’ Kendall Square, discovers and develops therapies for people living with serious neurological diseases.

    “We have long believed it is imperative for Biogen to make the fight against climate change central to our long-term corporate responsibility commitments. Through this collaboration with MIT, we aim to identify and share innovative climate solutions that will deliver co-benefits for both health and equity,” says Michel Vounatsos, CEO of Biogen. “We are also proud to support the MIT Museum, which promises to make world-class science and education accessible to all, and honor Biogen co-founder Phillip A. Sharp with a dedication inside the museum that recognizes his contributions to its development.”

    Biogen and the Biogen Foundation are supporting research and programs across a range of areas at MIT.

    Advancing climate, health, and equity

    The first such effort involves new work within the MIT Joint Program on the Science and Policy of Global Change to establish a state-of-the-art integrated model of climate and health aimed at identifying targets that deliver climate and health co-benefits.

    “Evidence suggests that not all climate-related actions deliver equal health benefits, yet policymakers, planners, and stakeholders traditionally lack the tools to consider how decisions in one arena impact the other,” says C. Adam Schlosser, deputy director of the MIT Joint Program. “Biogen’s collaboration with the MIT Joint Program — and its support of a new distinguished Biogen Fellow who will develop the new climate/health model — will accelerate our efforts to provide decision-makers with these tools.”

    Biogen is also supporting the MIT Technology and Policy Program’s Research to Policy Engagement Initiative, which aims to make human health a central consideration in decisions about the best pathways for addressing the global climate crisis and to bridge the knowledge-to-action gap by connecting policymakers, researchers, and diverse stakeholders. As part of this work, Biogen is underwriting a distinguished Biogen Fellow to advance new research on climate, health, and equity.

    “Our work with Biogen has allowed us to make progress on key questions that matter to human health and well-being under climate change,” says Noelle Eckley Selin, who directs the MIT Technology and Policy Program and is a professor in the MIT Institute for Data, Systems, and Society and the Department of Earth, Atmospheric and Planetary Sciences. “Further, their support of the Research to Policy Engagement Initiative helps all of our research become more effective in making change.”

    In addition, Biogen has joined 13 other companies in the MIT Climate and Sustainability Consortium (MCSC), which is supporting faculty and student research and developing impact pathways that present a range of actionable steps that companies can take — within and across industries — to advance progress toward climate targets.

    “Biogen joining the MIT Climate and Sustainability Consortium represents our commitment to working with member companies across a diverse range of industries, an approach that aims to drive changes swift and broad enough to match the scale of the climate challenge,” says Jeremy Gregory, executive director of the MCSC. “We are excited to welcome a member from the biotechnology space and look forward to harnessing Biogen’s perspectives as we continue to collaborate and work together with the MIT community in exciting and meaningful ways.”

    Making world-class science and education available to MIT Museum visitors

    Support from Biogen will honor Nobel laureate, MIT Institute Professor, and Biogen co-founder Phillip A. Sharp with a named space inside the new Kendall Square location of the MIT Museum, set to open in spring 2022. Biogen is also supporting one of the museum’s opening exhibitions, “Essential MIT,” with a section focused on solving real-world problems such as climate change, and is providing programmatic support for the museum’s Life Sciences Maker Engagement Program.

    “Phil has provided fantastic support to the MIT Museum for more than a decade as an advisory board member and now as board chair, and he has been deeply involved in plans for the new museum at Kendall Square,” says John Durant, the Mark R. Epstein (Class of 1963) Director of the museum. “Seeing his name on the wall will be a constant reminder of his key role in this development, as well as a mark of our gratitude.”

    Inspiring and empowering the next generation of scientists

    Biogen funding is also being directed to engage the next generation of scientists through support for the Biogen-MIT Biotech in Action: Virtual Lab, a program designed to foster a love of science among diverse and under-served student populations.

    Biogen’s support is part of its Healthy Climate, Healthy Lives initiative, a $250 million, 20-year commitment to eliminate fossil fuels across its operations and collaborate with renowned institutions to advance the science of climate and health and support under-served communities. Additional support is provided by the Biogen Foundation to further its long-standing focus on providing students with equitable access to outstanding science education.

  • in

    Avoiding shortcut solutions in artificial intelligence

    If your Uber driver takes a shortcut, you might get to your destination faster. But if a machine learning model takes a shortcut, it might fail in unexpected ways.

    In machine learning, a shortcut solution occurs when the model relies on a simple characteristic of a dataset to make a decision, rather than learning the true essence of the data, which can lead to inaccurate predictions. For example, a model might learn to identify images of cows by focusing on the green grass that appears in the photos, rather than the more complex shapes and patterns of the cows.  

    A new study by researchers at MIT explores the problem of shortcuts in a popular machine-learning method and proposes a solution that can prevent shortcuts by forcing the model to use more data in its decision-making.

    By removing the simpler characteristics the model is focusing on, the researchers force it to focus on more complex features of the data that it hadn’t been considering. Then, by asking the model to solve the same task two ways — once using those simpler features, and then also using the complex features it has now learned to identify — they reduce the tendency for shortcut solutions and boost the performance of the model.

    One potential application of this work is to enhance the effectiveness of machine learning models that are used to identify disease in medical images. Shortcut solutions in this context could lead to false diagnoses and have dangerous implications for patients.

    “It is still difficult to tell why deep networks make the decisions that they do, and in particular, which parts of the data these networks choose to focus upon when making a decision. If we can understand how shortcuts work in further detail, we can go even farther to answer some of the fundamental but very practical questions that are really important to people who are trying to deploy these networks,” says Joshua Robinson, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author of the paper.

    Robinson wrote the paper with his advisors, senior author Suvrit Sra, the Esther and Harold E. Edgerton Career Development Associate Professor in the Department of Electrical Engineering and Computer Science (EECS) and a core member of the Institute for Data, Systems, and Society (IDSS) and the Laboratory for Information and Decision Systems; and Stefanie Jegelka, the X-Consortium Career Development Associate Professor in EECS and a member of CSAIL and IDSS; as well as University of Pittsburgh assistant professor Kayhan Batmanghelich and PhD students Li Sun and Ke Yu. The research will be presented at the Conference on Neural Information Processing Systems in December. 

    The long road to understanding shortcuts

    The researchers focused their study on contrastive learning, which is a powerful form of self-supervised machine learning. In self-supervised learning, a model is trained on raw data that have not been labeled by humans, so it can be used successfully on a much larger variety of data.

    A self-supervised learning model learns useful representations of data, which are used as inputs for different tasks, like image classification. But if the model takes shortcuts and fails to capture important information, these tasks won’t be able to use that information either.

    For example, if a self-supervised learning model is trained to classify pneumonia in X-rays from a number of hospitals, but it learns to make predictions based on a tag that identifies the hospital the scan came from (because some hospitals have more pneumonia cases than others), the model won’t perform well when it is given data from a new hospital.     

    In contrastive learning, an encoder is trained to discriminate between pairs of similar inputs and pairs of dissimilar inputs. In doing so, it learns to encode rich, complex data, such as images, in a compact form that downstream tasks can use.
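
    A typical contrastive objective of this kind is the InfoNCE-style loss used by methods such as SimCLR. The PyTorch sketch below is a generic example of that family, not the specific loss used in this paper, and the function name and temperature value are illustrative.

        import torch
        import torch.nn.functional as F

        def contrastive_loss(z1, z2, temperature=0.5):
            """InfoNCE-style loss. z1 and z2 hold the encoder's outputs for two
            augmented views of the same batch, shape (batch, dim); row i of z1
            and row i of z2 form a similar (positive) pair, and every other row
            serves as a dissimilar (negative) example."""
            z1 = F.normalize(z1, dim=1)
            z2 = F.normalize(z2, dim=1)
            logits = z1 @ z2.t() / temperature            # pairwise cosine similarities
            labels = torch.arange(z1.size(0), device=z1.device)
            return F.cross_entropy(logits, labels)        # pick out the matching pair

    The encoder is trained to make each positive pair score higher than all of the negatives, which is what pushes it to learn features that distinguish inputs from one another.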

    The researchers tested contrastive learning encoders with a series of images and found that, during this training procedure, they also fall prey to shortcut solutions. The encoders tend to focus on the simplest features of an image to decide which pairs of inputs are similar and which are dissimilar. Ideally, the encoder should focus on all the useful characteristics of the data when making a decision, Jegelka says.

    So, the team made it harder to tell the difference between the similar and dissimilar pairs, and found that this changes which features the encoder will look at to make a decision.

    “If you make the task of discriminating between similar and dissimilar items harder and harder, then your system is forced to learn more meaningful information in the data, because without learning that it cannot solve the task,” she says.

    But increasing this difficulty resulted in a tradeoff — the encoder got better at focusing on some features of the data but became worse at focusing on others. It almost seemed to forget the simpler features, Robinson says.

    To avoid this tradeoff, the researchers asked the encoder to discriminate between the pairs the same way it had originally, using the simpler features, and also after the researchers removed the information it had already learned. Solving the task both ways simultaneously caused the encoder to improve across all features.

    Their method, called implicit feature modification, adaptively modifies samples to remove the simpler features the encoder is using to discriminate between the pairs. The technique does not rely on human input, which is important because real-world data sets can have hundreds of different features that could combine in complex ways, Sra explains.
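
    The update in the paper is derived analytically; the PyTorch sketch below only illustrates the general "solve the task both ways" recipe described above: compute the ordinary contrastive loss, recompute it after perturbing the embeddings in the direction that makes discrimination harder, and average the two. The gradient-step perturbation, the eps value, and the function names are stand-in assumptions, not the authors’ exact method.

        import torch
        import torch.nn.functional as F

        def loss_with_feature_modification(z_anchor, z_pos, z_neg, eps=0.1, temperature=0.5):
            def nce(a, p, n):
                pos = (a * p).sum(dim=1, keepdim=True) / temperature
                neg = (a @ n.t()) / temperature
                logits = torch.cat([pos, neg], dim=1)
                labels = torch.zeros(a.size(0), dtype=torch.long, device=a.device)
                return F.cross_entropy(logits, labels)   # the positive sits at index 0

            # 1) Ordinary loss, where the encoder is free to lean on easy features.
            base = nce(z_anchor, z_pos, z_neg)

            # 2) Find the perturbation that makes the same task harder ...
            p_adv = z_pos.detach().requires_grad_(True)
            n_adv = z_neg.detach().requires_grad_(True)
            g_pos, g_neg = torch.autograd.grad(nce(z_anchor.detach(), p_adv, n_adv),
                                               [p_adv, n_adv])

            # ... and re-solve it with positives made less similar to the anchor
            # and negatives made more similar.
            harder = nce(z_anchor, z_pos + eps * g_pos, z_neg + eps * g_neg)

            # 3) Ask for both at once, so easy and hard features are learned together.
            return 0.5 * (base + harder)

    Averaging the two losses is what keeps the encoder from trading away the simpler features while it learns the harder ones.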

    From cars to COPD

    The researchers ran one test of this method using images of vehicles. They used implicit feature modification to adjust the color, orientation, and vehicle type to make it harder for the encoder to discriminate between similar and dissimilar pairs of images. The encoder improved its accuracy across all three features simultaneously.

    To see if the method would stand up to more complex data, the researchers also tested it with samples from a medical image database of chronic obstructive pulmonary disease (COPD). Again, the method led to simultaneous improvements across all features they evaluated.

    While this work takes some important steps forward in understanding the causes of shortcut solutions and working to solve them, the researchers say that continuing to refine these methods and applying them to other types of self-supervised learning will be key to future advancements.

    “This ties into some of the biggest questions about deep learning systems, like ‘Why do they fail?’ and ‘Can we know in advance the situations where your model will fail?’ There is still a lot farther to go if you want to understand shortcut learning in its full generality,” Robinson says.

    This research is supported by the National Science Foundation, National Institutes of Health, and the Pennsylvania Department of Health’s SAP SE Commonwealth Universal Research Enhancement (CURE) program.