More stories

  • in

    New AI model could streamline operations in a robotic warehouse

    Hundreds of robots zip back and forth across the floor of a colossal robotic warehouse, grabbing items and delivering them to human workers for packing and shipping. Such warehouses are increasingly becoming part of the supply chain in many industries, from e-commerce to automotive production.

    However, getting 800 robots to and from their destinations efficiently while keeping them from crashing into each other is no easy task. It is such a complex problem that even the best path-finding algorithms struggle to keep up with the breakneck pace of e-commerce or manufacturing. 

    In a sense, these robots are like cars trying to navigate a crowded city center. So, a group of MIT researchers who use AI to mitigate traffic congestion applied ideas from that domain to tackle this problem.

    They built a deep-learning model that encodes important information about the warehouse, including the robots, planned paths, tasks, and obstacles, and uses it to predict the best areas of the warehouse to decongest to improve overall efficiency.

    Their technique divides the warehouse robots into groups, so these smaller groups of robots can be decongested faster with traditional algorithms used to coordinate robots. In the end, their method decongests the robots nearly four times faster than a strong random search method.

    In addition to streamlining warehouse operations, this deep learning approach could be used in other complex planning tasks, like computer chip design or pipe routing in large buildings.

    “We devised a new neural network architecture that is actually suitable for real-time operations at the scale and complexity of these warehouses. It can encode hundreds of robots in terms of their trajectories, origins, destinations, and relationships with other robots, and it can do this in an efficient manner that reuses computation across groups of robots,” says Cathy Wu, the Gilbert W. Winslow Career Development Assistant Professor in Civil and Environmental Engineering (CEE), and a member of the Laboratory for Information and Decision Systems (LIDS) and the Institute for Data, Systems, and Society (IDSS).

    Wu, senior author of a paper on this technique, is joined by lead author Zhongxia Yan, a graduate student in electrical engineering and computer science. The work will be presented at the International Conference on Learning Representations.

    Robotic Tetris

    From a bird’s eye view, the floor of a robotic e-commerce warehouse looks a bit like a fast-paced game of “Tetris.”

    When a customer order comes in, a robot travels to an area of the warehouse, grabs the shelf that holds the requested item, and delivers it to a human operator who picks and packs the item. Hundreds of robots do this simultaneously, and if two robots’ paths conflict as they cross the massive warehouse, they might crash.

    Traditional search-based algorithms avoid potential crashes by keeping one robot on its course and replanning a trajectory for the other. But with so many robots and potential collisions, the problem quickly grows exponentially.

    “Because the warehouse is operating online, the robots are replanned about every 100 milliseconds. That means that every second, a robot is replanned 10 times. So, these operations need to be very fast,” Wu says.

    Because time is so critical during replanning, the MIT researchers use machine learning to focus the replanning on the most actionable areas of congestion — where there exists the most potential to reduce the total travel time of robots.

    Wu and Yan built a neural network architecture that considers smaller groups of robots at the same time. For instance, in a warehouse with 800 robots, the network might cut the warehouse floor into smaller groups that contain 40 robots each.

    Then, it predicts which group has the most potential to improve the overall solution if a search-based solver were used to coordinate trajectories of robots in that group.

    The overall algorithm is iterative: it picks the most promising robot group with the neural network, decongests that group with the search-based solver, then picks the next most promising group with the neural network, and so on.
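
    The loop described above can be sketched in a few lines. This is a minimal illustration, not the researchers' implementation: `predict_gain` stands in for the neural network, `replan_group` stands in for the search-based solver, a "plan" is reduced to one travel time per robot, and the rule of keeping a new plan only when total travel time does not get worse is an assumption of this sketch.

```python
from typing import Callable, Dict, List, Sequence

Group = Sequence[int]      # robot ids in one group
Plan = Dict[int, float]    # robot id -> travel time (toy stand-in for a path)


def decongest_iteratively(
    plan: Plan,
    groups: List[Group],
    predict_gain: Callable[[Plan, Group], float],
    replan_group: Callable[[Plan, Group], Plan],
    iterations: int,
) -> Plan:
    """Repeatedly pick the most promising group and replan only that group."""
    for _ in range(iterations):
        # Stand-in for the neural network: score every group, take the best.
        best = max(groups, key=lambda g: predict_gain(plan, g))
        # Stand-in for the search-based solver: replan robots in that group.
        candidate = replan_group(plan, best)
        # Assumption of this sketch: accept only non-worsening plans.
        if sum(candidate.values()) <= sum(plan.values()):
            plan = candidate
    return plan


# Hypothetical toy stand-ins, for illustration only.
def toy_predict_gain(plan: Plan, group: Group) -> float:
    # Pretend any travel time above a 10-unit floor is recoverable.
    return sum(max(plan[r] - 10.0, 0.0) for r in group)


def toy_replan_group(plan: Plan, group: Group) -> Plan:
    new_plan = dict(plan)
    for r in group:
        new_plan[r] = max(10.0, plan[r] - 5.0)
    return new_plan


plan = {0: 30.0, 1: 12.0, 2: 25.0, 3: 11.0}
groups = [(0, 1), (2, 3)]
result = decongest_iteratively(plan, groups, toy_predict_gain, toy_replan_group, 3)
print(result, sum(result.values()))
```

    In this toy run the loop alternates between the two groups as their predicted gains change, which is the behavior the article describes at a much larger scale.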

    Considering relationships

    The neural network can reason about groups of robots efficiently because it captures complicated relationships that exist between individual robots. For example, even though one robot may be far away from another initially, their paths could still cross during their trips.

    The technique also streamlines computation by encoding constraints only once, rather than repeating the process for each subproblem. For instance, in a warehouse with 800 robots, decongesting a group of 40 robots requires holding the other 760 robots as constraints. Other approaches require reasoning about all 800 robots once per group in each iteration.

    Instead, the researchers’ approach only requires reasoning about the 800 robots once across all groups in each iteration.
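
    A rough count makes the saving concrete. The snippet below uses "robot encodings per iteration" as a proxy for work; that unit is an assumption made here for illustration, and the real cost depends on the network architecture.

```python
def count_robot_encodings(num_robots: int, group_size: int) -> dict:
    """Compare per-group encoding with a single shared encoding."""
    num_groups = num_robots // group_size
    return {
        "groups": num_groups,
        # Other approaches: reason about every robot once per group.
        "per_group_encoding": num_groups * num_robots,
        # Shared approach: reason about every robot once for all groups.
        "shared_encoding": num_robots,
    }


counts = count_robot_encodings(num_robots=800, group_size=40)
print(counts)
```

    With 800 robots in groups of 40 there are 20 groups, so the per-group approach touches 16,000 robot encodings per iteration versus 800 for the shared one.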

    “The warehouse is one big setting, so a lot of these robot groups will have some shared aspects of the larger problem. We designed our architecture to make use of this common information,” she adds.

    They tested their technique in several simulated environments, including some set up like warehouses, some with random obstacles, and even maze-like settings that emulate building interiors.

    By identifying more effective groups to decongest, their learning-based approach decongests the warehouse up to four times faster than strong, non-learning-based approaches. Even when they factored in the additional computational overhead of running the neural network, their approach still solved the problem 3.5 times faster.

    In the future, the researchers want to derive simple, rule-based insights from their neural model, since the decisions of the neural network can be opaque and difficult to interpret. Simpler, rule-based methods could also be easier to implement and maintain in actual robotic warehouse settings.

    “This approach is based on a novel architecture where convolution and attention mechanisms interact effectively and efficiently. Impressively, this leads to being able to take into account the spatiotemporal component of the constructed paths without the need of problem-specific feature engineering. The results are outstanding: Not only is it possible to improve on state-of-the-art large neighborhood search methods in terms of quality of the solution and speed, but the model generalizes to unseen cases wonderfully,” says Andrea Lodi, the Andrew H. and Ann R. Tisch Professor at Cornell Tech, and who was not involved with this research.

    This work was supported by Amazon and the MIT Amazon Science Hub.

  • in

    Automated method helps researchers quantify uncertainty in their predictions

    Pollsters trying to predict presidential election results and physicists searching for distant exoplanets have at least one thing in common: They often use a tried-and-true scientific technique called Bayesian inference.

    Bayesian inference allows these scientists to effectively estimate some unknown parameter — like the winner of an election — from data such as poll results. But Bayesian inference can be slow, sometimes consuming weeks or even months of computation time or requiring a researcher to spend hours deriving tedious equations by hand. 

    Researchers from MIT and elsewhere have introduced an optimization technique that speeds things up without requiring a scientist to do a lot of additional work. Their method can achieve more accurate results faster than another popular approach for accelerating Bayesian inference.

    Using this new automated technique, a scientist could simply input their model and then the optimization method does all the calculations under the hood to provide an approximation of some unknown parameter. The method also offers reliable uncertainty estimates that can help a researcher understand when to trust its predictions.

    This versatile technique could be applied to a wide array of scientific quandaries that incorporate Bayesian inference. For instance, it could be used by economists studying the impact of microcredit loans in developing nations or sports analysts using a model to rank top tennis players.

    “When you actually dig into what people are doing in the social sciences, physics, chemistry, or biology, they are often using a lot of the same tools under the hood. There are so many Bayesian analyses out there. If we can build a really great tool that makes these researchers’ lives easier, then we can really make a difference to a lot of people in many different research areas,” says senior author Tamara Broderick, an associate professor in MIT’s Department of Electrical Engineering and Computer Science (EECS) and a member of the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society.

    Broderick is joined on the paper by co-lead authors Ryan Giordano, an assistant professor of statistics at the University of California at Berkeley; and Martin Ingram, a data scientist at the AI company KONUX. The paper was recently published in the Journal of Machine Learning Research.

    Faster results

    When researchers seek a faster form of Bayesian inference, they often turn to a technique called automatic differentiation variational inference (ADVI), which is often both fast to run and easy to use.

    But Broderick and her collaborators have found a number of practical issues with ADVI. It has to solve an optimization problem and can do so only approximately. So, ADVI can still require a lot of computation time and user effort to determine whether the approximate solution is good enough. And once it arrives at a solution, it tends to provide poor uncertainty estimates.

    Rather than reinventing the wheel, the team took many ideas from ADVI but turned them around to create a technique called deterministic ADVI (DADVI) that doesn’t have these downsides.

    With DADVI, it is very clear when the optimization is finished, so a user won’t need to spend extra computation time to ensure that the best solution has been found. DADVI also permits the incorporation of more powerful optimization methods that give it an additional speed and performance boost.

    Once it reaches a result, DADVI is set up to allow the use of uncertainty corrections. These corrections make its uncertainty estimates much more accurate than those of ADVI.

    DADVI also enables the user to clearly see how much error they have incurred in the approximation to the optimization problem. This prevents a user from needlessly running the optimization again and again with more and more resources to try and reduce the error.

    “We wanted to see if we could live up to the promise of black-box inference in the sense of, once the user makes their model, they can just run Bayesian inference and don’t have to derive everything by hand, they don’t need to figure out when to stop their algorithm, and they have a sense of how accurate their approximate solution is,” Broderick says.

    Defying conventional wisdom

    DADVI can be more effective than ADVI because it uses an efficient approximation method, called sample average approximation, which estimates an unknown quantity by taking a series of exact steps.

    Because the steps along the way are exact, it is clear when the objective has been reached. Plus, getting to that objective typically requires fewer steps.

    Often, researchers expect sample average approximation to be more computationally intensive than a more popular method, known as stochastic gradient, which is used by ADVI. But Broderick and her collaborators showed that, in many applications, this is not the case.
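
    The difference between the two strategies can be illustrated with a toy problem. The sketch below is not the DADVI code from the paper: it assumes a one-dimensional Gaussian target (mean 3, standard deviation 2) so the answer can be checked by hand, a Gaussian approximation with parameters `mu` and `rho = log(sigma)`, and plain gradient descent as a stand-in optimizer. The key idea it shows is sample average approximation: the random draws are generated once and then held fixed, so the objective is an ordinary deterministic function with a clear stopping rule.

```python
import math
import random


def fixed_draws(n: int, seed: int = 0) -> list:
    """Draw the standard-normal samples once; they are never resampled."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def saa_gradient(mu, rho, draws, target_mean, target_sd):
    """Gradient of the fixed-draw objective
    mean((mu + sigma*z - target_mean)^2) / (2*target_sd^2) - rho."""
    sigma = math.exp(rho)
    n = len(draws)
    resid = [mu + sigma * z - target_mean for z in draws]
    grad_mu = sum(resid) / (n * target_sd ** 2)
    grad_rho = sigma * sum(r * z for r, z in zip(resid, draws)) / (n * target_sd ** 2) - 1.0
    return grad_mu, grad_rho


def fit(draws, target_mean=3.0, target_sd=2.0, step=0.25, tol=1e-9, max_iter=10_000):
    """Minimize the deterministic objective; stop when the gradient vanishes."""
    mu, rho = 0.0, 0.0
    for iteration in range(max_iter):
        g_mu, g_rho = saa_gradient(mu, rho, draws, target_mean, target_sd)
        if math.hypot(g_mu, g_rho) < tol:
            return mu, math.exp(rho), iteration
        mu -= step * g_mu
        rho -= step * g_rho
    raise RuntimeError("did not converge")


draws = fixed_draws(30, seed=1)
mu_hat, sigma_hat, steps = fit(draws)
print(mu_hat, sigma_hat, steps)
```

    Because the draws are fixed, rerunning `fit(draws)` returns exactly the same numbers, and the gradient test tells the user when to stop. A stochastic-gradient method would resample at every step, so its iterates keep fluctuating and there is no equally clean stopping signal.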

    “A lot of problems really do have special structure, and you can be so much more efficient and get better performance by taking advantage of that special structure. That is something we have really seen in this paper,” she adds.

    They tested DADVI on a number of real-world models and datasets, including a model used by economists to evaluate the effectiveness of microcredit loans and one used in ecology to determine whether a species is present at a particular site.

    Across the board, they found that DADVI can estimate unknown parameters faster and more reliably than other methods, and achieves accuracy as good as or better than that of ADVI. Because it is easier to use than other techniques, DADVI could offer a boost to scientists in a wide variety of fields.

    In the future, the researchers want to dig deeper into correction methods for uncertainty estimates so they can better understand why these corrections can produce such accurate uncertainties, and when they could fall short.

    “In applied statistics, we often have to use approximate algorithms for problems that are too complex or high-dimensional to allow exact solutions to be computed in reasonable time. This new paper offers an interesting set of theory and empirical results that point to an improvement in a popular existing approximate algorithm for Bayesian inference,” says Andrew Gelman ’85, ’86, a professor of statistics and political science at Columbia University, who was not involved with the study. “As one of the team involved in the creation of that earlier work, I’m happy to see our algorithm superseded by something more stable.”

    This research was supported by a National Science Foundation CAREER Award and the U.S. Office of Naval Research.

  • in

    MIT researchers remotely map crops, field by field

    Crop maps help scientists and policymakers track global food supplies and estimate how they might shift with climate change and growing populations. But getting accurate maps of the types of crops that are grown from farm to farm often requires on-the-ground surveys that only a handful of countries have the resources to maintain.

    Now, MIT engineers have developed a method to quickly and accurately label and map crop types without requiring in-person assessments of every single farm. The team’s method uses a combination of Google Street View images, machine learning, and satellite data to automatically determine the crops grown throughout a region, from one fraction of an acre to the next. 

    The researchers used the technique to automatically generate the first nationwide crop map of Thailand — a smallholder country where small, independent farms make up the predominant form of agriculture. The team created a border-to-border map of Thailand’s four major crops — rice, cassava, sugarcane, and maize — and determined which of the four types was grown, at every 10 meters, and without gaps, across the entire country. The resulting map achieved an accuracy of 93 percent, which the researchers say is comparable to on-the-ground mapping efforts in high-income, big-farm countries.

    The team is applying their mapping technique to other countries such as India, where small farms sustain most of the population but the type of crops grown from farm to farm has historically been poorly recorded.

    “It’s a longstanding gap in knowledge about what is grown around the world,” says Sherrie Wang, the d’Arbeloff Career Development Assistant Professor in MIT’s Department of Mechanical Engineering, and the Institute for Data, Systems, and Society (IDSS). “The final goal is to understand agricultural outcomes like yield, and how to farm more sustainably. One of the key preliminary steps is to map what is even being grown — the more granularly you can map, the more questions you can answer.”

    Wang, along with MIT graduate student Jordi Laguarta Soler and Thomas Friedel of the agtech company PEAT GmbH, will present a paper detailing their mapping method later this month at the AAAI Conference on Artificial Intelligence.

    Ground truth

    Smallholder farms are often run by a single family or farmer who subsists on the crops and livestock they raise. It’s estimated that smallholder farms support two-thirds of the world’s rural population and produce 80 percent of the world’s food. Keeping tabs on what is grown and where is essential to tracking and forecasting food supplies around the world. But the majority of these small farms are in low- and middle-income countries, where few resources are devoted to keeping track of individual farms’ crop types and yields.

    Crop mapping efforts are mainly carried out in high-income regions such as the United States and Europe, where government agricultural agencies oversee crop surveys and send assessors to farms to label crops from field to field. These “ground truth” labels are then fed into machine-learning models that make connections between the ground labels of actual crops and satellite signals of the same fields. They then label and map wider swaths of farmland that assessors don’t cover but that satellites automatically do.

    “What’s lacking in low- and middle-income countries is this ground label that we can associate with satellite signals,” Laguarta Soler says. “Getting these ground truths to train a model in the first place has been limited in most of the world.”

    The team realized that, while many developing countries do not have the resources to maintain crop surveys, they could potentially use another source of ground data: roadside imagery, captured by services such as Google Street View and Mapillary, which send cars throughout a region to take continuous 360-degree images with dashcams and rooftop cameras.

    In recent years, such services have been able to access low- and middle-income countries. While the goal of these services is not specifically to capture images of crops, the MIT team saw that they could search the roadside images to identify crops.

    Cropped image

    In their new study, the researchers worked with Google Street View (GSV) images taken throughout Thailand — a country that the service has recently imaged fairly thoroughly, and which consists predominantly of smallholder farms.

    Starting with over 200,000 GSV images randomly sampled across Thailand, the team filtered out images that depicted buildings, trees, and general vegetation. About 81,000 images were crop-related. They set aside 2,000 of these, which they sent to an agronomist, who determined and labeled each crop type by eye. They then trained a convolutional neural network to automatically generate crop labels for the other 79,000 images, using various training methods, including iNaturalist, a web-based crowdsourced biodiversity database, and GPT-4V, a “multimodal large language model” that enables a user to input an image and ask the model to identify what the image is depicting. For each of the 81,000 images, the model generated a label of one of four crops that the image was likely depicting: rice, maize, sugarcane, or cassava.

    The researchers then paired each labeled image with the corresponding satellite data taken of the same location throughout a single growing season. These satellite data include measurements across multiple wavelengths, such as a location’s greenness and its reflectivity (which can be a sign of water). 

    “Each type of crop has a certain signature across these different bands, which changes throughout a growing season,” Laguarta Soler notes.

    The team trained a second model to make associations between a location’s satellite data and its corresponding crop label. They then used this model to process satellite data taken of the rest of the country, where crop labels were not generated or available. From the associations that the model learned, it then assigned crop labels across Thailand, generating a country-wide map of crop types at a resolution of 10 meters.
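
    The idea of matching a seasonal satellite signature to a crop label can be sketched with a very simple classifier. The article does not say which model the team used for this step; the nearest-centroid rule below is a stand-in, and every number in the example is invented purely for illustration (four hypothetical "greenness" readings per location over one season).

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def fit_centroids(samples: Sequence[Tuple[Sequence[float], str]]) -> Dict[str, List[float]]:
    """Average the seasonal signal of all labeled locations, per crop."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for signal, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(signal)
        for i, value in enumerate(signal):
            sums[label][i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def predict(centroids: Dict[str, List[float]], signal: Sequence[float]) -> str:
    """Assign the crop whose average signature is closest to this signal."""
    def squared_distance(label: str) -> float:
        return sum((a - b) ** 2 for a, b in zip(centroids[label], signal))
    return min(centroids, key=squared_distance)


# Invented toy data: (seasonal signal, label from the roadside-image model).
labeled = [
    ([0.2, 0.6, 0.8, 0.3], "rice"),
    ([0.3, 0.7, 0.9, 0.4], "rice"),
    ([0.3, 0.4, 0.5, 0.6], "cassava"),
    ([0.2, 0.3, 0.5, 0.7], "cassava"),
]
centroids = fit_centroids(labeled)

# Locations with satellite data but no roadside image.
print(predict(centroids, [0.25, 0.6, 0.8, 0.35]))
print(predict(centroids, [0.3, 0.35, 0.5, 0.6]))
```

    The same pattern, trained on labeled locations and then applied to every unlabeled pixel, is what lets a sparse set of roadside labels extend to a gap-free map.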

    This first-of-its-kind crop map included locations corresponding to the 2,000 GSV images that the researchers originally set aside and that were labeled by the agronomist. These human-labeled images were used to validate the map’s labels, and when the team looked to see whether the map’s labels matched the expert, “gold standard” labels, they did so 93 percent of the time.
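
    The validation step amounts to an agreement rate between map labels and expert labels at the held-out locations. A minimal sketch, with invented labels for five hypothetical locations:

```python
from typing import Sequence


def agreement_rate(map_labels: Sequence[str], expert_labels: Sequence[str]) -> float:
    """Fraction of held-out locations where the map matches the expert."""
    if not expert_labels or len(map_labels) != len(expert_labels):
        raise ValueError("label lists must be non-empty and the same length")
    matches = sum(m == e for m, e in zip(map_labels, expert_labels))
    return matches / len(expert_labels)


# Invented example: five held-out locations, one disagreement.
map_labels = ["rice", "rice", "maize", "cassava", "sugarcane"]
expert_labels = ["rice", "rice", "maize", "cassava", "maize"]
rate = agreement_rate(map_labels, expert_labels)
print(rate)
```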

    “In the U.S., we’re also looking at over 90 percent accuracy, whereas with previous work in India, we’ve only seen 75 percent because ground labels are limited,” Wang says. “Now we can create these labels in a cheap and automated way.”

    The researchers are moving to map crops across India, where roadside images via Google Street View and other services have recently become available.

    “There are over 150 million smallholder farmers in India,” Wang says. “India is covered in agriculture, almost wall-to-wall farms, but very small farms, and historically it’s been very difficult to create maps of India because there are very sparse ground labels.”

    The team is working to generate crop maps in India, which could be used to inform policies having to do with assessing and bolstering yields, as global temperatures and populations rise.

    “What would be interesting would be to create these maps over time,” Wang says. “Then you could start to see trends, and we can try to relate those things to anything like changes in climate and policies.”

  • in

    Study: Global deforestation leads to more mercury pollution

    About 10 percent of human-made mercury emissions into the atmosphere each year are the result of global deforestation, according to a new MIT study.

    The world’s vegetation, from the Amazon rainforest to the savannahs of sub-Saharan Africa, acts as a sink that removes the toxic pollutant from the air. However, if the current rate of deforestation remains unchanged or accelerates, the researchers estimate that net mercury emissions will keep increasing.

    “We’ve been overlooking a significant source of mercury, especially in tropical regions,” says Ari Feinberg, a former postdoc in the Institute for Data, Systems, and Society (IDSS) and lead author of the study.

    The researchers’ model shows that the Amazon rainforest plays a particularly important role as a mercury sink, contributing about 30 percent of the global land sink. Curbing Amazon deforestation could thus have a substantial impact on reducing mercury pollution.

    The team also estimates that global reforestation efforts could increase annual mercury uptake by about 5 percent. While this is significant, the researchers emphasize that reforestation alone should not be a substitute for worldwide pollution control efforts.

    “Countries have put a lot of effort into reducing mercury emissions, especially northern industrialized countries, and for very good reason. But 10 percent of the global anthropogenic source is substantial, and there is a potential for that to be even greater in the future. [Addressing these deforestation-related emissions] needs to be part of the solution,” says senior author Noelle Selin, a professor in IDSS and MIT’s Department of Earth, Atmospheric and Planetary Sciences.

    Feinberg and Selin are joined on the paper by co-authors Martin Jiskra, a former Swiss National Science Foundation Ambizione Fellow at the University of Basel; Pasquale Borrelli, a professor at Roma Tre University in Italy; and Jagannath Biswakarma, a postdoc at the Swiss Federal Institute of Aquatic Science and Technology. The paper appears today in Environmental Science and Technology.

    Modeling mercury

    Over the past few decades, scientists have generally focused on studying deforestation as a source of global carbon dioxide emissions. Mercury, a trace element, hasn’t received the same attention, partly because the terrestrial biosphere’s role in the global mercury cycle has only recently been better quantified.

    Plant leaves take up mercury from the atmosphere, in a similar way as they take up carbon dioxide. But unlike carbon dioxide, mercury doesn’t play an essential biological function for plants. Mercury largely stays within a leaf until it falls to the forest floor, where the mercury is absorbed by the soil.

    Mercury becomes a serious concern for humans if it ends up in water bodies, where it can become methylated by microorganisms. Methylmercury, a potent neurotoxin, can be taken up by fish and bioaccumulated through the food chain. This can lead to risky levels of methylmercury in the fish humans eat.

    “In soils, mercury is much more tightly bound than it would be if it were deposited in the ocean. The forests are doing a sort of ecosystem service, in that they are sequestering mercury for longer timescales,” says Feinberg, who is now a postdoc in the Blas Cabrera Institute of Physical Chemistry in Spain.

    In this way, forests reduce the amount of toxic methylmercury in oceans.

    Many studies of mercury focus on industrial sources, like burning fossil fuels, small-scale gold mining, and metal smelting. A global treaty, the 2013 Minamata Convention, calls on nations to reduce human-made emissions. However, it doesn’t directly consider impacts of deforestation.

    The researchers launched their study to fill in that missing piece.

    In past work, they had built a model to probe the role vegetation plays in mercury uptake. Using a series of land use change scenarios, they adjusted the model to quantify the role of deforestation.

    Evaluating emissions

    This chemical transport model tracks mercury from its emissions sources to where it is chemically transformed in the atmosphere and then ultimately to where it is deposited, mainly through rainfall or uptake into forest ecosystems.

    They divided the Earth into eight regions and performed simulations to calculate deforestation emissions factors for each, considering elements like type and density of vegetation, mercury content in soils, and historical land use.

    However, good data for some regions were hard to come by.

    They lacked measurements from tropical Africa or Southeast Asia — two areas that experience heavy deforestation. To get around this gap, they used simpler, offline models to simulate hundreds of scenarios, which helped them improve their estimations of potential uncertainties.

    They also developed a new formulation for mercury emissions from soil. This formulation captures the fact that deforestation reduces leaf area, which increases the amount of sunlight that hits the ground and accelerates the outgassing of mercury from soils.

    The model divides the world into grid squares, each of which is a few hundred square kilometers. By changing land surface and vegetation parameters in certain squares to represent deforestation and reforestation scenarios, the researchers can capture impacts on the mercury cycle.

    Overall, they found that about 200 tons of mercury are emitted to the atmosphere each year as the result of deforestation, or about 10 percent of total human-made emissions. But in tropical and sub-tropical countries, deforestation emissions represent a higher percentage of total emissions. For example, in Brazil deforestation emissions are 40 percent of total human-made emissions.

    In addition, people often light fires to prepare tropical forested areas for agricultural activities, which causes more emissions by releasing mercury stored by vegetation.

    “If deforestation was a country, it would be the second highest emitting country, after China, which emits around 500 tons of mercury a year,” Feinberg adds.

    And since the Minamata Convention is now addressing primary mercury emissions, scientists can expect deforestation to become a larger fraction of human-made emissions in the future.
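
    The arithmetic behind that expectation is straightforward. The sketch below assumes the 10 percent figure is taken of a total that includes deforestation, which implies roughly 2,000 tons of human-made emissions overall; the scenario in which primary emissions are halved is purely hypothetical and is not a projection from the study.

```python
def deforestation_share(deforestation_tons: float, primary_tons: float) -> float:
    """Deforestation emissions as a fraction of all human-made emissions."""
    return deforestation_tons / (deforestation_tons + primary_tons)


deforestation = 200.0                       # tons per year, from the study
implied_total = deforestation / 0.10        # assumes 10% of a total that includes deforestation
primary = implied_total - deforestation     # everything else

share_today = deforestation_share(deforestation, primary)
# Hypothetical scenario: primary emissions halved, deforestation unchanged.
share_if_primary_halved = deforestation_share(deforestation, primary / 2)

print(round(share_today, 3), round(share_if_primary_halved, 3))
```

    Under these assumptions, holding deforestation emissions constant while primary emissions fall by half would raise the deforestation share from about 10 percent to about 18 percent.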

    “Policies to protect forests or cut them down have unintended effects beyond their target. It is important to consider the fact that these are systems, and they involve human activities, and we need to understand them better in order to actually solve the problems that we know are out there,” Selin says.

    By providing this first estimate, the team hopes to inspire more research in this area.

    In the future, they want to incorporate more dynamic Earth system models into their analysis, which would enable them to interactively track mercury uptake and better model the timescale of vegetation regrowth.

    “This paper represents an important advance in our understanding of global mercury cycling by quantifying a pathway that has long been suggested but not yet quantified. Much of our research to date has focused on primary anthropogenic emissions, those directly resulting from human activity via coal combustion or mercury-gold amalgam burning in artisanal and small-scale gold mining,” says Jackie Gerson, an assistant professor in the Department of Earth and Environmental Sciences at Michigan State University, who was not involved with this research. “This research shows that deforestation can also result in substantial mercury emissions and needs to be considered both in terms of global mercury models and land management policies. It therefore has the potential to advance our field scientifically as well as to promote policies that reduce mercury emissions via deforestation.”

    This work was funded, in part, by the U.S. National Science Foundation, the Swiss National Science Foundation, and Swiss Federal Institute of Aquatic Science and Technology.

  • in

    How symmetry can come to the aid of machine learning

    Behrooz Tahmasebi — an MIT PhD student in the Department of Electrical Engineering and Computer Science (EECS) and an affiliate of the Computer Science and Artificial Intelligence Laboratory (CSAIL) — was taking a mathematics course on differential equations in late 2021 when a glimmer of inspiration struck. In that class, he learned for the first time about Weyl’s law, which had been formulated 110 years earlier by the German mathematician Hermann Weyl. Tahmasebi realized it might have some relevance to the computer science problem he was then wrestling with, even though the connection appeared — on the surface — to be thin, at best. Weyl’s law, he says, provides a formula that measures the complexity of the spectral information, or data, contained within the fundamental frequencies of a drum head or guitar string.

    Tahmasebi was, at the same time, thinking about measuring the complexity of the input data to a neural network, wondering whether that complexity could be reduced by taking into account some of the symmetries inherent to the dataset. Such a reduction, in turn, could facilitate — as well as speed up — machine learning processes.

    Weyl’s law, conceived about a century before the boom in machine learning, had traditionally been applied to very different physical situations — such as those concerning the vibrations of a string or the spectrum of electromagnetic (black-body) radiation given off by a heated object. Nevertheless, Tahmasebi believed that a customized version of that law might help with the machine learning problem he was pursuing. And if the approach panned out, the payoff could be considerable.

    He spoke with his advisor, Stefanie Jegelka — an associate professor in EECS and affiliate of CSAIL and the MIT Institute for Data, Systems, and Society — who believed the idea was definitely worth looking into. As Tahmasebi saw it, Weyl’s law had to do with gauging the complexity of data, and so did this project. But Weyl’s law, in its original form, said nothing about symmetry.

    He and Jegelka have now succeeded in modifying Weyl’s law so that symmetry can be factored into the assessment of a dataset’s complexity. “To the best of my knowledge,” Tahmasebi says, “this is the first time Weyl’s law has been used to determine how machine learning can be enhanced by symmetry.”

    The paper he and Jegelka wrote earned a “Spotlight” designation when it was presented at the December 2023 conference on Neural Information Processing Systems — widely regarded as the world’s top conference on machine learning.

    This work, comments Soledad Villar, an applied mathematician at Johns Hopkins University, “shows that models that satisfy the symmetries of the problem are not only correct but also can produce predictions with smaller errors, using a small amount of training points. [This] is especially important in scientific domains, like computational chemistry, where training data can be scarce.”

    In their paper, Tahmasebi and Jegelka explored the ways in which symmetries, or so-called “invariances,” could benefit machine learning. Suppose, for example, the goal of a particular computer run is to pick out every image that contains the numeral 3. That task can be a lot easier, and go a lot quicker, if the algorithm can identify the 3 regardless of where it is placed in the box — whether it’s exactly in the center or off to the side — and whether it is pointed right-side up, upside down, or oriented at a random angle. An algorithm equipped with the latter capability can take advantage of the symmetries of translation and rotations, meaning that a 3, or any other object, is not changed in itself by altering its position or by rotating it around an arbitrary axis. It is said to be invariant to those shifts. The same logic can be applied to algorithms charged with identifying dogs or cats. A dog is a dog is a dog, one might say, irrespective of how it is embedded within an image. 

    The point of the entire exercise, the authors explain, is to exploit a dataset’s intrinsic symmetries in order to reduce the complexity of machine learning tasks. That, in turn, can lead to a reduction in the amount of data needed for learning. Concretely, the new work answers the question: How many fewer data are needed to train a machine learning model if the data contain symmetries?

    There are two ways of achieving a gain, or benefit, by capitalizing on the symmetries present. The first has to do with the size of the sample to be looked at. Let’s imagine that you are charged, for instance, with analyzing an image that has mirror symmetry — the right side being an exact replica, or mirror image, of the left. In that case, you don’t have to look at every pixel; you can get all the information you need from half of the image — a factor of two improvement. If, on the other hand, the image can be partitioned into 10 identical parts, you can get a factor of 10 improvement. This kind of boosting effect is linear.
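    The mirror-symmetry example above can be sketched in a few lines of Python. This is only a toy illustration with made-up pixel values, not code from the paper; the helper names `make_mirror_image` and `left_half_of` are hypothetical.

```python
# Toy illustration: a mirror-symmetric image is fully determined by one half.
# The pixel values and helper names are invented for this sketch.

def make_mirror_image(left_half):
    """Build a full image whose right side mirrors the left side."""
    return [row + row[::-1] for row in left_half]

def left_half_of(image):
    """Keep only the left half of each row."""
    return [row[: len(row) // 2] for row in image]

# Hypothetical 2x3 left half of an image.
left = [[1, 2, 3],
        [4, 5, 6]]
image = make_mirror_image(left)

pixels_full = sum(len(row) for row in image)
pixels_half = sum(len(row) for row in left_half_of(image))

# The full image can be rebuilt from its left half, so no information is lost.
assert make_mirror_image(left_half_of(image)) == image

# Ratio of pixels in the full image to pixels that actually need inspecting.
reduction_factor = pixels_full // pixels_half
print(reduction_factor)
```

    The same reasoning extends to an image made of 10 identical parts: only one part needs to be inspected, which is the linear gain the paragraph describes.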

    To take another example, imagine you are sifting through a dataset, trying to find sequences of blocks that have seven different colors: black, blue, green, purple, red, white, and yellow. Your job becomes much easier if you don’t care about the order in which the blocks are arranged. If the order mattered, there would be 5,040 different arrangements to look for. But if all you care about are sequences of blocks in which all seven colors appear, then you have reduced the number of things, or sequences, you are searching for from 5,040 to just one.
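    The count of 5,040 is 7 factorial, the number of ways to order seven distinct blocks. A short Python check, given here as an illustrative sketch rather than anything from the paper, enumerates the orderings and then discards order:

```python
from itertools import permutations
from math import factorial

COLORS = ("black", "blue", "green", "purple", "red", "white", "yellow")

# Every distinct left-to-right ordering of the seven colored blocks.
ordered = set(permutations(COLORS))

# Ignoring order: each ordering collapses to the same set of colors.
unordered = {frozenset(sequence) for sequence in ordered}

num_ordered = len(ordered)
num_unordered = len(unordered)

# The number of orderings equals 7! by definition of a permutation.
assert num_ordered == factorial(len(COLORS))
print(num_ordered, num_unordered)
```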

    Tahmasebi and Jegelka discovered that it is possible to achieve a different kind of gain — one that is exponential — that can be reaped for symmetries that operate over many dimensions. This advantage is related to the notion that the complexity of a learning task grows exponentially with the dimensionality of the data space. Making use of a multidimensional symmetry can therefore yield a disproportionately large return. “This is a new contribution that is basically telling us that symmetries of higher dimension are more important because they can give us an exponential gain,” Tahmasebi says. 

    The NeurIPS 2023 paper that he wrote with Jegelka contains two theorems that were proved mathematically. “The first theorem shows that an improvement in sample complexity is achievable with the general algorithm we provide,” Tahmasebi says. The second theorem complements the first, he adds, “showing that this is the best possible gain you can get; nothing else is achievable.”

    He and Jegelka have provided a formula that predicts the gain one can obtain from a particular symmetry in a given application. A virtue of this formula is its generality, Tahmasebi notes. “It works for any symmetry and any input space.” It works not only for symmetries that are known today, but it could also be applied in the future to symmetries that are yet to be discovered. The latter prospect is not too farfetched to consider, given that the search for new symmetries has long been a major thrust in physics. That suggests that, as more symmetries are found, the methodology introduced by Tahmasebi and Jegelka should only get better over time.

    According to Haggai Maron, a computer scientist at Technion (the Israel Institute of Technology) and NVIDIA who was not involved in the work, the approach presented in the paper “diverges substantially from related previous works, adopting a geometric perspective and employing tools from differential geometry. This theoretical contribution lends mathematical support to the emerging subfield of ‘Geometric Deep Learning,’ which has applications in graph learning, 3D data, and more. The paper helps establish a theoretical basis to guide further developments in this rapidly expanding research area.”

  • in

    New model predicts how shoe properties affect a runner’s performance

    A good shoe can make a huge difference for runners, from career marathoners to couch-to-5K first-timers. But every runner is unique, and a shoe that works for one might trip up another. Outside of trying on a rack of different designs, there’s no quick and easy way to know which shoe best suits a person’s particular running style.

    MIT engineers are hoping to change that with a new model that predicts how certain shoe properties will affect a runner’s performance.

    The simple model incorporates a person’s height, weight, and other general dimensions, along with shoe properties such as stiffness and springiness along the midsole. With this input, the model then simulates a person’s running gait, or how they would run, in a particular shoe.

    Using the model, the researchers can simulate how a runner’s gait changes with different shoe types. They can then pick out the shoe that produces the best performance, which they define as the degree to which a runner’s expended energy is minimized.

    While the model can accurately simulate changes in a runner’s gait when comparing two very different shoe types, it is less discerning when comparing relatively similar designs, including most commercially available running shoes. For this reason, the researchers envision the current model would be best used as a tool for shoe designers looking to push the boundaries of sneaker design.

    “Shoe designers are starting to 3D print shoes, meaning they can now make them with a much wider range of properties than with just a regular slab of foam,” says Sarah Fay, a postdoc in MIT’s Sports Lab and the Institute for Data, Systems, and Society (IDSS). “Our model could help them design really novel shoes that are also high-performing.”

    The team is planning to improve the model, in hopes that consumers can one day use a similar version to pick shoes that fit their personal running style.

    “We’ve allowed for enough flexibility in the model that it can be used to design custom shoes and understand different individual behaviors,” Fay says. “Way down the road, we imagine that if you send us a video of yourself running, we could 3D print the shoe that’s right for you. That would be the moonshot.”

    The new model is reported in a study appearing this month in the Journal of Biomechanical Engineering. The study is authored by Fay and Anette “Peko” Hosoi, professor of mechanical engineering at MIT.

    Running, revamped

    The team’s new model grew out of talks with collaborators in the sneaker industry, where designers have started to 3D print shoes at commercial scale. These designs incorporate 3D-printed midsoles that resemble intricate scaffolds, the geometry of which can be tailored to give a certain bounce or stiffness in specific locations across the sole.

    “With 3D printing, designers can tune everything about the material response locally,” Hosoi says. “And they came to us and essentially said, ‘We can do all these things. What should we do?’”

    “Part of the design problem is to predict what a runner will do when you put an entirely new shoe on them,” Fay adds. “You have to couple the dynamics of the runner with the properties of the shoe.”

    Fay and Hosoi looked first to represent a runner’s dynamics using a simple model. They drew inspiration from Thomas McMahon, a leader in the study of biomechanics at Harvard University, who in the 1970s used a very simple “spring and damper” model to capture a runner’s essential gait mechanics. Using this gait model, he predicted how fast a person could run on various track types, from traditional concrete surfaces to more rubbery material. The model showed that runners should run faster on softer, bouncier tracks that supported a runner’s natural gait.

    Though this may be unsurprising today, the insight was a revelation at the time, prompting Harvard to revamp its indoor track, a move that quickly yielded new track records, as runners found they could run much faster on the softer, springier surface.

    “McMahon’s work showed that, even if we don’t model every single limb and muscle and component of the human body, we’re still able to create meaningful insights in terms of how we design for athletic performance,” Fay says.

    Gait cost

    Following McMahon’s lead, Fay and Hosoi developed a similar, simplified model of a runner’s dynamics. The model represents a runner as a center of mass, with a hip that can rotate and a leg that can stretch. The leg is connected to a box-like shoe, with springiness and shock absorption that can be tuned, both vertically and horizontally.

    They reasoned that they should be able to input into the model a person’s basic dimensions, such as their height, weight, and leg length, along with a shoe’s material properties, such as the stiffness of the front and back midsole, and use the model to simulate what a person’s gait is likely to be when running in that shoe.

    But they also realized that a person’s gait can depend on a less definable property, which they call the “biological cost function” — a quality that a runner might not consciously be aware of but nevertheless may try to minimize whenever they run. The team reasoned that if they can identify a biological cost function that is general to most runners, then they might predict not only a person’s gait for a given shoe but also which shoe produces the gait corresponding to the best running performance.

    With this in mind, the team looked to a previous treadmill study, which recorded detailed measurements of runners, such as the force of their impacts, the angle and motion of their joints, the spring in their steps, and the work of their muscles as they ran, each in the same type of running shoe.

    Fay and Hosoi hypothesized that each runner’s actual gait arose not only from their personal dimensions and shoe properties, but also from a subconscious goal to minimize one or more biological measures, yet unknown. To reveal these measures, the team used their model to simulate each runner’s gait multiple times. Each time, they programmed the model to assume the runner minimized a different biological cost, such as the degree to which they swing their leg or the impact that they make with the treadmill. They then compared the modeled gait with the runner’s actual gait to see which modeled gait, and assumed cost, best matched the actual gait.

    In the end, the team found that most runners tend to minimize two costs: the impact their feet make with the treadmill and the amount of energy their legs expend.
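    The matching procedure described above can be sketched as a small search: for each candidate cost, find the gait that minimizes it, then keep the cost whose predicted gait lies closest to the measured one. The Python below is a toy under loud assumptions and is not the researchers’ model. The gait is reduced to a single hypothetical number (a step rate), and the three cost functions, the candidate grid, and the observed value are all invented for illustration.

```python
# Toy sketch of cost-function matching. Everything here is hypothetical:
# a "gait" is a single step rate, and the cost formulas are made up.

def best_gait(cost, candidates):
    """Return the candidate gait that minimizes the given cost."""
    return min(candidates, key=cost)

def match_cost_to_runner(observed_gait, costs, candidates):
    """Find which assumed cost predicts a gait closest to the observed one."""
    errors = {}
    for name, cost in costs.items():
        predicted = best_gait(cost, candidates)
        errors[name] = abs(predicted - observed_gait)
    best_name = min(errors, key=errors.get)
    return best_name, errors

# Candidate step rates from 2.0 to 4.0 steps per second (invented range).
candidates = [round(2.0 + 0.1 * i, 1) for i in range(21)]

# Invented stand-ins for candidate biological costs.
costs = {
    "leg_swing": lambda f: f ** 2,                # lowest at the slowest rate
    "impact": lambda f: 1.0 / f,                  # lowest at the fastest rate
    "impact_plus_energy": lambda f: 9.0 / f + f,  # lowest in between
}

observed = 2.9  # made-up measurement for one runner
best_name, errors = match_cost_to_runner(observed, costs, candidates)
print(best_name)
```

    In the actual study the simulated gait has many more degrees of freedom and the comparison uses detailed treadmill measurements; the sketch only shows the shape of the selection loop.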

    “If we tell our model, ‘Optimize your gait on these two things,’ it gives us really realistic-looking gaits that best match the data we have,” Fay explains. “This gives us confidence that the model can predict how people will actually run, even if we change their shoe.”

    As a final step, the researchers simulated a wide range of shoe styles and used the model to predict a runner’s gait and how efficient each gait would be for a given type of shoe.

    “In some ways, this gives you a quantitative way to design a shoe for a 10K versus a marathon shoe,” Hosoi says. “Designers have an intuitive sense for that. But now we have a mathematical understanding that we hope designers can use as a tool to kickstart new ideas.”

    This research is supported, in part, by adidas.

  • in

    Blueprint Labs launches a charter school research collaborative

    Over the past 30 years, charter schools have emerged as a prominent yet debated public school option. According to the National Center for Education Statistics, 7 percent of U.S. public school students were enrolled in charter schools in 2021, up from 4 percent in 2010. Amid this expansion, families and policymakers want to know more about charter school performance and its systemic impacts. While researchers have evaluated charter schools’ short-term effects on student outcomes, significant knowledge gaps still exist. 

    MIT Blueprint Labs aims to fill those gaps through its Charter School Research Collaborative, an initiative that brings together practitioners, policymakers, researchers, and funders to make research on charter schools more actionable, rigorous, and efficient. The collaborative will create infrastructure to streamline and fund high-quality, policy-relevant charter research. 

    Joshua Angrist, MIT Ford Professor of Economics and a Blueprint Labs co-founder and director, says that Blueprint Labs hopes “to increase [its] impact by working with a larger group of academic and practitioner partners.” Blueprint is a nonpartisan research lab whose mission is to produce the most rigorous evidence possible to inform policy and practice. Angrist notes, “The debate over charter schools is not always fact-driven. Our goal at the lab is to bring convincing evidence into these discussions.”

    Collaborative kickoff

    The collaborative launched with a two-day kickoff in November. Blueprint Labs welcomed researchers, practitioners, funders, and policymakers to MIT to lay the groundwork for the collaborative. Over 80 participants joined the event, including leaders of charter school organizations, researchers at top universities and institutes, and policymakers and advocates from a variety of organizations and education agencies. 

    Through a series of panels, presentations, and conversations, participants including Rhode Island Department of Education Commissioner Angélica Infante-Green, CEO of Noble Schools Constance Jones, former Knowledge is Power Program CEO Richard Barth, president and CEO of National Association of Charter School Authorizers Karega Rausch, and many others discussed critical topics in the charter school space. These conversations influenced the collaborative’s research agenda. 

    Several sessions also highlighted how to ensure that the research process includes diverse voices to generate actionable evidence. Panelists noted that researchers should be aware of the demands placed on practitioners and should carefully consider community contexts. In addition, collaborators should treat each other as equal partners. 

    Parag Pathak, the Class of 1922 Professor of Economics at MIT and a Blueprint Labs co-founder and director, explained the kickoff’s aims. “One of our goals today is to begin to forge connections between [attendees]. We hope that [their] conversations are the launching point for future collaborations,” he stated. Pathak also shared the next steps for the collaborative: “Beginning next year, we’ll start investing in new research using the agenda [developed at this event] as our guide. We will also support new partnerships between researchers and practitioners.”

    Research agenda

    The discussions at the kickoff informed the collaborative’s research agenda. A recent paper summarizing existing lottery-based research on charter school effectiveness by Sarah Cohodes, an associate professor of public policy at the University of Michigan, and Susha Roy, an associate policy researcher at the RAND Corp., also guides the agenda. Their review finds that in randomized evaluations, many charter schools increase students’ academic achievement. However, researchers have not yet studied charter schools’ impacts on long-term, behavioral, or health outcomes in depth, and rigorous, lottery-based research is currently limited to a handful of urban centers. 

    The current research agenda focuses on seven topics:

    the long-term effects of charter schools;
    the effect of charters on non-test score outcomes;
    which charter school practices have the largest effect on performance;
    how charter performance varies across different contexts;
    how charter school effects vary with demographic characteristics and student background;
    how charter schools impact non-student outcomes, like teacher retention; and
    how system-level factors, such as authorizing practices, impact charter school performance.

    As diverse stakeholders’ priorities continue to shift and the collaborative progresses, the research agenda will continue to evolve.

    Information for interested partners

    Opportunities exist for charter leaders, policymakers, researchers, and funders to engage with the collaborative. Stakeholders can apply for funding, help shape the research agenda, and develop new research partnerships. A competitive funding process will open this month.

    Those interested in receiving updates on the collaborative can fill out this form. Please direct questions to chartercollab@mitblueprintlabs.org.

  • in

    New hope for early pancreatic cancer intervention via AI-based risk prediction

    The first documented case of pancreatic cancer dates back to the 18th century. Since then, researchers have undertaken a protracted and challenging odyssey to understand the elusive and deadly disease. To date, there is no better cancer treatment than early intervention. Unfortunately, the pancreas, nestled deep within the abdomen, is particularly elusive for early detection. 

    MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) scientists, alongside Limor Appelbaum, a staff scientist in the Department of Radiation Oncology at Beth Israel Deaconess Medical Center (BIDMC), were eager to better identify potential high-risk patients. They set out to develop two machine-learning models for early detection of pancreatic ductal adenocarcinoma (PDAC), the most common form of the cancer. To access a broad and diverse database, the team synced up with a federated network company, using electronic health record data from various institutions across the United States. This vast pool of data helped ensure the models’ reliability and generalizability, making them applicable across a wide range of populations, geographical locations, and demographic groups.

    The two models, the “PRISM” neural network and the logistic regression model (a statistical technique for probability), outperformed current methods. The team’s comparison showed that while standard screening criteria identify about 10 percent of PDAC cases using a five-times-higher relative risk threshold, PRISM can detect 35 percent of PDAC cases at this same threshold.

    Using AI to detect cancer risk is not a new phenomenon: algorithms analyze mammograms and CT scans for lung cancer, and assist in the analysis of Pap smear tests and HPV testing, to name a few applications. “The PRISM models stand out for their development and validation on an extensive database of over 5 million patients, surpassing the scale of most prior research in the field,” says Kai Jia, an MIT PhD student in electrical engineering and computer science (EECS), MIT CSAIL affiliate, and first author on an open-access paper in eBioMedicine outlining the new work. “The model uses routine clinical and lab data to make its predictions, and the diversity of the U.S. population is a significant advancement over other PDAC models, which are usually confined to specific geographic regions, like a few health-care centers in the U.S. Additionally, using a unique regularization technique in the training process enhanced the models’ generalizability and interpretability.”

    “This report outlines a powerful approach to use big data and artificial intelligence algorithms to refine our approach to identifying risk profiles for cancer,” says David Avigan, a Harvard Medical School professor and the cancer center director and chief of hematology and hematologic malignancies at BIDMC, who was not involved in the study. “This approach may lead to novel strategies to identify patients with high risk for malignancy that may benefit from focused screening with the potential for early intervention.” 

    Prismatic perspectives

    The journey toward the development of PRISM began over six years ago, fueled by firsthand experiences with the limitations of current diagnostic practices. “Approximately 80-85 percent of pancreatic cancer patients are diagnosed at advanced stages, where cure is no longer an option,” says senior author Appelbaum, who is also a Harvard Medical School instructor and a radiation oncologist. “This clinical frustration sparked the idea to delve into the wealth of data available in electronic health records (EHRs).”

    The CSAIL group’s close collaboration with Appelbaum made it possible to better understand the combined medical and machine learning aspects of the problem, eventually leading to a much more accurate and transparent model. “The hypothesis was that these records contained hidden clues: subtle signs and symptoms that could act as early warning signals of pancreatic cancer,” she adds. “This guided our use of federated EHR networks in developing these models, for a scalable approach for deploying risk prediction tools in health care.”

    Both PrismNN and PrismLR models analyze EHR data, including patient demographics, diagnoses, medications, and lab results, to assess PDAC risk. PrismNN uses artificial neural networks to detect intricate patterns in data features like age, medical history, and lab results, yielding a risk score for PDAC likelihood. PrismLR uses logistic regression for a simpler analysis, generating a probability score of PDAC based on these features. Together, the models offer a thorough evaluation of different approaches in predicting PDAC risk from the same EHR data.
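    To make the logistic regression idea concrete, here is a minimal Python sketch of how such a model turns a handful of features into a probability. It is an illustration only, not the PrismLR model: the feature names, weights, and bias below are hypothetical placeholders and carry no clinical meaning.

```python
import math

def logistic_risk(features, weights, bias):
    """Map a weighted sum of features to a probability between 0 and 1."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights and bias, invented for this sketch.
# They are NOT taken from the PRISM paper.
weights = {"older_age": 0.8, "diagnosis_flag": 1.1, "abnormal_lab": 0.5}
bias = -4.0

# Two made-up patients: one with no flagged features, one with all three.
no_flags = {"older_age": 0, "diagnosis_flag": 0, "abnormal_lab": 0}
all_flags = {"older_age": 1, "diagnosis_flag": 1, "abnormal_lab": 1}

low_risk = logistic_risk(no_flags, weights, bias)
high_risk = logistic_risk(all_flags, weights, bias)
print(low_risk < high_risk)
```

    Because each feature contributes through a single weight, a reader can see directly which inputs push the score up or down, which is one reason logistic regression is usually considered easier to interpret than a neural network.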

    One key to gaining the trust of physicians, the team notes, is a better understanding of how the models work, a property known in the field as interpretability. The scientists pointed out that while logistic regression models are inherently easier to interpret, recent advancements have made deep neural networks somewhat more transparent. This helped the team to refine the thousands of potentially predictive features derived from the EHR of a single patient to approximately 85 critical indicators. These indicators, which include patient age, diabetes diagnosis, and an increased frequency of visits to physicians, are automatically discovered by the model but match physicians’ understanding of risk factors associated with pancreatic cancer.

    The path forward

    Despite the promise of the PRISM models, as with all research, some parts are still a work in progress. U.S. data alone are the current diet for the models, necessitating testing and adaptation for global use. The path forward, the team notes, includes expanding the model’s applicability to international datasets and integrating additional biomarkers for more refined risk assessment.

    “A subsequent aim for us is to facilitate the models’ implementation in routine health care settings. The vision is to have these models function seamlessly in the background of health care systems, automatically analyzing patient data and alerting physicians to high-risk cases without adding to their workload,” says Jia. “A machine-learning model integrated with the EHR system could empower physicians with early alerts for high-risk patients, potentially enabling interventions well before symptoms manifest. We are eager to deploy our techniques in the real world to help all individuals enjoy longer, healthier lives.” 

    Jia wrote the paper alongside Appelbaum and MIT EECS Professor and CSAIL Principal Investigator Martin Rinard, who are both senior authors of the paper. Researchers on the paper were supported during their time at MIT CSAIL, in part, by the Defense Advanced Research Projects Agency, Boeing, the National Science Foundation, and Aarno Labs. TriNetX provided resources for the project, and the Prevent Cancer Foundation also supported the team.