  • in

    Putting AI into the hands of people with problems to solve

    As Media Lab students in 2010, Karthik Dinakar SM ’12, PhD ’17 and Birago Jones SM ’12 teamed up for a class project to build a tool that would help content moderation teams at companies like Twitter (now X) and YouTube. The project generated a huge amount of excitement, and the researchers were invited to give a demonstration at a cyberbullying summit at the White House — they just had to get the thing working.

    The day before the White House event, Dinakar spent hours trying to put together a working demo that could identify concerning posts on Twitter. Around 11 p.m., he called Jones to say he was giving up.

    Then Jones decided to look at the data. It turned out Dinakar’s model was flagging the right types of posts, but the posters were using teenage slang terms and other indirect language that Dinakar didn’t pick up on. The problem wasn’t the model; it was the disconnect between Dinakar and the teens he was trying to help.

    “We realized then, right before we got to the White House, that the people building these models should not be folks who are just machine-learning engineers,” Dinakar says. “They should be people who best understand their data.”

    The insight led the researchers to develop point-and-click tools that allow nonexperts to build machine-learning models. Those tools became the basis for Pienso, which today is helping people build large language models for detecting misinformation, human trafficking, weapons sales, and more, without writing any code.

    “These kinds of applications are important to us because our roots are in cyberbullying and understanding how to use AI for things that really help humanity,” says Jones.

    As for the early version of the system shown at the White House, the founders ended up collaborating with students at nearby schools in Cambridge, Massachusetts, to let them train the models.

    “The models those kids trained were so much better and nuanced than anything I could’ve ever come up with,” Dinakar says. “Birago and I had this big ‘Aha!’ moment where we realized empowering domain experts — which is different from democratizing AI — was the best path forward.”

    A project with purpose

    Jones and Dinakar met as graduate students in the Software Agents research group of the MIT Media Lab. Their work on what became Pienso started in Course 6.864 (Natural Language Processing) and continued until they earned their master’s degrees in 2012.

    It turned out 2010 wasn’t the last time the founders were invited to the White House to demo their project. The work generated a lot of enthusiasm, but the founders worked on Pienso part time until 2016, when Dinakar finished his PhD at MIT and deep learning began to explode in popularity.

    “We’re still connected to many people around campus,” Dinakar says. “The exposure we had at MIT, the melding of human and computer interfaces, widened our understanding. Our philosophy at Pienso couldn’t be possible without the vibrancy of MIT’s campus.”

    The founders also credit MIT’s Industrial Liaison Program (ILP) and Startup Accelerator (STEX) for connecting them to early partners.

    One early partner was SkyUK. The company’s customer success team used Pienso to build models to understand their customers’ most common problems. Today those models are helping to process half a million customer calls a day, and the founders say they have saved the company over £7 million to date by shortening the length of calls into the company’s call center.

    “The difference between democratizing AI and empowering people with AI comes down to who understands the data best — you or a doctor or a journalist or someone who works with customers every day?” Jones says. “Those are the people who should be creating the models. That’s how you get insights out of your data.”

    In 2020, just as Covid-19 outbreaks began in the U.S., government officials contacted the founders to use their tool to better understand the emerging disease. Pienso helped experts in virology and infectious disease set up machine-learning models to mine thousands of research articles about coronaviruses. Dinakar says they later learned the work helped the government identify and strengthen critical supply chains for drugs, including the popular antiviral remdesivir.

    “Those compounds were surfaced by a team that did not know deep learning but was able to use our platform,” Dinakar says.

    Building a better AI future

    Because Pienso can run on internal servers and cloud infrastructure, the founders say it offers an alternative for businesses being forced to donate their data by using services offered by other AI companies.

    “The Pienso interface is a series of web apps stitched together,” Dinakar explains. “You can think of it like an Adobe Photoshop for large language models, but in the web. You can point and import data without writing a line of code. You can refine the data, prepare it for deep learning, analyze it, give it structure if it’s not labeled or annotated, and you can walk away with a fine-tuned large language model in a matter of 25 minutes.”

    Earlier this year, Pienso announced a partnership with GraphCore, which provides a faster, more efficient computing platform for machine learning. The founders say the partnership will further lower barriers to leveraging AI by dramatically reducing latency.

    “If you’re building an interactive AI platform, users aren’t going to have a cup of coffee every time they click a button,” Dinakar says. “It needs to be fast and responsive.”

    The founders believe their solution is enabling a future where more effective AI models are developed for specific use cases by the people who are most familiar with the problems they are trying to solve.

    “No one model can do everything,” Dinakar says. “Everyone’s application is different, their needs are different, their data is different. It’s highly unlikely that one model will do everything for you. It’s about bringing a garden of models together and allowing them to collaborate with each other and orchestrating them in a way that makes sense, and the people doing that orchestration should be the people who understand the data best.”

  • in

    Automated method helps researchers quantify uncertainty in their predictions

    Pollsters trying to predict presidential election results and physicists searching for distant exoplanets have at least one thing in common: They often use a tried-and-true scientific technique called Bayesian inference.

    Bayesian inference allows these scientists to effectively estimate some unknown parameter — like the winner of an election — from data such as poll results. But Bayesian inference can be slow, sometimes consuming weeks or even months of computation time or requiring a researcher to spend hours deriving tedious equations by hand. 

    Researchers from MIT and elsewhere have introduced an optimization technique that speeds things up without requiring a scientist to do a lot of additional work. Their method can achieve more accurate results faster than another popular approach for accelerating Bayesian inference.

    Using this new automated technique, a scientist could simply input their model and then the optimization method does all the calculations under the hood to provide an approximation of some unknown parameter. The method also offers reliable uncertainty estimates that can help a researcher understand when to trust its predictions.

    This versatile technique could be applied to a wide array of scientific quandaries that incorporate Bayesian inference. For instance, it could be used by economists studying the impact of microcredit loans in developing nations or sports analysts using a model to rank top tennis players.

    “When you actually dig into what people are doing in the social sciences, physics, chemistry, or biology, they are often using a lot of the same tools under the hood. There are so many Bayesian analyses out there. If we can build a really great tool that makes these researchers’ lives easier, then we can really make a difference to a lot of people in many different research areas,” says senior author Tamara Broderick, an associate professor in MIT’s Department of Electrical Engineering and Computer Science (EECS) and a member of the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society.

    Broderick is joined on the paper by co-lead authors Ryan Giordano, an assistant professor of statistics at the University of California at Berkeley; and Martin Ingram, a data scientist at the AI company KONUX. The paper was recently published in the Journal of Machine Learning Research.

    Faster results

    When researchers seek a faster form of Bayesian inference, they often turn to a technique called automatic differentiation variational inference (ADVI), which is typically both fast to run and easy to use.

    But Broderick and her collaborators have found a number of practical issues with ADVI. It has to solve an optimization problem and can do so only approximately. So, ADVI can still require a lot of computation time and user effort to determine whether the approximate solution is good enough. And once it arrives at a solution, it tends to provide poor uncertainty estimates.

    Rather than reinventing the wheel, the team took many ideas from ADVI but turned them around to create a technique called deterministic ADVI (DADVI) that doesn’t have these downsides.

    With DADVI, it is very clear when the optimization is finished, so a user won’t need to spend extra computation time to ensure that the best solution has been found. DADVI also permits the incorporation of more powerful optimization methods that give it an additional speed and performance boost.

    Once it reaches a result, DADVI is set up to allow the use of uncertainty corrections. These corrections make its uncertainty estimates much more accurate than those of ADVI.

    DADVI also enables the user to clearly see how much error they have incurred in the approximation to the optimization problem. This prevents a user from needlessly running the optimization again and again with more and more resources to try to reduce the error.

    “We wanted to see if we could live up to the promise of black-box inference in the sense of, once the user makes their model, they can just run Bayesian inference and don’t have to derive everything by hand, they don’t need to figure out when to stop their algorithm, and they have a sense of how accurate their approximate solution is,” Broderick says.

    Defying conventional wisdom

    DADVI can be more effective than ADVI because it uses an efficient approximation method, called sample average approximation, which estimates an unknown quantity by taking a series of exact steps.

    Because the steps along the way are exact, it is clear when the objective has been reached. Plus, getting to that objective typically requires fewer steps.

    Often, researchers expect sample average approximation to be more computationally intensive than a more popular method, known as stochastic gradient, which is used by ADVI. But Broderick and her collaborators showed that, in many applications, this is not the case.

    “A lot of problems really do have special structure, and you can be so much more efficient and get better performance by taking advantage of that special structure. That is something we have really seen in this paper,” she adds.
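
    The sample average approximation idea described above can be illustrated with a minimal sketch. This is not the authors’ implementation: the toy Gaussian target, the number of draws, and the plain gradient-descent optimizer are all assumptions chosen to keep the example short. The point it shows is that once the random draws are frozen, the objective becomes an ordinary deterministic function, so there is an unambiguous stopping rule (the gradient is numerically zero).

```python
import math
import random

# Toy "posterior" (an assumption for illustration): Gaussian, mean 3, variance 4.
TARGET_MEAN, TARGET_VAR = 3.0, 4.0


def make_saa_objective(num_draws=30, seed=0):
    """Build a deterministic objective from one fixed set of standard normal draws."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(num_draws)]  # drawn once, then frozen

    def objective(mu, log_sigma):
        # Negative ELBO (up to a constant) for q = Normal(mu, sigma^2),
        # with the expectation replaced by an average over the frozen draws.
        sigma = math.exp(log_sigma)
        quad = sum((mu + sigma * zi - TARGET_MEAN) ** 2 for zi in z) / len(z)
        return quad / (2.0 * TARGET_VAR) - log_sigma

    def gradient(mu, log_sigma):
        sigma = math.exp(log_sigma)
        resid = [mu + sigma * zi - TARGET_MEAN for zi in z]
        d_mu = sum(resid) / len(z) / TARGET_VAR
        d_ls = sum(r * sigma * zi for r, zi in zip(resid, z)) / len(z) / TARGET_VAR - 1.0
        return d_mu, d_ls

    return z, objective, gradient


def fit(gradient, step=0.2, tol=1e-9, max_iter=100_000):
    """Plain gradient descent; stops when the exact gradient is numerically zero."""
    mu, log_sigma = 0.0, 0.0
    for iteration in range(max_iter):
        d_mu, d_ls = gradient(mu, log_sigma)
        if math.hypot(d_mu, d_ls) < tol:
            return mu, math.exp(log_sigma), iteration
        mu -= step * d_mu
        log_sigma -= step * d_ls
    raise RuntimeError("optimizer did not converge")


z, objective, gradient = make_saa_objective()
mu_hat, sigma_hat, iterations = fit(gradient)
print(f"mu={mu_hat:.4f} sigma={sigma_hat:.4f} after {iterations} steps")
```

    A stochastic-gradient method would instead redraw the random numbers at every step, so its objective never settles to a single value and deciding when to stop becomes a judgment call. With frozen draws, any standard deterministic optimizer can be swapped in for the simple loop used here.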

    They tested DADVI on a number of real-world models and datasets, including a model used by economists to evaluate the effectiveness of microcredit loans and one used in ecology to determine whether a species is present at a particular site.

    Across the board, they found that DADVI can estimate unknown parameters faster and more reliably than other methods, and achieves accuracy as good as or better than ADVI. Because it is easier to use than other techniques, DADVI could offer a boost to scientists in a wide variety of fields.

    In the future, the researchers want to dig deeper into correction methods for uncertainty estimates so they can better understand why these corrections can produce such accurate uncertainties, and when they could fall short.

    “In applied statistics, we often have to use approximate algorithms for problems that are too complex or high-dimensional to allow exact solutions to be computed in reasonable time. This new paper offers an interesting set of theory and empirical results that point to an improvement in a popular existing approximate algorithm for Bayesian inference,” says Andrew Gelman ’85, ’86, a professor of statistics and political science at Columbia University, who was not involved with the study. “As one of the team involved in the creation of that earlier work, I’m happy to see our algorithm superseded by something more stable.”

    This research was supported by a National Science Foundation CAREER Award and the U.S. Office of Naval Research.

  • in

    MIT researchers remotely map crops, field by field

    Crop maps help scientists and policymakers track global food supplies and estimate how they might shift with climate change and growing populations. But getting accurate maps of the types of crops that are grown from farm to farm often requires on-the-ground surveys that only a handful of countries have the resources to maintain.

    Now, MIT engineers have developed a method to quickly and accurately label and map crop types without requiring in-person assessments of every single farm. The team’s method uses a combination of Google Street View images, machine learning, and satellite data to automatically determine the crops grown throughout a region, from one fraction of an acre to the next. 

    The researchers used the technique to automatically generate the first nationwide crop map of Thailand — a smallholder country where small, independent farms make up the predominant form of agriculture. The team created a border-to-border map of Thailand’s four major crops — rice, cassava, sugarcane, and maize — and determined which of the four types was grown, at every 10 meters, and without gaps, across the entire country. The resulting map achieved an accuracy of 93 percent, which the researchers say is comparable to on-the-ground mapping efforts in high-income, big-farm countries.

    The team is applying their mapping technique to other countries such as India, where small farms sustain most of the population but the type of crops grown from farm to farm has historically been poorly recorded.

    “It’s a longstanding gap in knowledge about what is grown around the world,” says Sherrie Wang, the d’Arbeloff Career Development Assistant Professor in MIT’s Department of Mechanical Engineering, and the Institute for Data, Systems, and Society (IDSS). “The final goal is to understand agricultural outcomes like yield, and how to farm more sustainably. One of the key preliminary steps is to map what is even being grown — the more granularly you can map, the more questions you can answer.”

    Wang, along with MIT graduate student Jordi Laguarta Soler and Thomas Friedel of the agtech company PEAT GmbH, will present a paper detailing their mapping method later this month at the AAAI Conference on Artificial Intelligence.

    Ground truth

    Smallholder farms are often run by a single family or farmer, who subsist on the crops and livestock that they raise. It’s estimated that smallholder farms support two-thirds of the world’s rural population and produce 80 percent of the world’s food. Keeping tabs on what is grown and where is essential to tracking and forecasting food supplies around the world. But the majority of these small farms are in low- to middle-income countries, where few resources are devoted to keeping track of individual farms’ crop types and yields.

    Crop mapping efforts are mainly carried out in high-income regions such as the United States and Europe, where government agricultural agencies oversee crop surveys and send assessors to farms to label crops from field to field. These “ground truth” labels are then fed into machine-learning models that make connections between the ground labels of actual crops and satellite signals of the same fields. They then label and map wider swaths of farmland that assessors don’t cover but that satellites automatically do.

    “What’s lacking in low- and middle-income countries is this ground label that we can associate with satellite signals,” Laguarta Soler says. “Getting these ground truths to train a model in the first place has been limited in most of the world.”

    The team realized that, while many developing countries do not have the resources to maintain crop surveys, they could potentially use another source of ground data: roadside imagery, captured by services such as Google Street View and Mapillary, which send cars throughout a region to take continuous 360-degree images with dashcams and rooftop cameras.

    In recent years, such services have been able to access low- and middle-income countries. While the goal of these services is not specifically to capture images of crops, the MIT team saw that they could search the roadside images to identify crops.

    Cropped image

    In their new study, the researchers worked with Google Street View (GSV) images taken throughout Thailand — a country that the service has recently imaged fairly thoroughly, and which consists predominantly of smallholder farms.

    Starting with over 200,000 GSV images randomly sampled across Thailand, the team filtered out images that depicted buildings, trees, and general vegetation. About 81,000 images were crop-related. They set aside 2,000 of these, which they sent to an agronomist, who determined and labeled each crop type by eye. They then trained a convolutional neural network to automatically generate crop labels for the other 79,000 images, using various training methods, including iNaturalist, a web-based crowdsourced biodiversity database, and GPT-4V, a “multimodal large language model” that enables a user to input an image and ask the model to identify what the image is depicting. For each of the 81,000 images, the model generated a label of one of four crops that the image was likely depicting: rice, maize, sugarcane, or cassava.

    The researchers then paired each labeled image with the corresponding satellite data taken of the same location throughout a single growing season. These satellite data include measurements across multiple wavelengths, such as a location’s greenness and its reflectivity (which can be a sign of water). 

    “Each type of crop has a certain signature across these different bands, which changes throughout a growing season,” Laguarta Soler notes.

    The team trained a second model to make associations between a location’s satellite data and its corresponding crop label. They then used this model to process satellite data taken of the rest of the country, where crop labels were not generated or available. From the associations that the model learned, it then assigned crop labels across Thailand, generating a country-wide map of crop types, at a resolution of 10 meters.
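
    The second stage described above, learning from labeled locations and then labeling everywhere else, can be sketched in a few lines. The article does not say what kind of model the team used, so the nearest-centroid classifier below is a stand-in chosen for brevity, and the greenness values are made-up numbers, not real satellite measurements.

```python
from collections import defaultdict


def train_centroids(labeled_series):
    """Average the seasonal time series of each crop label into one 'signature'."""
    sums = {}
    counts = defaultdict(int)
    for label, series in labeled_series:
        if label not in sums:
            sums[label] = [0.0] * len(series)
        sums[label] = [a + b for a, b in zip(sums[label], series)]
        counts[label] += 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def predict(centroids, series):
    """Assign the label whose signature is closest in squared distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], series))


# Hypothetical greenness readings at four dates in one growing season.
training = [
    ("rice", [0.2, 0.7, 0.8, 0.3]),
    ("rice", [0.3, 0.6, 0.9, 0.2]),
    ("cassava", [0.3, 0.4, 0.5, 0.6]),
    ("cassava", [0.2, 0.4, 0.6, 0.6]),
]
centroids = train_centroids(training)

# A location with satellite data but no roadside image or label.
unlabeled_pixel = [0.25, 0.65, 0.85, 0.25]
predicted = predict(centroids, unlabeled_pixel)
print(predicted)
```

    Running the same prediction over every satellite pixel in a country is what would turn a sparse set of roadside labels into a gap-free map.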

    This first-of-its-kind crop map included locations corresponding to the 2,000 GSV images that the researchers originally set aside, which were labeled by the agronomist. These human-labeled images were used to validate the map’s labels, and when the team looked to see whether the map’s labels matched the expert, “gold standard” labels, they did so 93 percent of the time.
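
    The validation step amounts to counting how often the map’s label agrees with the expert’s label at the held-out locations. A minimal sketch follows; the function name and the five example labels are hypothetical, not the study’s data.

```python
def label_agreement(map_labels, expert_labels):
    """Fraction of held-out locations where the map label equals the expert label."""
    if len(map_labels) != len(expert_labels):
        raise ValueError("label lists must have the same length")
    if not expert_labels:
        raise ValueError("need at least one held-out label")
    matches = sum(1 for m, e in zip(map_labels, expert_labels) if m == e)
    return matches / len(expert_labels)


# Hypothetical held-out labels for five locations.
expert = ["rice", "cassava", "maize", "sugarcane", "rice"]
mapped = ["rice", "cassava", "maize", "rice", "rice"]

agreement = label_agreement(mapped, expert)  # 4 of the 5 labels match
print(f"{agreement:.0%}")
```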

    “In the U.S., we’re also looking at over 90 percent accuracy, whereas with previous work in India, we’ve only seen 75 percent because ground labels are limited,” Wang says. “Now we can create these labels in a cheap and automated way.”

    The researchers are moving to map crops across India, where roadside images via Google Street View and other services have recently become available.

    “There are over 150 million smallholder farmers in India,” Wang says. “India is covered in agriculture, almost wall-to-wall farms, but very small farms, and historically it’s been very difficult to create maps of India because there are very sparse ground labels.”

    The team is working to generate crop maps in India, which could be used to inform policies having to do with assessing and bolstering yields, as global temperatures and populations rise.

    “What would be interesting would be to create these maps over time,” Wang says. “Then you could start to see trends, and we can try to relate those things to anything like changes in climate and policies.”

  • in

    Study: Global deforestation leads to more mercury pollution

    About 10 percent of human-made mercury emissions into the atmosphere each year are the result of global deforestation, according to a new MIT study.

    The world’s vegetation, from the Amazon rainforest to the savannahs of sub-Saharan Africa, acts as a sink that removes the toxic pollutant from the air. However, if the current rate of deforestation remains unchanged or accelerates, the researchers estimate that net mercury emissions will keep increasing.

    “We’ve been overlooking a significant source of mercury, especially in tropical regions,” says Ari Feinberg, a former postdoc in the Institute for Data, Systems, and Society (IDSS) and lead author of the study.

    The researchers’ model shows that the Amazon rainforest plays a particularly important role as a mercury sink, contributing about 30 percent of the global land sink. Curbing Amazon deforestation could thus have a substantial impact on reducing mercury pollution.

    The team also estimates that global reforestation efforts could increase annual mercury uptake by about 5 percent. While this is significant, the researchers emphasize that reforestation alone should not be a substitute for worldwide pollution control efforts.

    “Countries have put a lot of effort into reducing mercury emissions, especially northern industrialized countries, and for very good reason. But 10 percent of the global anthropogenic source is substantial, and there is a potential for that to be even greater in the future. [Addressing these deforestation-related emissions] needs to be part of the solution,” says senior author Noelle Selin, a professor in IDSS and MIT’s Department of Earth, Atmospheric and Planetary Sciences.

    Feinberg and Selin are joined on the paper by co-authors Martin Jiskra, a former Swiss National Science Foundation Ambizione Fellow at the University of Basel; Pasquale Borrelli, a professor at Roma Tre University in Italy; and Jagannath Biswakarma, a postdoc at the Swiss Federal Institute of Aquatic Science and Technology. The paper appears today in Environmental Science and Technology.

    Modeling mercury

    Over the past few decades, scientists have generally focused on studying deforestation as a source of global carbon dioxide emissions. Mercury, a trace element, hasn’t received the same attention, partly because the terrestrial biosphere’s role in the global mercury cycle has only recently been better quantified.

    Plant leaves take up mercury from the atmosphere, much as they take up carbon dioxide. But unlike carbon dioxide, mercury doesn’t serve an essential biological function for plants. Mercury largely stays within a leaf until it falls to the forest floor, where the mercury is absorbed by the soil.

    Mercury becomes a serious concern for humans if it ends up in water bodies, where it can become methylated by microorganisms. Methylmercury, a potent neurotoxin, can be taken up by fish and bioaccumulated through the food chain. This can lead to risky levels of methylmercury in the fish humans eat.

    “In soils, mercury is much more tightly bound than it would be if it were deposited in the ocean. The forests are doing a sort of ecosystem service, in that they are sequestering mercury for longer timescales,” says Feinberg, who is now a postdoc in the Blas Cabrera Institute of Physical Chemistry in Spain.

    In this way, forests reduce the amount of toxic methylmercury in oceans.

    Many studies of mercury focus on industrial sources, like burning fossil fuels, small-scale gold mining, and metal smelting. A global treaty, the 2013 Minamata Convention, calls on nations to reduce human-made emissions. However, it doesn’t directly consider impacts of deforestation.

    The researchers launched their study to fill in that missing piece.

    In past work, they had built a model to probe the role vegetation plays in mercury uptake. Using a series of land use change scenarios, they adjusted the model to quantify the role of deforestation.

    Evaluating emissions

    This chemical transport model tracks mercury from its emissions sources to where it is chemically transformed in the atmosphere and then ultimately to where it is deposited, mainly through rainfall or uptake into forest ecosystems.

    They divided the Earth into eight regions and performed simulations to calculate deforestation emissions factors for each, considering elements like type and density of vegetation, mercury content in soils, and historical land use.

    However, good data for some regions were hard to come by.

    They lacked measurements from tropical Africa or Southeast Asia — two areas that experience heavy deforestation. To get around this gap, they used simpler, offline models to simulate hundreds of scenarios, which helped them improve their estimations of potential uncertainties.

    They also developed a new formulation for mercury emissions from soil. This formulation captures the fact that deforestation reduces leaf area, which increases the amount of sunlight that hits the ground and accelerates the outgassing of mercury from soils.

    The model divides the world into grid squares, each of which is a few hundred square kilometers. By changing land surface and vegetation parameters in certain squares to represent deforestation and reforestation scenarios, the researchers can capture impacts on the mercury cycle.

    Overall, they found that about 200 tons of mercury are emitted to the atmosphere each year as a result of deforestation, or about 10 percent of total human-made emissions. But in tropical and sub-tropical countries, deforestation emissions represent a higher percentage of total emissions. For example, in Brazil deforestation emissions are 40 percent of total human-made emissions.

    In addition, people often light fires to prepare tropical forested areas for agricultural activities, which causes more emissions by releasing mercury stored by vegetation.

    “If deforestation was a country, it would be the second highest emitting country, after China, which emits around 500 tons of mercury a year,” Feinberg adds.
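
    The rounded figures quoted here can be checked against each other with simple arithmetic. The sketch below uses only the approximate numbers given in the article (about 200 tons, about 10 percent, around 500 tons for China), so the results are rough; the variable and function names are illustrative.

```python
# Approximate figures as quoted in the article.
DEFORESTATION_TONS = 200.0   # mercury emitted per year due to deforestation
DEFORESTATION_SHARE = 0.10   # share of total human-made emissions
CHINA_TONS = 500.0           # China's approximate annual emissions


def implied_total(part, share):
    """Total implied by a part and the share of the whole it represents."""
    if not 0.0 < share <= 1.0:
        raise ValueError("share must be in (0, 1]")
    return part / share


total_tons = implied_total(DEFORESTATION_TONS, DEFORESTATION_SHARE)
ratio_to_china = DEFORESTATION_TONS / CHINA_TONS

print(f"implied global human-made emissions: about {total_tons:.0f} tons per year")
print(f"deforestation relative to China: {ratio_to_china:.0%}")
```

    On these rounded inputs, the implied global total is roughly 2,000 tons a year, and deforestation alone amounts to about two-fifths of China’s emissions.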

    And since the Minamata Convention is now addressing primary mercury emissions, scientists can expect deforestation to become a larger fraction of human-made emissions in the future.

    “Policies to protect forests or cut them down have unintended effects beyond their target. It is important to consider the fact that these are systems, and they involve human activities, and we need to understand them better in order to actually solve the problems that we know are out there,” Selin says.

    By providing this first estimate, the team hopes to inspire more research in this area.

    In the future, they want to incorporate more dynamic Earth system models into their analysis, which would enable them to interactively track mercury uptake and better model the timescale of vegetation regrowth.

    “This paper represents an important advance in our understanding of global mercury cycling by quantifying a pathway that has long been suggested but not yet quantified. Much of our research to date has focused on primary anthropogenic emissions, those directly resulting from human activity via coal combustion or mercury-gold amalgam burning in artisanal and small-scale gold mining,” says Jackie Gerson, an assistant professor in the Department of Earth and Environmental Sciences at Michigan State University, who was not involved with this research. “This research shows that deforestation can also result in substantial mercury emissions and needs to be considered both in terms of global mercury models and land management policies. It therefore has the potential to advance our field scientifically as well as to promote policies that reduce mercury emissions via deforestation.”

    This work was funded, in part, by the U.S. National Science Foundation, the Swiss National Science Foundation, and the Swiss Federal Institute of Aquatic Science and Technology.

  • in

    Six MIT students selected as spring 2024 MIT-Pillar AI Collective Fellows

    The MIT-Pillar AI Collective has announced six fellows for the spring 2024 semester. With support from the program, the graduate students, who are in their final year of a master’s or PhD program, will conduct research in the areas of AI, machine learning, and data science with the aim of commercializing their innovations.

    Launched by MIT’s School of Engineering and Pillar VC in 2022, the MIT-Pillar AI Collective supports faculty, postdocs, and students conducting research on AI, machine learning, and data science. Supported by a gift from Pillar VC and administered by the MIT Deshpande Center for Technological Innovation, the mission of the program is to advance research toward commercialization.

    The spring 2024 MIT-Pillar AI Collective Fellows are:

    Yasmeen AlFaraj

    Yasmeen AlFaraj is a PhD candidate in chemistry whose interest is in the application of data science and machine learning to soft materials design to enable next-generation, sustainable plastics, rubber, and composite materials. More specifically, she is applying machine learning to the design of novel molecular additives to enable the low-cost manufacturing of chemically deconstructable thermosets and composites. AlFaraj’s work has led to the discovery of scalable, translatable new materials that could address thermoset plastic waste. As a Pillar Fellow, she will pursue bringing this technology to market, initially focusing on wind turbine blade manufacturing and conformal coatings. Through the Deshpande Center for Technological Innovation, AlFaraj serves as a lead for a team developing a spinout focused on recyclable versions of existing high-performance thermosets by incorporating small quantities of a degradable co-monomer. In addition, she participated in the National Science Foundation Innovation Corps program and recently graduated from the Clean Tech Open, where she focused on enhancing her business plan, analyzing potential markets, ensuring a complete IP portfolio, and connecting with potential funders. AlFaraj earned a BS in chemistry from the University of California at Berkeley.

    Ruben Castro Ornelas

    Ruben Castro Ornelas is a PhD student in mechanical engineering who is passionate about the future of multipurpose robots and designing the hardware to use them with AI control solutions. Combining his expertise in programming, embedded systems, machine design, reinforcement learning, and AI, he designed a dexterous robotic hand capable of carrying out useful everyday tasks without sacrificing size, durability, complexity, or simulatability. Ornelas’s innovative design holds significant commercial potential in domestic, industrial, and health-care applications because it could be adapted to hold everything from kitchenware to delicate objects. As a Pillar Fellow, he will focus on identifying potential commercial markets, determining the optimal approach for business-to-business sales, and identifying critical advisors. Ornelas served as co-director of StartLabs, an undergraduate entrepreneurship club at MIT, where he earned a BS in mechanical engineering.

    Keeley Erhardt

    Keeley Erhardt is a PhD candidate in media arts and sciences whose research interests lie in the transformative potential of AI in network analysis, particularly for entity correlation and hidden link detection within and across domains. She has designed machine learning algorithms to identify and track temporal correlations and hidden signals in large-scale networks, uncovering online influence campaigns originating from multiple countries. She has similarly demonstrated the use of graph neural networks to identify coordinated cryptocurrency accounts by analyzing financial time series data and transaction dynamics. As a Pillar Fellow, Erhardt will pursue the potential commercial applications of her work, such as detecting fraud, propaganda, money laundering, and other covert activity in the finance, energy, and national security sectors. She has had internships at Google, Facebook, and Apple and held software engineering roles at multiple tech unicorns. Erhardt earned an MEng in electrical engineering and computer science and a BS in computer science, both from MIT.

    Vineet Jagadeesan Nair

    Vineet Jagadeesan Nair is a PhD candidate in mechanical engineering whose research focuses on modeling power grids and designing electricity markets to integrate renewables, batteries, and electric vehicles. He is broadly interested in developing computational tools to tackle climate change. As a Pillar Fellow, Nair will explore the application of machine learning and data science to power systems. Specifically, he will experiment with approaches to improve the accuracy of forecasting electricity demand and supply with high spatial-temporal resolution. In collaboration with Project Tapestry @ Google X, he is also working on fusing physics-informed machine learning with conventional numerical methods to increase the speed and accuracy of high-fidelity simulations. Nair’s work could help realize future grids with high penetrations of renewables and other clean, distributed energy resources. Outside academics, Nair is active in entrepreneurship, most recently helping to organize the 2023 MIT Global Startup Workshop in Greece. He earned an MS in computational science and engineering from MIT, an MPhil in energy technologies from Cambridge University as a Gates Scholar, and a BS in mechanical engineering and a BA in economics from University of California at Berkeley.

    Mahdi Ramadan

    Mahdi Ramadan is a PhD candidate in brain and cognitive sciences whose research interests lie at the intersection of cognitive science, computational modeling, and neural technologies. His work uses novel unsupervised methods for learning and generating interpretable representations of neural dynamics, capitalizing on recent advances in AI, specifically contrastive and geometric deep learning techniques capable of uncovering the latent dynamics underlying neural processes with high fidelity. As a Pillar Fellow, he will leverage these methods to gain a better understanding of dynamical models of muscle signals for generative motor control. By supplementing current spinal prosthetics with generative AI motor models that can streamline, speed up, and correct limb muscle activations in real time, as well as potentially using multimodal vision-language models to infer the patients’ high-level intentions, Ramadan aspires to build truly scalable, accessible, and capable commercial neuroprosthetics. Ramadan’s entrepreneurial experience includes being the co-founder of UltraNeuro, a neurotechnology startup, and co-founder of Presizely, a computer vision startup. He earned a BS in neurobiology from University of Washington.

    Rui (Raymond) Zhou

    Rui (Raymond) Zhou is a PhD candidate in mechanical engineering whose research focuses on multimodal AI for engineering design. As a Pillar Fellow, he will advance models that could enable designers to translate information in any modality or combination of modalities into comprehensive 2D and 3D designs, including parametric data, component visuals, assembly graphs, and sketches. These models could also optimize existing human designs to accomplish goals such as improving ergonomics or reducing drag coefficient. Ultimately, Zhou aims to translate his work into a software-as-a-service platform that redefines product design across various sectors, from automotive to consumer electronics. His efforts have the potential to not only accelerate the design process but also reduce costs, opening the door to unprecedented levels of customization, idea generation, and rapid prototyping. Beyond his academic pursuits, Zhou founded UrsaTech, a startup that integrates AI into education and engineering design. He earned a BS in electrical engineering and computer sciences from University of California at Berkeley.

  • in

    How symmetry can come to the aid of machine learning

    Behrooz Tahmasebi — an MIT PhD student in the Department of Electrical Engineering and Computer Science (EECS) and an affiliate of the Computer Science and Artificial Intelligence Laboratory (CSAIL) — was taking a mathematics course on differential equations in late 2021 when a glimmer of inspiration struck. In that class, he learned for the first time about Weyl’s law, which had been formulated 110 years earlier by the German mathematician Hermann Weyl. Tahmasebi realized it might have some relevance to the computer science problem he was then wrestling with, even though the connection appeared — on the surface — to be thin, at best. Weyl’s law, he says, provides a formula that measures the complexity of the spectral information, or data, contained within the fundamental frequencies of a drum head or guitar string.

    Tahmasebi was, at the same time, thinking about measuring the complexity of the input data to a neural network, wondering whether that complexity could be reduced by taking into account some of the symmetries inherent to the dataset. Such a reduction, in turn, could facilitate — as well as speed up — machine learning processes.

    Weyl’s law, conceived about a century before the boom in machine learning, had traditionally been applied to very different physical situations — such as those concerning the vibrations of a string or the spectrum of electromagnetic (black-body) radiation given off by a heated object. Nevertheless, Tahmasebi believed that a customized version of that law might help with the machine learning problem he was pursuing. And if the approach panned out, the payoff could be considerable.

    He spoke with his advisor, Stefanie Jegelka — an associate professor in EECS and affiliate of CSAIL and the MIT Institute for Data, Systems, and Society — who believed the idea was definitely worth looking into. As Tahmasebi saw it, Weyl’s law had to do with gauging the complexity of data, and so did this project. But Weyl’s law, in its original form, said nothing about symmetry.

    He and Jegelka have now succeeded in modifying Weyl’s law so that symmetry can be factored into the assessment of a dataset’s complexity. “To the best of my knowledge,” Tahmasebi says, “this is the first time Weyl’s law has been used to determine how machine learning can be enhanced by symmetry.”

    The paper he and Jegelka wrote earned a “Spotlight” designation when it was presented at the December 2023 conference on Neural Information Processing Systems — widely regarded as the world’s top conference on machine learning.

    This work, comments Soledad Villar, an applied mathematician at Johns Hopkins University, “shows that models that satisfy the symmetries of the problem are not only correct but also can produce predictions with smaller errors, using a small amount of training points. [This] is especially important in scientific domains, like computational chemistry, where training data can be scarce.”

    In their paper, Tahmasebi and Jegelka explored the ways in which symmetries, or so-called “invariances,” could benefit machine learning. Suppose, for example, the goal of a particular computer run is to pick out every image that contains the numeral 3. That task can be a lot easier, and go a lot quicker, if the algorithm can identify the 3 regardless of where it is placed in the box — whether it’s exactly in the center or off to the side — and whether it is pointed right-side up, upside down, or oriented at a random angle. An algorithm equipped with the latter capability can take advantage of the symmetries of translation and rotations, meaning that a 3, or any other object, is not changed in itself by altering its position or by rotating it around an arbitrary axis. It is said to be invariant to those shifts. The same logic can be applied to algorithms charged with identifying dogs or cats. A dog is a dog is a dog, one might say, irrespective of how it is embedded within an image. 
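
    The idea of invariance can be illustrated with a toy sketch. The Python below is only an illustration under simplifying assumptions, not the method from the paper: it handles quarter-turn rotations of a small pixel grid rather than arbitrary angles, and the names `rotate90`, `canonical_rotation`, and `three_like` are hypothetical. It maps every rotated copy of a grid to one shared representative, so a downstream comparison no longer depends on orientation.

```python
def rotate90(grid):
    """Rotate a square grid of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def canonical_rotation(grid):
    """Pick one representative from the four quarter-turn copies of a grid.

    Taking the minimum over all rotated copies gives the same answer
    no matter which copy we start from, so the result is rotation-invariant
    (for 90-degree turns only; this is an assumption of the sketch).
    """
    copies = [grid]
    for _ in range(3):
        copies.append(rotate90(copies[-1]))
    return min(copies)

# A tiny made-up pattern standing in for an image of a numeral.
three_like = [[1, 1, 0],
              [0, 1, 0],
              [1, 1, 0]]
turned = rotate90(three_like)

differs_as_pixels = turned != three_like
same_after_canonicalizing = canonical_rotation(turned) == canonical_rotation(three_like)
print(differs_as_pixels, same_after_canonicalizing)
```

    The two grids differ pixel by pixel, yet they share one canonical form, which is the sense in which an algorithm can treat them as the same object.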

    The point of the entire exercise, the authors explain, is to exploit a dataset’s intrinsic symmetries in order to reduce the complexity of machine learning tasks. That, in turn, can lead to a reduction in the amount of data needed for learning. Concretely, the new work answers the question: How many fewer data are needed to train a machine learning model if the data contain symmetries?

    There are two ways of achieving a gain, or benefit, by capitalizing on the symmetries present. The first has to do with the size of the sample to be looked at. Let’s imagine that you are charged, for instance, with analyzing an image that has mirror symmetry — the right side being an exact replica, or mirror image, of the left. In that case, you don’t have to look at every pixel; you can get all the information you need from half of the image — a factor of two improvement. If, on the other hand, the image can be partitioned into 10 identical parts, you can get a factor of 10 improvement. This kind of boosting effect is linear.
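
    The mirror-symmetry case can be made concrete with a short sketch. The Python below is a toy illustration with made-up pixel values (the helper `mirror_complete` is hypothetical): it rebuilds a left-right symmetric image from its left half and compares how many pixels were actually needed.

```python
def mirror_complete(left_half):
    """Rebuild a left-right mirror-symmetric image from its left half."""
    return [row + row[::-1] for row in left_half]

# Made-up pixel values for the left half of a small image.
left = [[1, 2, 3],
        [4, 5, 6]]
image = mirror_complete(left)

# Every row reads the same forwards and backwards,
# so the right half carries no additional information.
is_symmetric = all(row == row[::-1] for row in image)
pixels_total = sum(len(row) for row in image)
pixels_needed = sum(len(row) for row in left)
print(is_symmetric, pixels_total // pixels_needed)
```

    Here the ratio of total pixels to pixels needed is the factor-of-two saving described above; an image built from 10 identical parts would give a ratio of 10 by the same reasoning.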

    To take another example, imagine you are sifting through a dataset, trying to find sequences of blocks that have seven different colors — black, blue, green, purple, red, white, and yellow. Your job becomes much easier if you don’t care about the order in which the blocks are arranged. If the order mattered, there would be 5,040 different orderings (7 factorial) to look for. But if all you care about are sequences of blocks in which all seven colors appear, then you have reduced the number of things — or sequences — you are searching for from 5,040 to just one.
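
    The count in this example can be checked directly. The following is a minimal sketch in Python (the color list comes from the text; the helper name `canonical` is hypothetical) that enumerates every arrangement of the seven colors and then collapses them by ignoring order.

```python
from itertools import permutations
from math import factorial

COLORS = ["black", "blue", "green", "purple", "red", "white", "yellow"]

def canonical(sequence):
    """Hypothetical helper: map a sequence to an order-independent key."""
    return tuple(sorted(sequence))

# Every ordered arrangement of the seven distinct colors.
orderings = list(permutations(COLORS))

# Group the arrangements by their order-independent key.
distinct_keys = {canonical(seq) for seq in orderings}

print(len(orderings))       # 5040, i.e. 7 factorial
print(len(distinct_keys))   # 1 once order is ignored
```

    Sorting is just one convenient way to choose a representative; any rule that assigns the same key to every reordering of a sequence would collapse the 5,040 cases to one in the same way.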

    Tahmasebi and Jegelka discovered that it is possible to achieve a different kind of gain — one that is exponential — that can be reaped for symmetries that operate over many dimensions. This advantage is related to the notion that the complexity of a learning task grows exponentially with the dimensionality of the data space. Making use of a multidimensional symmetry can therefore yield a disproportionately large return. “This is a new contribution that is basically telling us that symmetries of higher dimension are more important because they can give us an exponential gain,” Tahmasebi says. 

    The NeurIPS 2023 paper that he wrote with Jegelka contains two theorems that were proved mathematically. “The first theorem shows that an improvement in sample complexity is achievable with the general algorithm we provide,” Tahmasebi says. The second theorem complements the first, he added, “showing that this is the best possible gain you can get; nothing else is achievable.”

    He and Jegelka have provided a formula that predicts the gain one can obtain from a particular symmetry in a given application. A virtue of this formula is its generality, Tahmasebi notes. “It works for any symmetry and any input space.” It works not only for symmetries that are known today, but it could also be applied in the future to symmetries that are yet to be discovered. The latter prospect is not too farfetched to consider, given that the search for new symmetries has long been a major thrust in physics. That suggests that, as more symmetries are found, the methodology introduced by Tahmasebi and Jegelka should only get better over time.

    According to Haggai Maron, a computer scientist at Technion (the Israel Institute of Technology) and NVIDIA who was not involved in the work, the approach presented in the paper “diverges substantially from related previous works, adopting a geometric perspective and employing tools from differential geometry. This theoretical contribution lends mathematical support to the emerging subfield of ‘Geometric Deep Learning,’ which has applications in graph learning, 3D data, and more. The paper helps establish a theoretical basis to guide further developments in this rapidly expanding research area.”

  • in

    Creating new skills and new connections with MIT’s Quantitative Methods Workshop

    Starting on New Year’s Day, when many people were still clinging to holiday revelry, scores of students and faculty members from about a dozen partner universities instead flipped open their laptops for MIT’s Quantitative Methods Workshop, a jam-packed, weeklong introduction to how computational and mathematical techniques can be applied to neuroscience and biology research. But don’t think of QMW as a “crash course.” Instead the program’s purpose is to help elevate each participant’s scientific outlook, both through the skills and concepts it imparts and the community it creates.

    “It broadens their horizons, it shows them significant applications they’ve never thought of, and introduces them to people whom as researchers they will come to know and perhaps collaborate with one day,” says Susan L. Epstein, a Hunter College computer science professor and education coordinator of MIT’s Center for Brains, Minds, and Machines, which hosts the program with the departments of Biology and Brain and Cognitive Sciences and The Picower Institute for Learning and Memory. “It is a model of interdisciplinary scholarship.”

    This year 83 undergraduates and faculty members from institutions that primarily serve groups underrepresented in STEM fields took part in the QMW, says organizer Mandana Sassanfar, senior lecturer and director of diversity and science outreach across the four hosting MIT entities. Since the workshop launched in 2010, it has engaged more than 1,000 participants, of whom more than 170 have gone on to participate in MIT Summer Research Programs (such as MSRP-BIO), and 39 have come to MIT for graduate school.

    Individual goals, shared experience

    Undergraduates and faculty in various STEM disciplines often come to QMW to gain an understanding of, or expand their expertise in, computational and mathematical data analysis. Computer science- and statistics-minded participants come to learn more about how such techniques can be applied in life sciences fields. In lectures; in hands-on labs where they used the computer programming language Python to process, analyze, and visualize data; and in less formal settings such as tours and lunches with MIT faculty, participants worked and learned together, and informed each other’s perspectives.

    Brain and Cognitive Sciences Professor Nancy Kanwisher delivers a lecture in MIT’s Building 46 on functional brain imaging to QMW participants.

    Photo: Mandana Sassanfar

    And regardless of their field of study, participants made connections with each other and with the MIT students and faculty who taught and spoke over the course of the week.

    Hunter College computer science sophomore Vlad Vostrikov says that while he has already worked with machine learning and other programming concepts, he was interested to “branch out” by seeing how they are used to analyze scientific datasets. He also valued the chance to learn the experiences of the graduate students who teach QMW’s hands-on labs.

    “This was a good way to explore computational biology and neuroscience,” Vostrikov says. “I also really enjoy hearing from the people who teach us. It’s interesting to hear where they come from and what they are doing.”

    Jariatu Kargbo, a biology and chemistry sophomore at University of Maryland Baltimore County, says when she first learned of the QMW she wasn’t sure it was for her. It seemed very computation-focused. But her advisor Holly Willoughby encouraged Kargbo to attend to learn about how programming could be useful in future research — currently she is taking part in research on the retina at UMBC. More than that, Kargbo also realized it would be a good opportunity to make connections at MIT in advance of perhaps applying for MSRP this summer.

    “I thought this would be a great way to meet up with faculty and see what the environment is like here because I’ve never been to MIT before,” Kargbo says. “It’s always good to meet other people in your field and grow your network.”

    QMW is not just for students. It’s also for their professors, who said they can gain valuable professional education for their research and teaching.

    Fayuan Wen, an assistant professor of biology at Howard University, is no stranger to computational biology, having performed big data genetic analyses of sickle cell disease (SCD). But she’s mostly worked with the R programming language and QMW’s focus is on Python. As she looks ahead to projects in which she wants to analyze genomic data to help predict disease outcomes in SCD and HIV, she says a QMW session delivered by biology graduate student Hannah Jacobs was perfectly on point.

    “This workshop has the skills I want to have,” Wen says.

    Moreover, Wen says she is looking to start a machine-learning class in the Howard biology department and was inspired by some of the teaching materials she encountered at QMW — for example, online curriculum modules developed by Taylor Baum, an MIT graduate student in electrical engineering and computer science and Picower Institute labs, and Paloma Sánchez-Jáuregui, a coordinator who works with Sassanfar.

    Tiziana Ligorio, a Hunter College computer science doctoral lecturer who together with Epstein teaches a deep machine-learning class at the City University of New York campus, felt similarly. Rather than require a bunch of prerequisites that might drive students away from the class, Ligorio was looking to QMW’s intense but introductory curriculum as a resource for designing a more inclusive way of getting students ready for the class.

    Instructive interactions

    Each day runs from 9 a.m. to 5 p.m., including morning and afternoon lectures and hands-on sessions. Class topics ranged from statistical data analysis and machine learning to brain-computer interfaces, brain imaging, signal processing of neural activity data, and cryogenic electron microscopy.

    “This workshop could not happen without dedicated instructors — grad students, postdocs, and faculty — who volunteer to give lectures, design and teach hands-on computer labs, and meet with students during the very first week of January,” Sassanfar says.

    MIT assistant professor of biology Brady Weissbourd (center) converses with QMW student participants during a lunch break.

    Photo: Mandana Sassanfar

    The sessions surround student lunches with MIT faculty members. For example, at midday Jan. 2, assistant professor of biology Brady Weissbourd, an investigator in the Picower Institute, sat down with seven students on one of Building 46’s curved sofas to field questions about his neuroscience research in jellyfish and how he uses quantitative techniques as part of that work. He also described what it’s like to be a professor, and other topics that came to the students’ minds.

    Then the participants all crossed Vassar Street to Building 26’s Room 152, where they formed different but similarly sized groups for the hands-on lab “Machine learning applications to studying the brain,” taught by Baum. She guided the class through Python exercises she developed illustrating “supervised” and “unsupervised” forms of machine learning, including how the latter method can be used to discern what a person is seeing based on magnetic readings of brain activity.

    As students worked through the exercises, tablemates helped each other by supplementing Baum’s instruction. Ligorio, Vostrikov, and Kayla Blincow, assistant professor of biology at the University of the Virgin Islands, for instance, all leapt to their feet to help at their tables.

    Hunter College lecturer of computer science Tiziana Ligorio (standing) explains a Python programming concept to students at her table during a workshop session.

    Photo: David Orenstein

    At the end of the class, when Baum asked students what they had learned, they offered a litany of new knowledge. Survey data that Sassanfar and Sánchez-Jáuregui use to anonymously track QMW outcomes revealed many more such attestations of the value of the sessions. With a prompt asking how one might apply what they’ve learned, one respondent wrote: “Pursue a research career or endeavor in which I apply the concepts of computer science and neuroscience together.”

    Enduring connections

    While some new QMW attendees might only be able to speculate about how they’ll apply their new skills and relationships, Luis Miguel de Jesús Astacio could testify to how attending QMW as an undergraduate back in 2014 figured into a career where he is now a faculty member in physics at the University of Puerto Rico Rio Piedras Campus. After QMW, he returned to MIT that summer as a student in the lab of neuroscientist and Picower Professor Susumu Tonegawa. He came back again in 2016 to the lab of physicist and Francis Friedman Professor Mehran Kardar. What’s endured for the decade has been his connection to Sassanfar. So while he was once a student at QMW, this year he was back with a cohort of undergraduates as a faculty member.

    Michael Aldarondo-Jeffries, director of academic advancement programs at the University of Central Florida, seconded the value of the networking that takes place at QMW. He has brought students for a decade, including four this year. What he’s observed is that as students come together in settings like QMW or UCF’s McNair program, which helps to prepare students for graduate school, they become inspired about a potential future as researchers.

    “The thing that stands out is just the community that’s formed,” he says. “For many of the students, it’s the first time that they’re in a group that understands what they’re moving toward. They don’t have to explain why they’re excited to read papers on a Friday night.”

    Or why they are excited to spend a week including New Year’s Day at MIT learning how to apply quantitative methods to life sciences data.

  • in

    Generating the policy of tomorrow

    As first-year students in the Social and Engineering Systems (SES) doctoral program within the MIT Institute for Data, Systems, and Society (IDSS), Eric Liu and Ashely Peake share an interest in investigating housing inequality issues.

    They also share a desire to dive head-first into their research.

    “In the first year of your PhD, you’re taking classes and still getting adjusted, but we came in very eager to start doing research,” Liu says.

    Liu, Peake, and many others found an opportunity to do hands-on research on real-world problems at the MIT Policy Hackathon, an initiative organized by students in IDSS, including the Technology and Policy Program (TPP). The weekend-long, interdisciplinary event — now in its sixth year — continues to gather hundreds of participants from around the globe to explore potential solutions to some of society’s greatest challenges.

    This year’s theme, “Hack-GPT: Generating the Policy of Tomorrow,” sought to capitalize on the popularity of generative AI (like the chatbot ChatGPT) and the ways it is changing how we think about technical and policy-based challenges, according to Dansil Green, a second-year TPP master’s student and co-chair of the event.

    “We encouraged our teams to utilize and cite these tools, thinking about the implications that generative AI tools have on their different challenge categories,” Green says.

    After 2022’s hybrid event, this year’s organizers pivoted back to a virtual-only approach, allowing them to increase the overall number of participants in addition to increasing the number of teams per challenge by 20 percent.

    “Virtual allows you to reach more people — we had a high number of international participants this year — and it helps reduce some of the costs,” Green says. “I think going forward we are going to try and switch back and forth between virtual and in-person because there are different benefits to each.”

    “When the magic hits”

    Liu and Peake competed in the housing challenge category, where they could gain research experience in their actual field of study. 

    “While I am doing housing research, I haven’t necessarily had a lot of opportunities to work with actual housing data before,” says Peake, who recently joined the SES doctoral program after completing an undergraduate degree in applied math last year. “It was a really good experience to get involved with an actual data problem, working closer with Eric, who’s also in my lab group, in addition to meeting people from MIT and around the world who are interested in tackling similar questions and seeing how they think about things differently.”

    Joined by Adrian Butterton, a Boston-based paralegal, as well as Hudson Yuen and Ian Chan, two software engineers from Canada, Liu and Peake formed what would end up being the winning team in their category: “Team Ctrl+Alt+Defeat.” They quickly began organizing a plan to address the eviction crisis in the United States.

    “I think we were kind of surprised by the scope of the question,” Peake laughs. “In the end, I think having such a large scope motivated us to think about it in a more realistic kind of way — how could we come up with a solution that was adaptable and therefore could be replicated to tackle different kinds of problems.”

    Watching the challenge on the livestream together on campus, Liu says they immediately went to work, and could not believe how quickly things came together.

    “We got our challenge description in the evening, came out to the purple common area in the IDSS building and literally it took maybe an hour and we drafted up the entire project from start to finish,” Liu says. “Then our software engineer partners had a dashboard built by 1 a.m. — I feel like the hackathon really promotes that really fast dynamic work stream.”

    “People always talk about the grind or applying for funding — but when that magic hits, it just reminds you of the part of research that people don’t talk about, and it was really a great experience to have,” Liu adds.

    A fresh perspective

    “We’ve organized hackathons internally at our company and they are great for fostering innovation and creativity,” says Letizia Bordoli, senior AI product manager at Veridos, a Germany-based identity solutions company that provided this year’s challenge in Data Systems for Human Rights. “It is a great opportunity to connect with talented individuals and explore new ideas and solutions that we might not have thought about.”

    The challenge provided by Veridos was focused on finding innovative solutions to universal birth registration, something Bordoli says only benefited from the fact that the hackathon participants were from all over the world.

    “Many had local and firsthand knowledge about certain realities and challenges [posed by the lack of] birth registration,” Bordoli says. “It brings fresh perspectives to existing challenges, and it gave us an energy boost to try to bring innovative solutions that we may not have considered before.”

    New frontiers

    Alongside the housing and data systems for human rights challenges was a challenge in health, as well as a first-time opportunity to tackle an aerospace challenge in the area of space for environmental justice.

    “Space can be a very hard challenge category to do data-wise since a lot of data is proprietary, so this really developed over the last few months with us having to think about how we could do more with open-source data,” Green explains. “But I am glad we went the environmental route because it opened the challenge up to not only space enthusiasts, but also environment and climate people.”

    One of the participants to tackle this new challenge category was Yassine Elhallaoui, a system test engineer from Norway who specializes in AI solutions and has 16 years of experience working in the oil and gas fields. Elhallaoui was a member of Team EcoEquity, which proposed an increase in policies supporting the use of satellite data to ensure proper evaluation and increase water resiliency for vulnerable communities.

    “The hackathons I have participated in in the past were more technical,” Elhallaoui says. “Starting with [MIT Science and Technology Policy Institute Director Kristen Kulinowski’s] workshop about policy writers and the solutions they came up with, and the analysis they had to do … it really changed my perspective on what a hackathon can do.”

    “A policy hackathon is something that can make real changes in the world,” she adds.