More stories

  • in

    When to trust an AI model

    Because machine-learning models can give false predictions, researchers often equip them with the ability to tell a user how confident they are about a certain decision. This is especially important in high-stake settings, such as when models are used to help identify disease in medical images or filter job applications.But a model’s uncertainty quantifications are only useful if they are accurate. If a model says it is 49 percent confident that a medical image shows a pleural effusion, then 49 percent of the time, the model should be right.MIT researchers have introduced a new approach that can improve uncertainty estimates in machine-learning models. Their method not only generates more accurate uncertainty estimates than other techniques, but does so more efficiently.In addition, because the technique is scalable, it can be applied to huge deep-learning models that are increasingly being deployed in health care and other safety-critical situations.This technique could give end users, many of whom lack machine-learning expertise, better information they can use to determine whether to trust a model’s predictions or if the model should be deployed for a particular task.“It is easy to see these models perform really well in scenarios where they are very good, and then assume they will be just as good in other scenarios. This makes it especially important to push this kind of work that seeks to better calibrate the uncertainty of these models to make sure they align with human notions of uncertainty,” says lead author Nathan Ng, a graduate student at the University of Toronto who is a visiting student at MIT.Ng wrote the paper with Roger Grosse, an assistant professor of computer science at the University of Toronto; and senior author Marzyeh Ghassemi, an associate professor in the Department of Electrical Engineering and Computer Science and a member of the Institute of Medical Engineering Sciences and the Laboratory for Information and Decision Systems. The research will be presented at the International Conference on Machine Learning.Quantifying uncertaintyUncertainty quantification methods often require complex statistical calculations that don’t scale well to machine-learning models with millions of parameters. These methods also require users to make assumptions about the model and data used to train it.The MIT researchers took a different approach. They use what is known as the minimum description length principle (MDL), which does not require the assumptions that can hamper the accuracy of other methods. MDL is used to better quantify and calibrate uncertainty for test points the model has been asked to label.The technique the researchers developed, known as IF-COMP, makes MDL fast enough to use with the kinds of large deep-learning models deployed in many real-world settings.MDL involves considering all possible labels a model could give a test point. If there are many alternative labels for this point that fit well, its confidence in the label it chose should decrease accordingly.“One way to understand how confident a model is would be to tell it some counterfactual information and see how likely it is to believe you,” Ng says.For example, consider a model that says a medical image shows a pleural effusion. If the researchers tell the model this image shows an edema, and it is willing to update its belief, then the model should be less confident in its original decision.With MDL, if a model is confident when it labels a datapoint, it should use a very short code to describe that point. If it is uncertain about its decision because the point could have many other labels, it uses a longer code to capture these possibilities.The amount of code used to label a datapoint is known as stochastic data complexity. If the researchers ask the model how willing it is to update its belief about a datapoint given contrary evidence, the stochastic data complexity should decrease if the model is confident.But testing each datapoint using MDL would require an enormous amount of computation.Speeding up the processWith IF-COMP, the researchers developed an approximation technique that can accurately estimate stochastic data complexity using a special function, known as an influence function. They also employed a statistical technique called temperature-scaling, which improves the calibration of the model’s outputs. This combination of influence functions and temperature-scaling enables high-quality approximations of the stochastic data complexity.In the end, IF-COMP can efficiently produce well-calibrated uncertainty quantifications that reflect a model’s true confidence. The technique can also determine whether the model has mislabeled certain data points or reveal which data points are outliers.The researchers tested their system on these three tasks and found that it was faster and more accurate than other methods.“It is really important to have some certainty that a model is well-calibrated, and there is a growing need to detect when a specific prediction doesn’t look quite right. Auditing tools are becoming more necessary in machine-learning problems as we use large amounts of unexamined data to make models that will be applied to human-facing problems,” Ghassemi says.IF-COMP is model-agnostic, so it can provide accurate uncertainty quantifications for many types of machine-learning models. This could enable it to be deployed in a wider range of real-world settings, ultimately helping more practitioners make better decisions.“People need to understand that these systems are very fallible and can make things up as they go. A model may look like it is highly confident, but there are a ton of different things it is willing to believe given evidence to the contrary,” Ng says.In the future, the researchers are interested in applying their approach to large language models and studying other potential use cases for the minimum description length principle.  More

  • in

    MIT ARCLab announces winners of inaugural Prize for AI Innovation in Space

    Satellite density in Earth’s orbit has increased exponentially in recent years, with lower costs of small satellites allowing governments, researchers, and private companies to launch and operate some 2,877 satellites into orbit in 2023 alone. This includes increased geostationary Earth orbit (GEO) satellite activity, which brings technologies with global-scale impact, from broadband internet to climate surveillance. Along with the manifold benefits of these satellite-enabled technologies, however, come increased safety and security risks, as well as environmental concerns. More accurate and efficient methods of monitoring and modeling satellite behavior are urgently needed to prevent collisions and other disasters.To address this challenge, the MIT Astrodynamics, Space Robotic, and Controls Laboratory (ARCLab) launched the MIT ARCLab Prize for AI Innovation in Space: a first-of-its-kind competition asking contestants to harness AI to characterize satellites’ patterns of life (PoLs) — the long-term behavioral narrative of a satellite in orbit — using purely passively collected information. Following the call for participants last fall, 126 teams used machine learning to create algorithms to label and time-stamp the behavioral modes of GEO satellites over a six-month period, competing for accuracy and efficiency.With support from the U.S. Department of the Air Force-MIT AI Accelerator, the challenge offers a total of $25,000. A team of judges from ARCLab and MIT Lincoln Laboratory evaluated the submissions based on clarity, novelty, technical depth, and reproducibility, assigning each entry a score out of 100 points. Now the judges have announced the winners and runners-up:First prize: David Baldsiefen — Team Hawaii2024With a winning score of 96, Baldsiefen will be awarded $10,000 and is invited to join the ARCLab team in presenting at a poster session at the Advanced Maui Optical and Space Surveillance Technologies (AMOS) Conference in Hawaii this fall. One evaluator noted, “Clear and concise report, with very good ideas such as the label encoding of the localizer. Decisions on the architectures and the feature engineering are well reasoned. The code provided is also well documented and structured, allowing an easy reproducibility of the experimentation.”Second prize: Binh Tran, Christopher Yeung, Kurtis Johnson, Nathan Metzger — Team Millennial-IUPWith a score of 94.2, Y, Millennial-IUP will be awarded $5,000 and will also join the ARCLab team at the AMOS conference. One evaluator said, “The models chosen were sensible and justified, they made impressive efforts in efficiency gains… They used physics to inform their models and this appeared to be reproducible. Overall it was an easy to follow, concise report without much jargon.”Third Prize: Isaac Haik and Francois Porcher — Team QR_IsWith a score of 94, Haik and Porcher will share the third prize of $3,000 and will also be invited to the AMOS conference with the ARCLab team. One evaluator noted, “This informative and interesting report describes the combination of ML and signal processing techniques in a compelling way, assisted by informative plots, tables, and sequence diagrams. The author identifies and describes a modular approach to class detection and their assessment of feature utility, which they correctly identify is not evenly useful across classes… Any lack of mission expertise is made up for by a clear and detailed discussion of the benefits and pitfalls of the methods they used and discussion of what they learned.”The fourth- through seventh-place scoring teams will each receive $1,000 and a certificate of excellence.“The goal of this competition was to foster an interdisciplinary approach to problem-solving in the space domain by inviting AI development experts to apply their skills in this new context of orbital capacity. And all of our winning teams really delivered — they brought technical skill, novel approaches, and expertise to a very impressive round of submissions.” says Professor Richard Linares, who heads ARCLab.Active modeling with passive dataThroughout a GEO satellite’s time in orbit, operators issue commands to place them in various behavioral modes—station-keeping, longitudinal shifts, end-of-life behaviors, and so on. Satellite Patterns of Life (PoLs) describe on-orbit behavior composed of sequences of both natural and non-natural behavior modes.ARCLab has developed a groundbreaking benchmarking tool for geosynchronous satellite pattern-of-life characterization and created the Satellite Pattern-of-Life Identification Dataset (SPLID), comprising real and synthetic space object data. The challenge participants used this tool to create algorithms that use AI to map out the on-orbit behaviors of a satellite.The goal of the MIT ARCLab Prize for AI Innovation in Space is to encourage technologists and enthusiasts to bring innovation and new skills sets to well-established challenges in aerospace. The team aims to hold the competition in 2025 and 2026 to explore other topics and invite experts in AI to apply their skills to new challenges.  More

  • in

    Community members receive 2024 MIT Excellence Awards, Collier Medal, and Staff Award for Distinction in Service

    On Wednesday, June 5, 13 individuals and four teams were awarded MIT Excellence Awards — the highest awards for staff at the Institute. Colleagues holding signs, waving pompoms, and cheering gathered in Kresge Auditorium to show their support for the honorees. In addition to the Excellence Awards, staff members were honored with the Collier Medal, the Staff Award for Distinction in Service, and the Gordon Y. Billard Award. The Collier Medal honors the memory of Officer Sean Collier, who gave his life protecting and serving MIT; it celebrates an individual or group whose actions demonstrate the importance of community. The Staff Award for Distinction in Service is presented to a staff member whose service results in a positive lasting impact on the Institute.The Gordon Y. Billard Award is given annually to staff, faculty, or an MIT-affiliated individual(s) who has given “special service of outstanding merit performed for the Institute.” This year, for the first time, this award was presented at the MIT Excellence Awards and Collier Medal celebration. The 2024 MIT Excellence Award recipients and their award categories are: Innovative Solutions Nanotechnology Material Core Staff, Koch Institute for Integrative Cancer Research, Office of the Vice President for Research (Margaret Bisher, Giovanni de Nola, David Mankus, and Dong Soo Yun)Bringing Out the Best Salvatore Ieni James Kelsey Lauren PouchakServing Our Community Megan Chester Alessandra Davy-Falconi David Randall Days Weekend Team, Department of Custodial Services, Department of Facilities: Karen Melisa Betancourth, Ana Guerra Chavarria, Yeshi Khando, Joao Pacheco, and Kevin Salazar IMES/HST Academic Office Team, Institute for Medical Engineering and Science, School of Engineering: Traci Anderson, Joseph R. Stein, and Laurie Ward Team Leriche, Department of Custodial Services, Department of Facilities: Anthony Anzalone, David Solomon Carrasco, Larrenton Forrest, Michael Leriche, and Joe VieiraEmbracing Diversity, Equity, and Inclusion Bhaskar Pant Jessica TamOutstanding Contributor Paul W. Barone Marcia G. Davidson Steven Kooi Tianjiao Lei Andrew H. Mack

    2024 MIT Excellence Awards + Collier Medal Ceremony

    The 2024 Collier Medal recipient was Benjamin B. Lewis, a graduate student in the Institute for Data, Systems and Society in the MIT Schwarzman College of Computing. Last spring, he founded the Cambridge branch of End Overdose, a nonprofit dedicated to reducing drug-related overdose deaths. Through his efforts, more than 600 members of the Greater Boston community, including many at MIT, have been trained to administer lifesaving treatment at critical moments.This year’s recipient of the 2024 Staff Award for Distinction in Service was Diego F. Arango (Department of Custodial Services, Department of Facilities), daytime custodian in Building 46. He was nominated by no fewer than 36 staff, faculty, students, and researchers for creating a positive working environment and for offering “help whenever, wherever, and to whomever needs it.”Three community members were honored with a 2024 Gordon Y. Billard AwardDeborah G. Douglas, senior director of collections and curator of science and technology, MIT MuseumRonald Hasseltine, assistant provost for research administration, Office of the Vice President for ResearchRichard K. Lester, vice provost for international activities and Japan Steel Industry Professor of Nuclear Science and Engineering, School of EngineeringPresenters included President Sally Kornbluth; MIT Chief of Police John DiFava and Deputy Chief Steven DeMarco; Vice President for Human Resources Ramona Allen; Executive Vice President and Treasurer Glen Shor; Provost Cynthia Barnhart; Lincoln Laboratory director Eric Evans; Chancellor Melissa Nobles; and Dean of the School of Engineering Anantha Chandrakasan.Visit the MIT Human Resources website for more information about the award recipients, categories, and to view photos and video of the event. More

  • in

    Study finds health risks in switching ships from diesel to ammonia fuel

    As container ships the size of city blocks cross the oceans to deliver cargo, their huge diesel engines emit large quantities of air pollutants that drive climate change and have human health impacts. It has been estimated that maritime shipping accounts for almost 3 percent of global carbon dioxide emissions and the industry’s negative impacts on air quality cause about 100,000 premature deaths each year.Decarbonizing shipping to reduce these detrimental effects is a goal of the International Maritime Organization, a U.N. agency that regulates maritime transport. One potential solution is switching the global fleet from fossil fuels to sustainable fuels such as ammonia, which could be nearly carbon-free when considering its production and use.But in a new study, an interdisciplinary team of researchers from MIT and elsewhere caution that burning ammonia for maritime fuel could worsen air quality further and lead to devastating public health impacts, unless it is adopted alongside strengthened emissions regulations.Ammonia combustion generates nitrous oxide (N2O), a greenhouse gas that is about 300 times more potent than carbon dioxide. It also emits nitrogen in the form of nitrogen oxides (NO and NO2, referred to as NOx), and unburnt ammonia may slip out, which eventually forms fine particulate matter in the atmosphere. These tiny particles can be inhaled deep into the lungs, causing health problems like heart attacks, strokes, and asthma.The new study indicates that, under current legislation, switching the global fleet to ammonia fuel could cause up to about 600,000 additional premature deaths each year. However, with stronger regulations and cleaner engine technology, the switch could lead to about 66,000 fewer premature deaths than currently caused by maritime shipping emissions, with far less impact on global warming.“Not all climate solutions are created equal. There is almost always some price to pay. We have to take a more holistic approach and consider all the costs and benefits of different climate solutions, rather than just their potential to decarbonize,” says Anthony Wong, a postdoc in the MIT Center for Global Change Science and lead author of the study.His co-authors include Noelle Selin, an MIT professor in the Institute for Data, Systems, and Society and the Department of Earth, Atmospheric and Planetary Sciences (EAPS); Sebastian Eastham, a former principal research scientist who is now a senior lecturer at Imperial College London; Christine Mounaïm-Rouselle, a professor at the University of Orléans in France; Yiqi Zhang, a researcher at the Hong Kong University of Science and Technology; and Florian Allroggen, a research scientist in the MIT Department of Aeronautics and Astronautics. The research appears this week in Environmental Research Letters.Greener, cleaner ammoniaTraditionally, ammonia is made by stripping hydrogen from natural gas and then combining it with nitrogen at extremely high temperatures. This process is often associated with a large carbon footprint. The maritime shipping industry is betting on the development of “green ammonia,” which is produced by using renewable energy to make hydrogen via electrolysis and to generate heat.“In theory, if you are burning green ammonia in a ship engine, the carbon emissions are almost zero,” Wong says.But even the greenest ammonia generates nitrous oxide (N2O), nitrogen oxides (NOx) when combusted, and some of the ammonia may slip out, unburnt. This nitrous oxide would escape into the atmosphere, where the greenhouse gas would remain for more than 100 years. At the same time, the nitrogen emitted as NOx and ammonia would fall to Earth, damaging fragile ecosystems. As these emissions are digested by bacteria, additional N2O  is produced.NOx and ammonia also mix with gases in the air to form fine particulate matter. A primary contributor to air pollution, fine particulate matter kills an estimated 4 million people each year.“Saying that ammonia is a ‘clean’ fuel is a bit of an overstretch. Just because it is carbon-free doesn’t necessarily mean it is clean and good for public health,” Wong says.A multifaceted modelThe researchers wanted to paint the whole picture, capturing the environmental and public health impacts of switching the global fleet to ammonia fuel. To do so, they designed scenarios to measure how pollutant impacts change under certain technology and policy assumptions.From a technological point of view, they considered two ship engines. The first burns pure ammonia, which generates higher levels of unburnt ammonia but emits fewer nitrogen oxides. The second engine technology involves mixing ammonia with hydrogen to improve combustion and optimize the performance of a catalytic converter, which controls both nitrogen oxides and unburnt ammonia pollution.They also considered three policy scenarios: current regulations, which only limit NOx emissions in some parts of the world; a scenario that adds ammonia emission limits over North America and Western Europe; and a scenario that adds global limits on ammonia and NOx emissions.The researchers used a ship track model to calculate how pollutant emissions change under each scenario and then fed the results into an air quality model. The air quality model calculates the impact of ship emissions on particulate matter and ozone pollution. Finally, they estimated the effects on global public health.One of the biggest challenges came from a lack of real-world data, since no ammonia-powered ships are yet sailing the seas. Instead, the researchers relied on experimental ammonia combustion data from collaborators to build their model.“We had to come up with some clever ways to make that data useful and informative to both the technology and regulatory situations,” he says.A range of outcomesIn the end, they found that with no new regulations and ship engines that burn pure ammonia, switching the entire fleet would cause 681,000 additional premature deaths each year.“While a scenario with no new regulations is not very realistic, it serves as a good warning of how dangerous ammonia emissions could be. And unlike NOx, ammonia emissions from shipping are currently unregulated,” Wong says.However, even without new regulations, using cleaner engine technology would cut the number of premature deaths down to about 80,000, which is about 20,000 fewer than are currently attributed to maritime shipping emissions. With stronger global regulations and cleaner engine technology, the number of people killed by air pollution from shipping could be reduced by about 66,000.“The results of this study show the importance of developing policies alongside new technologies,” Selin says. “There is a potential for ammonia in shipping to be beneficial for both climate and air quality, but that requires that regulations be designed to address the entire range of potential impacts, including both climate and air quality.”Ammonia’s air quality impacts would not be felt uniformly across the globe, and addressing them fully would require coordinated strategies across very different contexts. Most premature deaths would occur in East Asia, since air quality regulations are less stringent in this region. Higher levels of existing air pollution cause the formation of more particulate matter from ammonia emissions. In addition, shipping volume over East Asia is far greater than elsewhere on Earth, compounding these negative effects.In the future, the researchers want to continue refining their analysis. They hope to use these findings as a starting point to urge the marine industry to share engine data they can use to better evaluate air quality and climate impacts. They also hope to inform policymakers about the importance and urgency of updating shipping emission regulations.This research was funded by the MIT Climate and Sustainability Consortium. More

  • in

    MIT researchers introduce generative AI for databases

    A new tool makes it easier for database users to perform complicated statistical analyses of tabular data without the need to know what is going on behind the scenes.GenSQL, a generative AI system for databases, could help users make predictions, detect anomalies, guess missing values, fix errors, or generate synthetic data with just a few keystrokes.For instance, if the system were used to analyze medical data from a patient who has always had high blood pressure, it could catch a blood pressure reading that is low for that particular patient but would otherwise be in the normal range.GenSQL automatically integrates a tabular dataset and a generative probabilistic AI model, which can account for uncertainty and adjust their decision-making based on new data.Moreover, GenSQL can be used to produce and analyze synthetic data that mimic the real data in a database. This could be especially useful in situations where sensitive data cannot be shared, such as patient health records, or when real data are sparse.This new tool is built on top of SQL, a programming language for database creation and manipulation that was introduced in the late 1970s and is used by millions of developers worldwide.“Historically, SQL taught the business world what a computer could do. They didn’t have to write custom programs, they just had to ask questions of a database in high-level language. We think that, when we move from just querying data to asking questions of models and data, we are going to need an analogous language that teaches people the coherent questions you can ask a computer that has a probabilistic model of the data,” says Vikash Mansinghka ’05, MEng ’09, PhD ’09, senior author of a paper introducing GenSQL and a principal research scientist and leader of the Probabilistic Computing Project in the MIT Department of Brain and Cognitive Sciences.When the researchers compared GenSQL to popular, AI-based approaches for data analysis, they found that it was not only faster but also produced more accurate results. Importantly, the probabilistic models used by GenSQL are explainable, so users can read and edit them.“Looking at the data and trying to find some meaningful patterns by just using some simple statistical rules might miss important interactions. You really want to capture the correlations and the dependencies of the variables, which can be quite complicated, in a model. With GenSQL, we want to enable a large set of users to query their data and their model without having to know all the details,” adds lead author Mathieu Huot, a research scientist in the Department of Brain and Cognitive Sciences and member of the Probabilistic Computing Project.They are joined on the paper by Matin Ghavami and Alexander Lew, MIT graduate students; Cameron Freer, a research scientist; Ulrich Schaechtel and Zane Shelby of Digital Garage; Martin Rinard, an MIT professor in the Department of Electrical Engineering and Computer Science and member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Feras Saad ’15, MEng ’16, PhD ’22, an assistant professor at Carnegie Mellon University. The research was recently presented at the ACM Conference on Programming Language Design and Implementation.Combining models and databasesSQL, which stands for structured query language, is a programming language for storing and manipulating information in a database. In SQL, people can ask questions about data using keywords, such as by summing, filtering, or grouping database records.However, querying a model can provide deeper insights, since models can capture what data imply for an individual. For instance, a female developer who wonders if she is underpaid is likely more interested in what salary data mean for her individually than in trends from database records.The researchers noticed that SQL didn’t provide an effective way to incorporate probabilistic AI models, but at the same time, approaches that use probabilistic models to make inferences didn’t support complex database queries.They built GenSQL to fill this gap, enabling someone to query both a dataset and a probabilistic model using a straightforward yet powerful formal programming language.A GenSQL user uploads their data and probabilistic model, which the system automatically integrates. Then, she can run queries on data that also get input from the probabilistic model running behind the scenes. This not only enables more complex queries but can also provide more accurate answers.For instance, a query in GenSQL might be something like, “How likely is it that a developer from Seattle knows the programming language Rust?” Just looking at a correlation between columns in a database might miss subtle dependencies. Incorporating a probabilistic model can capture more complex interactions.   Plus, the probabilistic models GenSQL utilizes are auditable, so people can see which data the model uses for decision-making. In addition, these models provide measures of calibrated uncertainty along with each answer.For instance, with this calibrated uncertainty, if one queries the model for predicted outcomes of different cancer treatments for a patient from a minority group that is underrepresented in the dataset, GenSQL would tell the user that it is uncertain, and how uncertain it is, rather than overconfidently advocating for the wrong treatment.Faster and more accurate resultsTo evaluate GenSQL, the researchers compared their system to popular baseline methods that use neural networks. GenSQL was between 1.7 and 6.8 times faster than these approaches, executing most queries in a few milliseconds while providing more accurate results.They also applied GenSQL in two case studies: one in which the system identified mislabeled clinical trial data and the other in which it generated accurate synthetic data that captured complex relationships in genomics.Next, the researchers want to apply GenSQL more broadly to conduct largescale modeling of human populations. With GenSQL, they can generate synthetic data to draw inferences about things like health and salary while controlling what information is used in the analysis.They also want to make GenSQL easier to use and more powerful by adding new optimizations and automation to the system. In the long run, the researchers want to enable users to make natural language queries in GenSQL. Their goal is to eventually develop a ChatGPT-like AI expert one could talk to about any database, which grounds its answers using GenSQL queries.   This research is funded, in part, by the Defense Advanced Research Projects Agency (DARPA), Google, and the Siegel Family Foundation. More

  • in

    Arvind, longtime MIT professor and prolific computer scientist, dies at 77

    Arvind Mithal, the Charles W. and Jennifer C. Johnson Professor in Computer Science and Engineering at MIT, head of the faculty of computer science in the Department of Electrical Engineering and Computer Science (EECS), and a pillar of the MIT community, died on June 17. Arvind, who went by the mononym, was 77 years old.A prolific researcher who led the Computation Structures Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL), Arvind served on the MIT faculty for nearly five decades.“He was beloved by countless people across the MIT community and around the world who were inspired by his intellectual brilliance and zest for life,” President Sally Kornbluth wrote in a letter to the MIT community today.As a scientist, Arvind was well known for important contributions to dataflow computing, which seeks to optimize the flow of data to take advantage of parallelism, achieving faster and more efficient computation.In the last 25 years, his research interests broadened to include developing techniques and tools for formal modeling, high-level synthesis, and formal verification of complex digital devices like microprocessors and hardware accelerators, as well as memory models and cache coherence protocols for parallel computing architectures and programming languages.Those who knew Arvind describe him as a rare individual whose interests and expertise ranged from high-level, theoretical formal systems all the way down through languages and compilers to the gates and structures of silicon hardware.The applications of Arvind’s work are far-reaching, from reducing the amount of energy and space required by data centers to streamlining the design of more efficient multicore computer chips.“Arvind was both a tremendous scholar in the fields of computer architecture and programming languages and a dedicated teacher, who brought systems-level thinking to our students. He was also an exceptional academic leader, often leading changes in curriculum and contributing to the Engineering Council in meaningful and impactful ways. I will greatly miss his sage advice and wisdom,” says Anantha Chandrakasan, chief innovation and strategy officer, dean of engineering, and the Vannevar Bush Professor of Electrical Engineering and Computer Science.“Arvind’s positive energy, together with his hearty laugh, brightened so many people’s lives. He was an enduring source of wise counsel for colleagues and for generations of students. With his deep commitment to academic excellence, he not only transformed research in computer architecture and parallel computing but also brought that commitment to his role as head of the computer science faculty in the EECS department. He left a lasting impact on all of us who had the privilege of working with him,” says Dan Huttenlocher, dean of the MIT Schwarzman College of Computing and the Henry Ellis Warren Professor of Electrical Engineering and Computer Science.Arvind developed an interest in parallel computing while he was a student at the Indian Institute of Technology in Kanpur, from which he received his bachelor’s degree in 1969. He earned a master’s degree and PhD in computer science in 1972 and 1973, respectively, from the University of Minnesota, where he studied operating systems and mathematical models of program behavior. He taught at the University of California at Irvine from 1974 to 1978 before joining the faculty at MIT.At MIT, Arvind’s group studied parallel computing and declarative programming languages, and he led the development of two parallel computing languages, Id and pH. He continued his work on these programming languages through the 1990s, publishing the book “Implicit Parallel Programming in pH” with co-author R.S. Nikhil in 2001, the culmination of more than 20 years of research.In addition to his research, Arvind was an important academic leader in EECS. He served as head of computer science faculty in the department and played a critical role in helping with the reorganization of EECS after the establishment of the MIT Schwarzman College of Computing.“Arvind was a force of nature, larger than life in every sense. His relentless positivity, unwavering optimism, boundless generosity, and exceptional strength as a researcher was truly inspiring and left a profound mark on all who had the privilege of knowing him. I feel enormous gratitude for the light he brought into our lives and his fundamental impact on our community,” says Daniela Rus, the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science and the director of CSAIL.His work on dataflow and parallel computing led to the Monsoon project in the late 1980s and early 1990s. Arvind’s group, in collaboration with Motorola, built 16 dataflow computing machines and developed their associated software. One Monsoon dataflow machine is now in the Computer History Museum in Mountain View, California.Arvind’s focus shifted in the 1990s when, as he explained in a 2012 interview for the Institute of Electrical and Electronics Engineers (IEEE), funding for research into parallel computing began to dry up.“Microprocessors were getting so much faster that people thought they didn’t need it,” he recalled.Instead, he began applying techniques his team had learned and developed for parallel programming to the principled design of digital hardware.In addition to mentoring students and junior colleagues at MIT, Arvind also advised universities and governments in many countries on research in parallel programming and semiconductor design.Based on his work on digital hardware design, Arvind founded Sandburst in 2000, a fabless manufacturing company for semiconductor chips. He served as the company’s president for two years before returning to the MIT faculty, while continuing as an advisor. Sandburst was later acquired by Broadcom.Arvind and his students also developed Bluespec, a programming language designed to automate the design of chips. Building off this work, he co-founded the startup Bluespec, Inc., in 2003, to develop practical tools that help engineers streamline device design.Over the past decade, he was dedicated to advancing undergraduate education at MIT by bringing modern design tools to courses 6.004 (Computation Structures) and 6.191 (Introduction to Deep Learning), and incorporating Minispec, a programming language that is closely related to Bluespec.Arvind was honored for these and other contributions to data flow and multithread computing, and the development of tools for the high-level synthesis of hardware, with membership in the National Academy of Engineering in 2008 and the American Academy of Arts and Sciences in 2012. He was also named a distinguished alumnus of IIT Kanpur, his undergraduate alma mater.“Arvind was more than a pillar of the EECS community and a titan of computer science; he was a beloved colleague and a treasured friend. Those of us with the remarkable good fortune to work and collaborate with Arvind are devastated by his sudden loss. His kindness and joviality were unwavering; his mentorship was thoughtful and well-considered; his guidance was priceless. We will miss Arvind deeply,” says Asu Ozdaglar, deputy dean of the MIT Schwarzman College of Computing and head of EECS.Among numerous other awards, including membership in the Indian National Academy of Sciences and fellowship in the Association for Computing Machinery and IEEE, he received the Harry H. Goode Memorial Award from IEEE in 2012, which honors significant contributions to theory or practice in the information processing field.A humble scientist, Arvind was the first to point out that these achievements were only possible because of his outstanding and brilliant collaborators. Chief among those collaborators were the undergraduate and graduate students he felt fortunate to work with at MIT. He maintained excellent relationships with them both professionally and personally, and valued these relationships more than the work they did together, according to family members.In summing up the key to his scientific success, Arvind put it this way in the 2012 IEEE interview: “Really, one has to do what one believes in. I think the level at which most of us work, it is not sustainable if you don’t enjoy it on a day-to-day basis. You can’t work on it just because of the results. You have to work on it because you say, ‘I have to know the answer to this,’” he said.He is survived by his wife, Gita Singh Mithal, their two sons Divakar ’01 and Prabhakar ’04, their wives Leena and Nisha, and two grandchildren, Maya and Vikram.  More

  • in

    MIT-Takeda Program wraps up with 16 publications, a patent, and nearly two dozen projects completed

    When the Takeda Pharmaceutical Co. and the MIT School of Engineering launched their collaboration focused on artificial intelligence in health care and drug development in February 2020, society was on the cusp of a globe-altering pandemic and AI was far from the buzzword it is today.As the program concludes, the world looks very different. AI has become a transformative technology across industries including health care and pharmaceuticals, while the pandemic has altered the way many businesses approach health care and changed how they develop and sell medicines.For both MIT and Takeda, the program has been a game-changer.When it launched, the collaborators hoped the program would help solve tangible, real-world problems. By its end, the program has yielded a catalog of new research papers, discoveries, and lessons learned, including a patent for a system that could improve the manufacturing of small-molecule medicines.Ultimately, the program allowed both entities to create a foundation for a world where AI and machine learning play a pivotal role in medicine, leveraging Takeda’s expertise in biopharmaceuticals and the MIT researchers’ deep understanding of AI and machine learning.“The MIT-Takeda Program has been tremendously impactful and is a shining example of what can be accomplished when experts in industry and academia work together to develop solutions,” says Anantha Chandrakasan, MIT’s chief innovation and strategy officer, dean of the School of Engineering, and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “In addition to resulting in research that has advanced how we use AI and machine learning in health care, the program has opened up new opportunities for MIT faculty and students through fellowships, funding, and networking.”What made the program unique was that it was centered around several concrete challenges spanning drug development that Takeda needed help addressing. MIT faculty had the opportunity to select the projects based on their area of expertise and general interest, allowing them to explore new areas within health care and drug development.“It was focused on Takeda’s toughest business problems,” says Anne Heatherington, Takeda’s research and development chief data and technology officer and head of its Data Sciences Institute.“They were problems that colleagues were really struggling with on the ground,” adds Simon Davies, the executive director of the MIT-Takeda Program and Takeda’s global head of statistical and quantitative sciences. Takeda saw an opportunity to collaborate with MIT’s world-class researchers, who were working only a few blocks away. Takeda, a global pharmaceutical company with global headquarters in Japan, has its global business units and R&D center just down the street from the Institute.As part of the program, MIT faculty were able to select what issues they were interested in working on from a group of potential Takeda projects. Then, collaborative teams including MIT researchers and Takeda employees approached research questions in two rounds. Over the course of the program, collaborators worked on 22 projects focused on topics including drug discovery and research, clinical drug development, and pharmaceutical manufacturing. Over 80 MIT students and faculty joined more than 125 Takeda researchers and staff on teams addressing these research questions.The projects centered around not only hard problems, but also the potential for solutions to scale within Takeda or within the biopharmaceutical industry more broadly.Some of the program’s findings have already resulted in wider studies. One group’s results, for instance, showed that using artificial intelligence to analyze speech may allow for earlier detection of frontotemporal dementia, while making that diagnosis more quickly and inexpensively. Similar algorithmic analyses of speech in patients diagnosed with ALS may also help clinicians understand the progression of that disease. Takeda is continuing to test both AI applications.Other discoveries and AI models that resulted from the program’s research have already had an impact. Using a physical model and AI learning algorithms can help detect particle size, mix, and consistency for powdered, small-molecule medicines, for instance, speeding up production timelines. Based on their research under the program, collaborators have filed for a patent for that technology.For injectable medicines like vaccines, AI-enabled inspections can also reduce process time and false rejection rates. Replacing human visual inspections with AI processes has already shown measurable impact for the pharmaceutical company.Heatherington adds, “our lessons learned are really setting the stage for what we’re doing next, really embedding AI and gen-AI [generative AI] into everything that we do moving forward.”Over the course of the program, more than 150 Takeda researchers and staff also participated in educational programming organized by the Abdul Latif Jameel Clinic for Machine Learning in Health. In addition to providing research opportunities, the program funded 10 students through SuperUROP, the Advanced Undergraduate Research Opportunities Program, as well as two cohorts from the DHIVE health-care innovation program, part of the MIT Sandbox Innovation Fund Program.Though the formal program has ended, certain aspects of the collaboration will continue, such as the MIT-Takeda Fellows, which supports graduate students as they pursue groundbreaking research related to health and AI. During its run, the program supported 44 MIT-Takeda Fellows and will continue to support MIT students through an endowment fund. Organic collaboration between MIT and Takeda researchers will also carry forward. And the programs’ collaborators are working to create a model for similar academic and industry partnerships to widen the impact of this first-of-its-kind collaboration.  More

  • in

    Researchers use large language models to help robots navigate

    Someday, you may want your home robot to carry a load of dirty clothes downstairs and deposit them in the washing machine in the far-left corner of the basement. The robot will need to combine your instructions with its visual observations to determine the steps it should take to complete this task.For an AI agent, this is easier said than done. Current approaches often utilize multiple hand-crafted machine-learning models to tackle different parts of the task, which require a great deal of human effort and expertise to build. These methods, which use visual representations to directly make navigation decisions, demand massive amounts of visual data for training, which are often hard to come by.To overcome these challenges, researchers from MIT and the MIT-IBM Watson AI Lab devised a navigation method that converts visual representations into pieces of language, which are then fed into one large language model that achieves all parts of the multistep navigation task.Rather than encoding visual features from images of a robot’s surroundings as visual representations, which is computationally intensive, their method creates text captions that describe the robot’s point-of-view. A large language model uses the captions to predict the actions a robot should take to fulfill a user’s language-based instructions.Because their method utilizes purely language-based representations, they can use a large language model to efficiently generate a huge amount of synthetic training data.While this approach does not outperform techniques that use visual features, it performs well in situations that lack enough visual data for training. The researchers found that combining their language-based inputs with visual signals leads to better navigation performance.“By purely using language as the perceptual representation, ours is a more straightforward approach. Since all the inputs can be encoded as language, we can generate a human-understandable trajectory,” says Bowen Pan, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on this approach.Pan’s co-authors include his advisor, Aude Oliva, director of strategic industry engagement at the MIT Schwarzman College of Computing, MIT director of the MIT-IBM Watson AI Lab, and a senior research scientist in the Computer Science and Artificial Intelligence Laboratory (CSAIL); Philip Isola, an associate professor of EECS and a member of CSAIL; senior author Yoon Kim, an assistant professor of EECS and a member of CSAIL; and others at the MIT-IBM Watson AI Lab and Dartmouth College. The research will be presented at the Conference of the North American Chapter of the Association for Computational Linguistics.Solving a vision problem with languageSince large language models are the most powerful machine-learning models available, the researchers sought to incorporate them into the complex task known as vision-and-language navigation, Pan says.But such models take text-based inputs and can’t process visual data from a robot’s camera. So, the team needed to find a way to use language instead.Their technique utilizes a simple captioning model to obtain text descriptions of a robot’s visual observations. These captions are combined with language-based instructions and fed into a large language model, which decides what navigation step the robot should take next.The large language model outputs a caption of the scene the robot should see after completing that step. This is used to update the trajectory history so the robot can keep track of where it has been.The model repeats these processes to generate a trajectory that guides the robot to its goal, one step at a time.To streamline the process, the researchers designed templates so observation information is presented to the model in a standard form — as a series of choices the robot can make based on its surroundings.For instance, a caption might say “to your 30-degree left is a door with a potted plant beside it, to your back is a small office with a desk and a computer,” etc. The model chooses whether the robot should move toward the door or the office.“One of the biggest challenges was figuring out how to encode this kind of information into language in a proper way to make the agent understand what the task is and how they should respond,” Pan says.Advantages of languageWhen they tested this approach, while it could not outperform vision-based techniques, they found that it offered several advantages.First, because text requires fewer computational resources to synthesize than complex image data, their method can be used to rapidly generate synthetic training data. In one test, they generated 10,000 synthetic trajectories based on 10 real-world, visual trajectories.The technique can also bridge the gap that can prevent an agent trained with a simulated environment from performing well in the real world. This gap often occurs because computer-generated images can appear quite different from real-world scenes due to elements like lighting or color. But language that describes a synthetic versus a real image would be much harder to tell apart, Pan says. Also, the representations their model uses are easier for a human to understand because they are written in natural language.“If the agent fails to reach its goal, we can more easily determine where it failed and why it failed. Maybe the history information is not clear enough or the observation ignores some important details,” Pan says.In addition, their method could be applied more easily to varied tasks and environments because it uses only one type of input. As long as data can be encoded as language, they can use the same model without making any modifications.But one disadvantage is that their method naturally loses some information that would be captured by vision-based models, such as depth information.However, the researchers were surprised to see that combining language-based representations with vision-based methods improves an agent’s ability to navigate.“Maybe this means that language can capture some higher-level information than cannot be captured with pure vision features,” he says.This is one area the researchers want to continue exploring. They also want to develop a navigation-oriented captioner that could boost the method’s performance. In addition, they want to probe the ability of large language models to exhibit spatial awareness and see how this could aid language-based navigation.This research is funded, in part, by the MIT-IBM Watson AI Lab. More