More stories

  • in

    MIT ReACT welcomes first Afghan cohort to its largest-yet certificate program

    Through the championing support of the faculty and leadership of the MIT Afghan Working Group convened last September by Provost Martin Schmidt and chaired by Associate Provost for International Activities Richard Lester, MIT has come together to support displaced Afghan learners and scholars in a time of crisis. The MIT Refugee Action Hub (ReACT) has opened opportunities for 25 talented Afghan learners to participate in the hub’s certificate program in computer and data science (CDS), now in its fourth year, welcoming its largest and most diverse cohort to date — 136 learners from 29 countries.

    ”Even in the face of extreme disruption, education and scholarship must continue, and MIT is committed to providing resources and safe forums for displaced scholars,” says Lester. “We greatly appreciate MIT ReACT’s work to create learning opportunities for Afghan students whose lives have been upended by the crisis in their homeland.”

    Currently, more than 3.5 million Afghans are internally displaced, while 2.5 million are registered refugees residing in other parts of the world. With millions in Afghanistan facing famine, poverty, and civil unrest in what has become the world’s largest humanitarian crisis, the United Nations predicts the number of Afghans forced to flee their homes will continue to rise. 

    “Forced displacement is on the rise, fueled not only by constant political, economical, and social turmoil worldwide, but also by the ongoing climate change crisis, which threatens costly disruptions to society and has potential to create unprecedented displacement internationally,” says associate professor of civil and environmental engineering and ReACT’s faculty founder Admir Masic. During the orientation for the new CDS cohort in January, Masic emphasized the great need for educational programs like ReACT’s that address the specific challenges refugees and displaced learners face.

    A former Bosnian refugee, Masic spent his teenage years in Croatia, where educational opportunities were limited for young people with refugee status. His experience motivated him to found ReACT, which launched in 2017. Housed within Open Learning, ReACT is an MIT-wide effort to deliver global education and professional development programs to underserved communities, including refugees and migrants. ReACT’s signature program, CDS is a year-long, online program that combines MITx courses in programming and data science, personal and professional development workshops including MIT Bootcamps, and opportunities for practical experience.

    ReACT’s group of 25 learners from Afghanistan, 52 percent of whom are women, joins the larger CDS cohort in the program. They will receive support from their new colleagues as well as members of ReACT’s mentor and alumni network. While the majority of the group are residing around the world, including in Europe, North America, and neighboring countries, several still remain in Afghanistan. With the support of the Afghan Working Group, ReACT is working to connect with communities from the region to provide safe and inclusive learning environments for the cohort. ​​

    Building community and confidence

    Selected from more than 1,000 applicants, the new CDS cohort reflected on their personal and professional goals during a weeklong orientation.

    “I am here because I want to change my career and learn basics in this field to then obtain networks that I wouldn’t have got if it weren’t for this program,” said Samiullah Ajmal, who is joining the program from Afghanistan.

    Interactive workshops on topics such as leadership development and virtual networking rounded out the week’s events. Members of ReACT’s greater community — which has grown in recent years to include a network of external collaborators including nonprofits, philanthropic supporters, universities, and alumni — helped facilitate these workshops and other orientation activities.

    For instance, Na’amal, a social enterprise that connects refugees to remote work opportunities, introduced the CDS learners to strategies for making career connections remotely. “We build confidence while doing,” says Susan Mulholland, a leadership and development coach with Na’amal who led the networking workshop.

    Along with the CDS program’s cohort-based model, ReACT also uses platforms that encourage regular communication between participants and with the larger ReACT network — making connections a critical component of the program.

    “I not only want to meet new people and make connections for my professional career, but I also want to test my communication and social skills,” says Pablo Andrés Uribe, a learner who lives in Colombia, describing ReACT’s emphasis on community-building. 

    Over the last two years, ReACT has expanded its geographic presence, growing from a hub in Jordan into a robust global community of many hubs, including in Colombia and Uganda. These regional sites connect talented refugees and displaced learners to internships and employment, startup networks and accelerators, and pathways to formal undergraduate and graduate education.

    This expansion is thanks to the generous support internally from the MIT Office of the Provost and Associate Provost Richard Lester and external organizations including the Western Union Foundation. ReACT will build new hubs this year in Greece, Uruguay, and Afghanistan, as a result of gifts from the Hatsopoulos family and the Pfeffer family.

    Holding space to learn from each other

    In addition to establishing new global hubs, ReACT plans to expand its network of internship and experiential learning opportunities, increasing outreach to new collaborators such as nongovernmental organizations (NGOs), companies, and universities. Jointly with Na’amal and Paper Airplanes, a nonprofit that connects conflict-affected individuals with personal language tutors, ReACT will host the first Migration Summit. Scheduled for April 2022, the month-long global convening invites a broad range of participants, including displaced learners, universities, companies, nonprofits and NGOs, social enterprises, foundations, philanthropists, researchers, policymakers, employers, and governments, to address the key challenges and opportunities for refugee and migrant communities. The theme of the summit is “Education and Workforce Development in Displacement.”

    “The MIT Migration Summit offers a platform to discuss how new educational models, such as those employed in ReACT, can help solve emerging challenges in providing quality education and career opportunities to forcibly displaced and marginalized people around the world,” says Masic. 

    A key goal of the convening is to center the voices of those most directly impacted by displacement, such as ReACT’s learners from Afghanistan and elsewhere, in solution-making. More

  • in

    Unlocking new doors to artificial intelligence

    Artificial intelligence research is constantly developing new hypotheses that have the potential to benefit society and industry; however, sometimes these benefits are not fully realized due to a lack of engineering tools. To help bridge this gap, graduate students in the MIT Department of Electrical Engineering and Computer Science’s 6-A Master of Engineering (MEng) Thesis Program work with some of the most innovative companies in the world and collaborate on cutting-edge projects, while contributing to and completing their MEng thesis.

    During a portion of the last year, four 6-A MEng students teamed up and completed an internship with IBM Research’s advanced prototyping team through the MIT-IBM Watson AI Lab on AI projects, often developing web applications to solve a real-world issue or business use cases. Here, the students worked alongside AI engineers, user experience engineers, full-stack researchers, and generalists to accommodate project requests and receive thesis advice, says Lee Martie, IBM research staff member and 6-A manager. The students’ projects ranged from generating synthetic data to allow for privacy-sensitive data analysis to using computer vision to identify actions in video that allows for monitoring human safety and tracking build progress on a construction site.

    “I appreciated all of the expertise from the team and the feedback,” says 6-A graduate Violetta Jusiega ’21, who participated in the program. “I think that working in industry gives the lens of making sure that the project’s needs are satisfied and [provides the opportunity] to ground research and make sure that it is helpful for some use case in the future.”

    Jusiega’s research intersected the fields of computer vision and design to focus on data visualization and user interfaces for the medical field. Working with IBM, she built an application programming interface (API) that let clinicians interact with a medical treatment strategy AI model, which was deployed in the cloud. Her interface provided a medical decision tree, as well as some prescribed treatment plans. After receiving feedback on her design from physicians at a local hospital, Jusiega developed iterations of the API and how the results where displayed, visually, so that it would be user-friendly and understandable for clinicians, who don’t usually code. She says that, “these tools are often not acquired into the field because they lack some of these API principles which become more important in an industry where everything is already very fast paced, so there’s little time to incorporate a new technology.” But this project might eventually allow for industry deployment. “I think this application has a bunch of potential, whether it does get picked up by clinicians or whether it’s simply used in research. It’s very promising and very exciting to see how technology can help us modify, or I can improve, the health-care field to be even more custom-tailored towards patients and giving them the best care possible,” she says.

    Another 6-A graduate student, Spencer Compton, was also considering aiding professionals to make more informed decisions, for use in settings including health care, but he was tackling it from a causal perspective. When given a set of related variables, Compton was investigating if there was a way to determine not just correlation, but the cause-and-effect relationship between them (the direction of the interaction) from the data alone. For this, he and his collaborators from IBM Research and Purdue University turned to a field of math called information theory. With the goal of designing an algorithm to learn complex networks of causal relationships, Compton used ideas relating to entropy, the randomness in a system, to help determine if a causal relationship is present and how variables might be interacting. “When judging an explanation, people often default to Occam’s razor” says Compton. “We’re more inclined to believe a simpler explanation than a more complex one.” In many cases, he says, it seemed to perform well. For instance, they were able to consider variables such as lung cancer, pollution, and X-ray findings. He was pleased that his research allowed him to help create a framework of “entropic causal inference” that could aid in safe and smart decisions in the future, in a satisfying way. “The math is really surprisingly deep, interesting, and complex,” says Compton. “We’re basically asking, ‘when is the simplest explanation correct?’ but as a math question.”

    Determining relationships within data can sometimes require large volumes of it to suss out patterns, but for data that may contain sensitive information, this may not be available. For her master’s work, Ivy Huang worked with IBM Research to generate synthetic tabular data using a natural language processing tool called a transformer model, which can learn and predict future values from past values. Trained on real data, the model can produce new data with similar patterns, properties, and relationships without restrictions like privacy, availability, and access that might come with real data in financial transactions and electronic medical records. Further, she created an API and deployed the model in an IBM cluster, which allowed users increased access to the model and abilities to query it without compromising the original data.

    Working with the advanced prototyping team, MEng candidate Brandon Perez also considered how to gather and investigate data with restrictions, but in his case it was to use computer vision frameworks, centered on an action recognition model, to identify construction site happenings. The team based their work on the Moments in Time dataset, which contains over a million three-second video clips with about 300 attached classification labels, and has performed well during AI training. However, the group needed more construction-based video data. For this, they used YouTube-8M. Perez built a framework for testing and fine-tuning existing object detection models and action recognition models that could plug into an automatic spatial and temporal localization tool — how they would identify and label particular actions in a video timeline. “I was satisfied that I was able to explore what made me curious, and I was grateful for the autonomy that I was given with this project,” says Perez. “I felt like I was always supported, and my mentor was a great support to the project.”

    “The kind of collaborations that we have seen between our MEng students and IBM researchers are exactly what the 6-A MEng Thesis program at MIT is all about,” says Tomas Palacios, professor of electrical engineering and faculty director of the MIT 6-A MEng Thesis program. “For more than 100 years, 6-A has been connecting MIT students with industry to solve together some of the most important problems in the world.” More

  • in

    Injecting fairness into machine-learning models

    If a machine-learning model is trained using an unbalanced dataset, such as one that contains far more images of people with lighter skin than people with darker skin, there is serious risk the model’s predictions will be unfair when it is deployed in the real world.

    But this is only one part of the problem. MIT researchers have found that machine-learning models that are popular for image recognition tasks actually encode bias when trained on unbalanced data. This bias within the model is impossible to fix later on, even with state-of-the-art fairness-boosting techniques, and even when retraining the model with a balanced dataset.      

    So, the researchers came up with a technique to introduce fairness directly into the model’s internal representation itself. This enables the model to produce fair outputs even if it is trained on unfair data, which is especially important because there are very few well-balanced datasets for machine learning.

    The solution they developed not only leads to models that make more balanced predictions, but also improves their performance on downstream tasks like facial recognition and animal species classification.

    “In machine learning, it is common to blame the data for bias in models. But we don’t always have balanced data. So, we need to come up with methods that actually fix the problem with imbalanced data,” says lead author Natalie Dullerud, a graduate student in the Healthy ML Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT.

    Dullerud’s co-authors include Kimia Hamidieh, a graduate student in the Healthy ML Group; Karsten Roth, a former visiting researcher who is now a graduate student at the University of Tubingen; Nicolas Papernot, an assistant professor in the University of Toronto’s Department of Electrical Engineering and Computer Science; and senior author Marzyeh Ghassemi, an assistant professor and head of the Healthy ML Group. The research will be presented at the International Conference on Learning Representations.

    Defining fairness

    The machine-learning technique the researchers studied is known as deep metric learning, which is a broad form of representation learning. In deep metric learning, a neural network learns the similarity between objects by mapping similar photos close together and dissimilar photos far apart. During training, this neural network maps images in an “embedding space” where a similarity metric between photos corresponds to the distance between them.

    For example, if a deep metric learning model is being used to classify bird species, it will map photos of golden finches together in one part of the embedding space and cardinals together in another part of the embedding space. Once trained, the model can effectively measure the similarity of new images it hasn’t seen before. It would learn to cluster images of an unseen bird species close together, but farther from cardinals or golden finches within the embedding space.

    The similarity metrics the model learns are very robust, which is why deep metric learning is so often employed for facial recognition, Dullerud says. But she and her colleagues wondered how to determine if a similarity metric is biased.

    “We know that data reflect the biases of processes in society. This means we have to shift our focus to designing methods that are better suited to reality,” says Ghassemi.

    The researchers defined two ways that a similarity metric can be unfair. Using the example of facial recognition, the metric will be unfair if it is more likely to embed individuals with darker-skinned faces closer to each other, even if they are not the same person, than it would if those images were people with lighter-skinned faces. Second, it will be unfair if the features it learns for measuring similarity are better for the majority group than for the minority group.

    The researchers ran a number of experiments on models with unfair similarity metrics and were unable to overcome the bias the model had learned in its embedding space.

    “This is quite scary because it is a very common practice for companies to release these embedding models and then people finetune them for some downstream classification task. But no matter what you do downstream, you simply can’t fix the fairness problems that were induced in the embedding space,” Dullerud says.

    Even if a user retrains the model on a balanced dataset for the downstream task, which is the best-case scenario for fixing the fairness problem, there are still performance gaps of at least 20 percent, she says.

    The only way to solve this problem is to ensure the embedding space is fair to begin with.

    Learning separate metrics

    The researchers’ solution, called Partial Attribute Decorrelation (PARADE), involves training the model to learn a separate similarity metric for a sensitive attribute, like skin tone, and then decorrelating the skin tone similarity metric from the targeted similarity metric. If the model is learning the similarity metrics of different human faces, it will learn to map similar faces close together and dissimilar faces far apart using features other than skin tone.

    Any number of sensitive attributes can be decorrelated from the targeted similarity metric in this way. And because the similarity metric for the sensitive attribute is learned in a separate embedding space, it is discarded after training so only the targeted similarity metric remains in the model.

    Their method is applicable to many situations because the user can control the amount of decorrelation between similarity metrics. For instance, if the model will be diagnosing breast cancer from mammogram images, a clinician likely wants some information about biological sex to remain in the final embedding space because it is much more likely that women will have breast cancer than men, Dullerud explains.

    They tested their method on two tasks, facial recognition and classifying bird species, and found that it reduced performance gaps caused by bias, both in the embedding space and in the downstream task, regardless of the dataset they used.

    Moving forward, Dullerud is interested in studying how to force a deep metric learning model to learn good features in the first place.

    “How do you properly audit fairness? That is an open question right now. How can you tell that a model is going to be fair, or that it is only going to be fair in certain situations, and what are those situations? Those are questions I am really interested in moving forward,” she says. More

  • in

    Using artificial intelligence to find anomalies hiding in massive datasets

    Identifying a malfunction in the nation’s power grid can be like trying to find a needle in an enormous haystack. Hundreds of thousands of interrelated sensors spread across the U.S. capture data on electric current, voltage, and other critical information in real time, often taking multiple recordings per second.

    Researchers at the MIT-IBM Watson AI Lab have devised a computationally efficient method that can automatically pinpoint anomalies in those data streams in real time. They demonstrated that their artificial intelligence method, which learns to model the interconnectedness of the power grid, is much better at detecting these glitches than some other popular techniques.

    Because the machine-learning model they developed does not require annotated data on power grid anomalies for training, it would be easier to apply in real-world situations where high-quality, labeled datasets are often hard to come by. The model is also flexible and can be applied to other situations where a vast number of interconnected sensors collect and report data, like traffic monitoring systems. It could, for example, identify traffic bottlenecks or reveal how traffic jams cascade.

    “In the case of a power grid, people have tried to capture the data using statistics and then define detection rules with domain knowledge to say that, for example, if the voltage surges by a certain percentage, then the grid operator should be alerted. Such rule-based systems, even empowered by statistical data analysis, require a lot of labor and expertise. We show that we can automate this process and also learn patterns from the data using advanced machine-learning techniques,” says senior author Jie Chen, a research staff member and manager of the MIT-IBM Watson AI Lab.

    The co-author is Enyan Dai, an MIT-IBM Watson AI Lab intern and graduate student at the Pennsylvania State University. This research will be presented at the International Conference on Learning Representations.

    Probing probabilities

    The researchers began by defining an anomaly as an event that has a low probability of occurring, like a sudden spike in voltage. They treat the power grid data as a probability distribution, so if they can estimate the probability densities, they can identify the low-density values in the dataset. Those data points which are least likely to occur correspond to anomalies.

    Estimating those probabilities is no easy task, especially since each sample captures multiple time series, and each time series is a set of multidimensional data points recorded over time. Plus, the sensors that capture all that data are conditional on one another, meaning they are connected in a certain configuration and one sensor can sometimes impact others.

    To learn the complex conditional probability distribution of the data, the researchers used a special type of deep-learning model called a normalizing flow, which is particularly effective at estimating the probability density of a sample.

    They augmented that normalizing flow model using a type of graph, known as a Bayesian network, which can learn the complex, causal relationship structure between different sensors. This graph structure enables the researchers to see patterns in the data and estimate anomalies more accurately, Chen explains.

    “The sensors are interacting with each other, and they have causal relationships and depend on each other. So, we have to be able to inject this dependency information into the way that we compute the probabilities,” he says.

    This Bayesian network factorizes, or breaks down, the joint probability of the multiple time series data into less complex, conditional probabilities that are much easier to parameterize, learn, and evaluate. This allows the researchers to estimate the likelihood of observing certain sensor readings, and to identify those readings that have a low probability of occurring, meaning they are anomalies.

    Their method is especially powerful because this complex graph structure does not need to be defined in advance — the model can learn the graph on its own, in an unsupervised manner.

    A powerful technique

    They tested this framework by seeing how well it could identify anomalies in power grid data, traffic data, and water system data. The datasets they used for testing contained anomalies that had been identified by humans, so the researchers were able to compare the anomalies their model identified with real glitches in each system.

    Their model outperformed all the baselines by detecting a higher percentage of true anomalies in each dataset.

    “For the baselines, a lot of them don’t incorporate graph structure. That perfectly corroborates our hypothesis. Figuring out the dependency relationships between the different nodes in the graph is definitely helping us,” Chen says.

    Their methodology is also flexible. Armed with a large, unlabeled dataset, they can tune the model to make effective anomaly predictions in other situations, like traffic patterns.

    Once the model is deployed, it would continue to learn from a steady stream of new sensor data, adapting to possible drift of the data distribution and maintaining accuracy over time, says Chen.

    Though this particular project is close to its end, he looks forward to applying the lessons he learned to other areas of deep-learning research, particularly on graphs.

    Chen and his colleagues could use this approach to develop models that map other complex, conditional relationships. They also want to explore how they can efficiently learn these models when the graphs become enormous, perhaps with millions or billions of interconnected nodes. And rather than finding anomalies, they could also use this approach to improve the accuracy of forecasts based on datasets or streamline other classification techniques.

    This work was funded by the MIT-IBM Watson AI Lab and the U.S. Department of Energy. More

  • in

    Deep-learning technique predicts clinical treatment outcomes

    When it comes to treatment strategies for critically ill patients, clinicians want to be able to consider all their options and timing of administration, and make the optimal decision for their patients. While clinician experience and study has helped them to be successful in this effort, not all patients are the same, and treatment decisions at this crucial time could mean the difference between patient improvement and quick deterioration. Therefore, it would be helpful for doctors to be able to take a patient’s previous known health status and received treatments and use that to predict that patient’s health outcome under different treatment scenarios, in order to pick the best path.

    Now, a deep-learning technique, called G-Net, from researchers at MIT and IBM provides a window into causal counterfactual prediction, affording physicians the opportunity to explore how a patient might fare under different treatment plans. The foundation of G-Net is the g-computation algorithm, a causal inference method that estimates the effect of dynamic exposures in the presence of measured confounding variables — ones that may influence both treatments and outcomes. Unlike previous implementations of the g-computation framework, which have used linear modeling approaches, G-Net uses recurrent neural networks (RNN), which have node connections that allow them to better model temporal sequences with complex and nonlinear dynamics, like those found in the physiological and clinical time series data. In this way, physicians can develop alternative plans based on patient history and test them before making a decision.

    “Our ultimate goal is to develop a machine learning technique that would allow doctors to explore various ‘What if’ scenarios and treatment options,” says Li-wei Lehman, MIT research scientist in the MIT Institute for Medical Engineering and Science and an MIT-IBM Watson AI Lab project lead. “A lot of work has been done in terms of deep learning for counterfactual prediction but [it’s] been focusing on a point exposure setting,” or a static, time-varying treatment strategy, which doesn’t allow for adjustment of treatments as patient history changes. However, her team’s new prediction approach provides for treatment plan flexibility and chances for treatment alteration over time as patient covariate history and past treatments change. “G-Net is the first deep-learning approach based on g-computation that can predict both the population-level and individual-level treatment effects under dynamic and time varying treatment strategies.”

    The research, which was recently published in the Proceedings of Machine Learning Research, was co-authored by Rui Li MEng ’20, Stephanie Hu MEng ’21, former MIT postdoc Mingyu Lu MD, graduate student Yuria Utsumi, IBM research staff member Prithwish Chakraborty, IBM Research director of Hybrid Cloud Services Daby Sow, IBM data scientist Piyush Madan, IBM research scientist Mohamed Ghalwash, and IBM research scientist Zach Shahn.

    Tracking disease progression

    To build, validate, and test G-Net’s predictive abilities, the researchers considered the circulatory system in septic patients in the ICU. During critical care, doctors need to make trade-offs and judgement calls, such as ensuring the organs are receiving adequate blood supply without overworking the heart. For this, they could give intravenous fluids to patients to increase blood pressure; however, too much can cause edema. Alternatively, physicians can administer vasopressors, which act to contract blood vessels and raise blood pressure.

    In order to mimic this and demonstrate G-Net’s proof-of-concept, the team used CVSim, a mechanistic model of a human cardiovascular system that’s governed by 28 input variables characterizing the system’s current state, such as arterial pressure, central venous pressure, total blood volume, and total peripheral resistance, and modified it to simulate various disease processes (e.g., sepsis or blood loss) and effects of interventions (e.g., fluids and vasopressors). The researchers used CVSim to generate observational patient data for training and for “ground truth” comparison against counterfactual prediction. In their G-Net architecture, the researchers ran two RNNs to handle and predict variables that are continuous, meaning they can take on a range of values, like blood pressure, and categorical variables, which have discrete values, like the presence or absence of pulmonary edema. The researchers simulated the health trajectories of thousands of “patients” exhibiting symptoms under one treatment regime, let’s say A, for 66 timesteps, and used them to train and validate their model.

    Testing G-Net’s prediction capability, the team generated two counterfactual datasets. Each contained roughly 1,000 known patient health trajectories, which were created from CVSim using the same “patient” condition as the starting point under treatment A. Then at timestep 33, treatment changed to plan B or C, depending on the dataset. The team then performed 100 prediction trajectories for each of these 1,000 patients, whose treatment and medical history was known up until timestep 33 when a new treatment was administered. In these cases, the prediction agreed well with the “ground-truth” observations for individual patients and averaged population-level trajectories.

    A cut above the rest

    Since the g-computation framework is flexible, the researchers wanted to examine G-Net’s prediction using different nonlinear models — in this case, long short-term memory (LSTM) models, which are a type of RNN that can learn from previous data patterns or sequences — against the more classical linear models and a multilayer perception model (MLP), a type of neural network that can make predictions using a nonlinear approach. Following a similar setup as before, the team found that the error between the known and predicted cases was smallest in the LSTM models compared to the others. Since G-Net is able to model the temporal patterns of the patient’s ICU history and past treatment, whereas a linear model and MLP cannot, it was better able to predict the patient’s outcome.

    The team also compared G-Net’s prediction in a static, time-varying treatment setting against two state-of-the-art deep-learning based counterfactual prediction approaches, a recurrent marginal structural network (rMSN) and a counterfactual recurrent neural network (CRN), as well as a linear model and an MLP. For this, they investigated a model for tumor growth under no treatment, radiation, chemotherapy, and both radiation and chemotherapy scenarios. “Imagine a scenario where there’s a patient with cancer, and an example of a static regime would be if you only give a fixed dosage of chemotherapy, radiation, or any kind of drug, and wait until the end of your trajectory,” comments Lu. For these investigations, the researchers generated simulated observational data using tumor volume as the primary influence dictating treatment plans and demonstrated that G-Net outperformed the other models. One potential reason could be because g-computation is known to be more statistically efficient than rMSN and CRN, when models are correctly specified.

    While G-Net has done well with simulated data, more needs to be done before it can be applied to real patients. Since neural networks can be thought of as “black boxes” for prediction results, the researchers are beginning to investigate the uncertainty in the model to help ensure safety. In contrast to these approaches that recommend an “optimal” treatment plan without any clinician involvement, “as a decision support tool, I believe that G-Net would be more interpretable, since the clinicians would input treatment strategies themselves,” says Lehman, and “G-Net will allow them to be able to explore different hypotheses.” Further, the team has moved on to using real data from ICU patients with sepsis, bringing it one step closer to implementation in hospitals.

    “I think it is pretty important and exciting for real-world applications,” says Hu. “It’d be helpful to have some way to predict whether or not a treatment might work or what the effects might be — a quicker iteration process for developing these hypotheses for what to try, before actually trying to implement them in in a years-long, potentially very involved and very invasive type of clinical trial.”

    This research was funded by the MIT-IBM Watson AI Lab. More

  • in

    The downside of machine learning in health care

    While working toward her dissertation in computer science at MIT, Marzyeh Ghassemi wrote several papers on how machine-learning techniques from artificial intelligence could be applied to clinical data in order to predict patient outcomes. “It wasn’t until the end of my PhD work that one of my committee members asked: ‘Did you ever check to see how well your model worked across different groups of people?’”

    That question was eye-opening for Ghassemi, who had previously assessed the performance of models in aggregate, across all patients. Upon a closer look, she saw that models often worked differently — specifically worse — for populations including Black women, a revelation that took her by surprise. “I hadn’t made the connection beforehand that health disparities would translate directly to model disparities,” she says. “And given that I am a visible minority woman-identifying computer scientist at MIT, I am reasonably certain that many others weren’t aware of this either.”

    In a paper published Jan. 14 in the journal Patterns, Ghassemi — who earned her doctorate in 2017 and is now an assistant professor in the Department of Electrical Engineering and Computer Science and the MIT Institute for Medical Engineering and Science (IMES) — and her coauthor, Elaine Okanyene Nsoesie of Boston University, offer a cautionary note about the prospects for AI in medicine. “If used carefully, this technology could improve performance in health care and potentially reduce inequities,” Ghassemi says. “But if we’re not actually careful, technology could worsen care.”

    It all comes down to data, given that the AI tools in question train themselves by processing and analyzing vast quantities of data. But the data they are given are produced by humans, who are fallible and whose judgments may be clouded by the fact that they interact differently with patients depending on their age, gender, and race, without even knowing it.

    Furthermore, there is still great uncertainty about medical conditions themselves. “Doctors trained at the same medical school for 10 years can, and often do, disagree about a patient’s diagnosis,” Ghassemi says. That’s different from the applications where existing machine-learning algorithms excel — like object-recognition tasks — because practically everyone in the world will agree that a dog is, in fact, a dog.

    Machine-learning algorithms have also fared well in mastering games like chess and Go, where both the rules and the “win conditions” are clearly defined. Physicians, however, don’t always concur on the rules for treating patients, and even the win condition of being “healthy” is not widely agreed upon. “Doctors know what it means to be sick,” Ghassemi explains, “and we have the most data for people when they are sickest. But we don’t get much data from people when they are healthy because they’re less likely to see doctors then.”

    Even mechanical devices can contribute to flawed data and disparities in treatment. Pulse oximeters, for example, which have been calibrated predominately on light-skinned individuals, do not accurately measure blood oxygen levels for people with darker skin. And these deficiencies are most acute when oxygen levels are low — precisely when accurate readings are most urgent. Similarly, women face increased risks during “metal-on-metal” hip replacements, Ghassemi and Nsoesie write, “due in part to anatomic differences that aren’t taken into account in implant design.” Facts like these could be buried within the data fed to computer models whose output will be undermined as a result.

    Coming from computers, the product of machine-learning algorithms offers “the sheen of objectivity,” according to Ghassemi. But that can be deceptive and dangerous, because it’s harder to ferret out the faulty data supplied en masse to a computer than it is to discount the recommendations of a single possibly inept (and maybe even racist) doctor. “The problem is not machine learning itself,” she insists. “It’s people. Human caregivers generate bad data sometimes because they are not perfect.”

    Nevertheless, she still believes that machine learning can offer benefits in health care in terms of more efficient and fairer recommendations and practices. One key to realizing the promise of machine learning in health care is to improve the quality of data, which is no easy task. “Imagine if we could take data from doctors that have the best performance and share that with other doctors that have less training and experience,” Ghassemi says. “We really need to collect this data and audit it.”

    The challenge here is that the collection of data is not incentivized or rewarded, she notes. “It’s not easy to get a grant for that, or ask students to spend time on it. And data providers might say, ‘Why should I give my data out for free when I can sell it to a company for millions?’ But researchers should be able to access data without having to deal with questions like: ‘What paper will I get my name on in exchange for giving you access to data that sits at my institution?’

    “The only way to get better health care is to get better data,” Ghassemi says, “and the only way to get better data is to incentivize its release.”

    It’s not only a question of collecting data. There’s also the matter of who will collect it and vet it. Ghassemi recommends assembling diverse groups of researchers — clinicians, statisticians, medical ethicists, and computer scientists — to first gather diverse patient data and then “focus on developing fair and equitable improvements in health care that can be deployed in not just one advanced medical setting, but in a wide range of medical settings.”

    The objective of the Patterns paper is not to discourage technologists from bringing their expertise in machine learning to the medical world, she says. “They just need to be cognizant of the gaps that appear in treatment and other complexities that ought to be considered before giving their stamp of approval to a particular computer model.” More

  • in

    Professors Elchanan Mossel and Rosalind Picard named 2021 ACM Fellows

    The Association for Computing Machinery (ACM) has named MIT professors Elchanan Mossel and Rosalind Picard as fellows for outstanding accomplishments in computing and information technology.

    The ACM Fellows program recognizes wide-ranging and fundamental contributions in areas including algorithms, computer science education, cryptography, data security and privacy, medical informatics, and mobile and networked systems, among many other areas. The accomplishments of the 2021 ACM Fellows underpin important innovations that shape the technologies we use every day.

    Elchanan Mossel

    Mossel is a professor of mathematics and a member at the Statistics and Data Science Center of the MIT Institute for Data, Systems and Society. His research in discrete functional inequalities, isoperimetry, and hypercontractivity led to the proof that Majority is Stablest and confirmed the optimality of the Goemans-Williamson MAX-CUT algorithm under the unique games conjecture from computational complexity. His work on the reconstruction problem on trees provides optimal algorithms and bounds for phylogenetic reconstruction in molecular biology and has led to sharp results in the analysis of Gibbs samplers from statistical physics and inference problems on graphs. His research has resolved open problems in computational biology, machine learning, social choice theory, and economics.Mossel received a BS from the Open University in Israel in 1992, and MS (1997) and PhD (2000) degrees in mathematics from the Hebrew University of Jerusalem. He was a postdoc at the Microsoft Research Theory Group and a Miller Fellow at University of California at Berkeley. He joined the UC Berkeley faculty in 2003 as a professor of statistics and computer science, and spent leaves as a professor at the Weizmann Institute and at the Wharton School before joining MIT in 2016 as a full professor.

    In 2020, he received the Vannevar Bush Faculty Fellowship of the U.S. Department of Defense. Other distinctions include being named a Simons Investigator in Mathematics in 2019, being selected as a fellow of the AMS, and receiving a Sloan Research Fellowship, NSF CAREER Award, and the Bergmann Memorial Award from the U.S.-Israel Binational Science Foundation.

    “I am honored by this award,” says Mossel. “It makes me realize how fortunate I’ve been, working with creative and generous colleagues, and mentoring brilliant young minds.”

    Rosalind Picard

    Picard is a scientist, engineer, author, and professor of media arts and sciences at the MIT Media Lab. She is recognized as the founder of the field of affective computing, and has carried this research forward as head of the Media Lab’s Affective Computing research group. She is also a founding faculty chair of MIT’s MindHandHeart Initiative, and a faculty member of the MIT Center for Neurobiological Engineering. Picard is an IEEE fellow, and a member of the National Academy of Engineering. 

    Picard’s inventions are in use by thousands of research teams worldwide as well as in numerous products and services. She has co-founded two companies: Affectiva (now part of Smart Eye), providing emotion AI technologies now used by more than 25 percent of the Global Fortune 500, and Empatica, providing wearable sensors and analytics to improve health. Starting from inventions by Picard and her team, Empatica created the first AI-based smart watch cleared by the FDA (in neurology for monitoring seizures), which is now helping to bring potentially lifesaving help for people with epilepsy. 

    “This award makes me think of how blessed I am to work with so many amazing people here at MIT, especially at the Media Lab,” Picard notes. “Whenever any one of us has our contributions recognized, it is also a recognition of how special a place this is.” More

  • in

    The promise and pitfalls of artificial intelligence explored at TEDxMIT event

    Scientists, students, and community members came together last month to discuss the promise and pitfalls of artificial intelligence at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) for the fourth TEDxMIT event held at MIT. 

    Attendees were entertained and challenged as they explored “the good and bad of computing,” explained CSAIL Director Professor Daniela Rus, who organized the event with John Werner, an MIT fellow and managing director of Link Ventures; MIT sophomore Lucy Zhao; and grad student Jessica Karaguesian. “As you listen to the talks today,” Rus told the audience, “consider how our world is made better by AI, and also our intrinsic responsibilities for ensuring that the technology is deployed for the greater good.”

    Rus mentioned some new capabilities that could be enabled by AI: an automated personal assistant that could monitor your sleep phases and wake you at the optimal time, as well as on-body sensors that monitor everything from your posture to your digestive system. “Intelligent assistance can help empower and augment our lives. But these intriguing possibilities should only be pursued if we can simultaneously resolve the challenges that these technologies bring,” said Rus. 

    The next speaker, CSAIL principal investigator and professor of electrical engineering and computer science Manolis Kellis, started off by suggesting what sounded like an unattainable goal — using AI to “put an end to evolution as we know it.” Looking at it from a computer science perspective, he said, what we call evolution is basically a brute force search. “You’re just exploring all of the search space, creating billions of copies of every one of your programs, and just letting them fight against each other. This is just brutal. And it’s also completely slow. It took us billions of years to get here.” Might it be possible, he asked, to speed up evolution and make it less messy?

    The answer, Kellis said, is that we can do better, and that we’re already doing better: “We’re not killing people like Sparta used to, throwing the weaklings off the mountain. We are truly saving diversity.”

    Knowledge, moreover, is now being widely shared, passed on “horizontally” through accessible information sources, he noted, rather than “vertically,” from parent to offspring. “I would like to argue that competition in the human species has been replaced by collaboration. Despite having a fixed cognitive hardware, we have software upgrades that are enabled by culture, by the 20 years that our children spend in school to fill their brains with everything that humanity has learned, regardless of which family came up with it. This is the secret of our great acceleration” — the fact that human advancement in recent centuries has vastly out-clipped evolution’s sluggish pace.

    The next step, Kellis said, is to harness insights about evolution in order to combat an individual’s genetic susceptibility to disease. “Our current approach is simply insufficient,” he added. “We’re treating manifestations of disease, not the causes of disease.” A key element in his lab’s ambitious strategy to transform medicine is to identify “the causal pathways through which genetic predisposition manifests. It’s only by understanding these pathways that we can truly manipulate disease causation and reverse the disease circuitry.” 

    Kellis was followed by Aleksander Madry, MIT professor of electrical engineering and computer science and CSAIL principal investigator, who told the crowd, “progress in AI is happening, and it’s happening fast.” Computer programs can routinely beat humans in games like chess, poker, and Go. So should we be worried about AI surpassing humans? 

    Madry, for one, is not afraid — or at least not yet. And some of that reassurance stems from research that has led him to the following conclusion: Despite its considerable success, AI, especially in the form of machine learning, is lazy. “Think about being lazy as this kind of smart student who doesn’t really want to study for an exam. Instead, what he does is just study all the past years’ exams and just look for patterns. Instead of trying to actually learn, he just tries to pass the test. And this is exactly the same way in which current AI is lazy.”

    A machine-learning model might recognize grazing sheep, for instance, simply by picking out pictures that have green grass in them. If a model is trained to identify fish from photos of anglers proudly displaying their catches, Madry explained, “the model figures out that if there’s a human holding something in the picture, I will just classify it as a fish.” The consequences can be more serious for an AI model intended to pick out malignant tumors. If the model is trained on images containing rulers that indicate the size of tumors, the model may end up selecting only those photos that have rulers in them.

    This leads to Madry’s biggest concerns about AI in its present form. “AI is beating us now,” he noted. “But the way it does it [involves] a little bit of cheating.” He fears that we will apply AI “in some way in which this mismatch between what the model actually does versus what we think it does will have some catastrophic consequences.” People relying on AI, especially in potentially life-or-death situations, need to be much more mindful of its current limitations, Madry cautioned.

    There were 10 speakers altogether, and the last to take the stage was MIT associate professor of electrical engineering and computer science and CSAIL principal investigator Marzyeh Ghassemi, who laid out her vision for how AI could best contribute to general health and well-being. But in order for that to happen, its models must be trained on accurate, diverse, and unbiased medical data.

    It’s important to focus on the data, Ghassemi stressed, because these models are learning from us. “Since our data is human-generated … a neural network is learning how to practice from a doctor. But doctors are human, and humans make mistakes. And if a human makes a mistake, and we train an AI from that, the AI will, too. Garbage in, garbage out. But it’s not like the garbage is distributed equally.”

    She pointed out that many subgroups receive worse care from medical practitioners, and members of these subgroups die from certain conditions at disproportionately high rates. This is an area, Ghassemi said, “where AI can actually help. This is something we can fix.” Her group is developing machine-learning models that are robust, private, and fair. What’s holding them back is neither algorithms nor GPUs. It’s data. Once we collect reliable data from diverse sources, Ghassemi added, we might start reaping the benefits that AI can bring to the realm of health care.

    In addition to CSAIL speakers, there were talks from members across MIT’s Institute for Data, Systems, and Society; the MIT Mobility Initiative; the MIT Media Lab; and the SENSEable City Lab.

    The proceedings concluded on that hopeful note. Rus and Werner then thanked everyone for coming. “Please continue to reflect about the good and bad of computing,” Rus urged. “And we look forward to seeing you back here in May for the next TEDxMIT event.”

    The exact theme of the spring 2022 gathering will have something to do with “superpowers.” But — if December’s mind-bending presentations were any indication — the May offering is almost certain to give its attendees plenty to think about. And maybe provide the inspiration for a startup or two. More