More stories

  • Devavrat Shah appointed faculty director of the Deshpande Center

    Devavrat Shah, the Andrew (1956) and Erna Viterbi Professor in the Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society, has been named faculty director of the MIT Deshpande Center for Technological Innovation. The new role took effect on Feb. 1.

    Shah replaces Tim Swager, the John D. MacArthur Professor of Chemistry, who has held the position of faculty director since 2014. Working alongside the Deshpande Center’s executive director, Leon Sandler, Swager helped the center build an inclusive environment where innovation and entrepreneurship could thrive. By examining new models for directing, seeding, and fostering the commercialization of inventions and technology, Swager helped students and faculty breathe life into research, propelling it out of the lab and into the world as successful ventures.

    The MIT Deshpande Center for Technological Innovation is an interdepartmental center working to empower MIT’s most talented students and faculty by helping them bring innovative new technologies from the lab to the marketplace in the form of breakthrough products and new companies. Desh Deshpande founded the center with his wife in 2002.

    “Professor Shah’s deep entrepreneurial experience coupled with his research on large complex networks will be tremendous assets to the center,” says Deshpande. “Devavrat is an impactful educator and inspiring mentor who will play a key role in the center’s mission to foster innovation and accelerate the impact of new discoveries.”

    Shah joined the Department of Electrical Engineering and Computer Science in 2005. His research focuses on statistical inference and stochastic networks, with contributions spanning a variety of areas, including resource allocation in communications networks, inference and learning on graphical models, algorithms for social data processing such as ranking, recommendations, and crowdsourcing, and, more recently, causal inference using observational and experimental data.

    While Shah’s work spans a range of areas across electrical engineering, computer science, and operations research, it is tied together by a singular focus on developing algorithmic solutions for practical, challenging problems. He has also authored two books: one on gossip algorithms in 2006 and another on nearest-neighbor prediction methods in 2018.

    A highly regarded teacher, Shah has been very active in curriculum development — most notably class 6.438 (Algorithms for Inference) and class 6.401 (Introduction to Statistical Data Analysis) — and has taken a leading role in developing educational programs in statistics and data science at MIT as part of the Statistics and Data Science Center within the Institute for Data, Systems, and Society.

    “With his experience and contributions as a researcher, educator, and innovator, I have no doubt that Devavrat will excel as the next faculty director of the Deshpande Center and help usher in the next era of innovation for MIT,” says Anantha P. Chandrakasan, dean of the School of Engineering and Vannevar Bush Professor of Electrical Engineering and Computer Science. “I am grateful to Tim for the tremendous work he has done during his eight years as faculty director of the Deshpande Center. His commitment to building an inclusive environment for innovation and entrepreneurship to thrive was particularly impressive.” 

    A practiced entrepreneur, Shah co-founded Celect, Inc. — now part of Nike — in 2013 to help retailers accurately predict demand using omnichannel data. In 2019, he helped start IkigaiLabs, where he serves as CTO, with the mission of building self-driving organizations by enabling data-driven operations with a human in the loop and the ease of a spreadsheet.

    Among his many achievements and accolades, Shah was named a Kavli Fellow of the National Academy of Sciences in 2014 and was recently named an Institute of Electrical and Electronics Engineers (IEEE) Fellow for 2022. He has also received a number of paper awards from the INFORMS Applied Probability Society, INFORMS Management Science and Operations Management, NeurIPS, ACM Sigmetrics, and IEEE Infocom. His career prizes include the Erlang Prize from the INFORMS Applied Probability Society and the Rising Star Award from ACM Sigmetrics. Shah has also received multiple Test of Time paper awards from ACM Sigmetrics and is recognized as a distinguished alumnus of his alma mater, the Indian Institute of Technology Bombay.

    “The Deshpande Center thanks Tim for his years of service as faculty director,” says the center’s executive director, Leon Sandler. “Tim’s commitment to innovation played an integral role in our success, and the center’s programs have thrived under his leadership. I look forward to working with Devavrat in the continuing effort to fulfill the mission of our center.”

    As part of his new post, Shah will work closely with Sandler, who has held the executive director position at the Deshpande Center since 2006.

  • Unlocking new doors to artificial intelligence

    Artificial intelligence research is constantly developing new hypotheses that have the potential to benefit society and industry; however, sometimes these benefits are not fully realized due to a lack of engineering tools. To help bridge this gap, graduate students in the MIT Department of Electrical Engineering and Computer Science’s 6-A Master of Engineering (MEng) Thesis Program work with some of the most innovative companies in the world and collaborate on cutting-edge projects, while contributing to and completing their MEng thesis.

    During a portion of the last year, four 6-A MEng students teamed up and completed an internship with IBM Research’s advanced prototyping team through the MIT-IBM Watson AI Lab, working on AI projects and often developing web applications to solve real-world issues or business use cases. The students worked alongside AI engineers, user experience engineers, full-stack researchers, and generalists to accommodate project requests and receive thesis advice, says Lee Martie, IBM research staff member and 6-A manager. The students’ projects ranged from generating synthetic data for privacy-sensitive data analysis to using computer vision to identify actions in video, allowing them to monitor human safety and track build progress on a construction site.

    “I appreciated all of the expertise from the team and the feedback,” says 6-A graduate Violetta Jusiega ’21, who participated in the program. “I think that working in industry gives the lens of making sure that the project’s needs are satisfied and [provides the opportunity] to ground research and make sure that it is helpful for some use case in the future.”

    Jusiega’s research intersected the fields of computer vision and design, focusing on data visualization and user interfaces for the medical field. Working with IBM, she built an application programming interface (API) that let clinicians interact with a medical treatment strategy AI model, which was deployed in the cloud. Her interface provided a medical decision tree, as well as some prescribed treatment plans. After receiving feedback on her design from physicians at a local hospital, Jusiega developed iterations of the API and of how the results were displayed visually, so that it would be user-friendly and understandable for clinicians, who don’t usually code. She says that “these tools are often not acquired into the field because they lack some of these API principles which become more important in an industry where everything is already very fast paced, so there’s little time to incorporate a new technology.” But this project might eventually allow for industry deployment. “I think this application has a bunch of potential, whether it does get picked up by clinicians or whether it’s simply used in research. It’s very promising and very exciting to see how technology can help us modify, or improve, the health-care field to be even more custom-tailored towards patients and giving them the best care possible,” she says.

    Another 6-A graduate student, Spencer Compton, was also looking to help professionals make more informed decisions, in settings including health care, but he was tackling the problem from a causal perspective. When given a set of related variables, Compton was investigating whether there was a way to determine not just correlation, but the cause-and-effect relationship between them (the direction of the interaction), from the data alone. For this, he and his collaborators from IBM Research and Purdue University turned to a field of math called information theory. With the goal of designing an algorithm to learn complex networks of causal relationships, Compton used ideas relating to entropy, the randomness in a system, to help determine if a causal relationship is present and how variables might be interacting. “When judging an explanation, people often default to Occam’s razor,” says Compton. “We’re more inclined to believe a simpler explanation than a more complex one.” In many cases, he says, the approach performed well; for instance, they were able to consider variables such as lung cancer, pollution, and X-ray findings. He was pleased that his research allowed him to help create a framework of “entropic causal inference” that could aid safe and smart decision-making in the future. “The math is really surprisingly deep, interesting, and complex,” says Compton. “We’re basically asking, ‘when is the simplest explanation correct?’ but as a math question.”
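
    As a toy illustration of that style of reasoning (and not the algorithm Compton and his collaborators published), one can ask, for two discrete variables, how much extra randomness a hidden noise term would need in order to explain one variable as a function of the other, and prefer the direction that needs less. The sketch below uses a simple greedy coupling heuristic; the function names and the example table are made up.

```python
# Toy sketch, not the published algorithm: for two discrete variables, approximate
# how much "noise entropy" is needed to explain Y as a function of X (and vice
# versa), and prefer the direction that needs less randomness.
import numpy as np

def greedy_noise_entropy(conditionals):
    """Greedily build a low-entropy noise distribution able to generate every
    conditional distribution in `conditionals` (each row sums to 1)."""
    remaining = [row.astype(float).copy() for row in conditionals]
    atoms = []
    while max(row.sum() for row in remaining) > 1e-9:
        r = min(row.max() for row in remaining)   # mass every conditional can still supply
        atoms.append(r)
        for row in remaining:
            row[row.argmax()] -= r                # consume that mass from each conditional
    p = np.array(atoms)
    p = p[p > 1e-12] / sum(atoms)
    return float(-(p * np.log2(p)).sum())

def direction_scores(joint):
    """Return (noise entropy assuming X->Y, noise entropy assuming Y->X)."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    cond_y_given_x = joint / np.where(px > 0, px, 1)       # rows are P(Y | X=x)
    cond_x_given_y = (joint / np.where(py > 0, py, 1)).T   # rows are P(X | Y=y)
    return greedy_noise_entropy(cond_y_given_x), greedy_noise_entropy(cond_x_given_y)

# Example: X is uniform over four values and Y = X mod 2, so explaining Y from X
# needs no extra randomness, while explaining X from Y needs a full coin flip.
joint = np.array([[0.25, 0.00],
                  [0.00, 0.25],
                  [0.25, 0.00],
                  [0.00, 0.25]])
xy, yx = direction_scores(joint)
print(f"noise entropy assuming X->Y: {xy:.3f} bits; assuming Y->X: {yx:.3f} bits")
```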

    Determining relationships within data can sometimes require large volumes of it to suss out patterns, but for data that may contain sensitive information, those volumes may not be available. For her master’s work, Ivy Huang worked with IBM Research to generate synthetic tabular data using a natural language processing tool called a transformer model, which can learn and predict future values from past values. Trained on real data, the model can produce new data with similar patterns, properties, and relationships, without the restrictions on privacy, availability, and access that might come with real data from financial transactions and electronic medical records. Further, she created an API and deployed the model in an IBM cluster, which gave users greater access to the model and the ability to query it without compromising the original data.
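
    A minimal sketch of the general idea, assuming a toy table of all-categorical columns (this is not the model Huang built with IBM): each row is treated as a short token sequence, a small autoregressive transformer learns to predict the next column’s value from the previous ones, and brand-new synthetic rows are then sampled column by column.

```python
# Minimal sketch of the general approach, not the model built in this project.
import torch
import torch.nn as nn

N_COLS, VOCAB, EMB = 4, 10, 32      # 4 categorical columns, 10 codes per column

class RowTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok = nn.Embedding(VOCAB, EMB)
        self.pos = nn.Embedding(N_COLS, EMB)
        layer = nn.TransformerEncoderLayer(d_model=EMB, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(EMB, VOCAB)

    def forward(self, x):                        # x: (batch, n_cols_so_far) integer codes
        n = x.size(1)
        h = self.tok(x) + self.pos(torch.arange(n, device=x.device))
        causal = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
        return self.head(self.encoder(h, mask=causal))    # (batch, n, VOCAB) logits

model = RowTransformer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
real = torch.randint(0, VOCAB, (256, N_COLS))    # stand-in for encoded real records

for _ in range(50):                              # train: predict column t+1 from columns <= t
    logits = model(real[:, :-1])
    loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), real[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Sample synthetic rows column by column (in practice the first column would be
# drawn from its real marginal, or a start token would be used).
rows = torch.randint(0, VOCAB, (8, 1))
for _ in range(N_COLS - 1):
    probs = model(rows)[:, -1].softmax(dim=-1)
    rows = torch.cat([rows, torch.multinomial(probs, 1)], dim=1)
print(rows)                                      # 8 synthetic rows, same shape as real ones
```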

    Working with the advanced prototyping team, MEng candidate Brandon Perez also considered how to gather and investigate data with restrictions, but in his case it was to use computer vision frameworks, centered on an action recognition model, to identify construction site happenings. The team based their work on the Moments in Time dataset, which contains over a million three-second video clips with about 300 attached classification labels and has performed well during AI training. However, the group needed more construction-based video data, so they turned to YouTube-8M. Perez built a framework for testing and fine-tuning existing object detection models and action recognition models that could plug into an automatic spatial and temporal localization tool — how they would identify and label particular actions in a video timeline. “I was satisfied that I was able to explore what made me curious, and I was grateful for the autonomy that I was given with this project,” says Perez. “I felt like I was always supported, and my mentor was a great support to the project.”

    “The kind of collaborations that we have seen between our MEng students and IBM researchers are exactly what the 6-A MEng Thesis program at MIT is all about,” says Tomas Palacios, professor of electrical engineering and faculty director of the MIT 6-A MEng Thesis program. “For more than 100 years, 6-A has been connecting MIT students with industry to solve together some of the most important problems in the world.”

  • Injecting fairness into machine-learning models

    If a machine-learning model is trained using an unbalanced dataset, such as one that contains far more images of people with lighter skin than people with darker skin, there is serious risk the model’s predictions will be unfair when it is deployed in the real world.

    But this is only one part of the problem. MIT researchers have found that machine-learning models that are popular for image recognition tasks actually encode bias when trained on unbalanced data. This bias within the model is impossible to fix later on, even with state-of-the-art fairness-boosting techniques, and even when retraining the model with a balanced dataset.      

    So, the researchers came up with a technique to introduce fairness directly into the model’s internal representation itself. This enables the model to produce fair outputs even if it is trained on unfair data, which is especially important because there are very few well-balanced datasets for machine learning.

    The solution they developed not only leads to models that make more balanced predictions, but also improves their performance on downstream tasks like facial recognition and animal species classification.

    “In machine learning, it is common to blame the data for bias in models. But we don’t always have balanced data. So, we need to come up with methods that actually fix the problem with imbalanced data,” says lead author Natalie Dullerud, a graduate student in the Healthy ML Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT.

    Dullerud’s co-authors include Kimia Hamidieh, a graduate student in the Healthy ML Group; Karsten Roth, a former visiting researcher who is now a graduate student at the University of Tübingen; Nicolas Papernot, an assistant professor in the University of Toronto’s Department of Electrical Engineering and Computer Science; and senior author Marzyeh Ghassemi, an assistant professor and head of the Healthy ML Group. The research will be presented at the International Conference on Learning Representations.

    Defining fairness

    The machine-learning technique the researchers studied is known as deep metric learning, which is a broad form of representation learning. In deep metric learning, a neural network learns the similarity between objects by mapping similar photos close together and dissimilar photos far apart. During training, this neural network maps images in an “embedding space” where a similarity metric between photos corresponds to the distance between them.

    For example, if a deep metric learning model is being used to classify bird species, it will map photos of golden finches together in one part of the embedding space and cardinals together in another part of the embedding space. Once trained, the model can effectively measure the similarity of new images it hasn’t seen before. It would learn to cluster images of an unseen bird species close together, but farther from cardinals or golden finches within the embedding space.
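
    A minimal sketch of that training signal, assuming a toy PyTorch setup rather than the exact models used in the study: an embedding network is trained with a triplet loss so an anchor image lands close to another image of the same class and far from an image of a different class, and similarity at test time is simply the distance between embeddings.

```python
# Minimal sketch of the deep metric learning training signal (illustrative only).
import torch
import torch.nn as nn

embed = nn.Sequential(                 # stand-in for a CNN backbone producing 64-d embeddings
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 256), nn.ReLU(),
    nn.Linear(256, 64),
)
triplet = nn.TripletMarginLoss(margin=1.0)
opt = torch.optim.Adam(embed.parameters(), lr=1e-3)

# anchor and positive come from the same class, negative from a different one
anchor = torch.randn(16, 3, 32, 32)    # random stand-ins for batches of images
positive = torch.randn(16, 3, 32, 32)
negative = torch.randn(16, 3, 32, 32)

loss = triplet(embed(anchor), embed(positive), embed(negative))
opt.zero_grad()
loss.backward()
opt.step()

# After training, "similarity" between two new images is just the distance between
# their embeddings: a small distance means the model considers them alike.
distance = torch.dist(embed(anchor[:1]), embed(positive[:1]))
print(distance.item())
```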

    The similarity metrics the model learns are very robust, which is why deep metric learning is so often employed for facial recognition, Dullerud says. But she and her colleagues wondered how to determine if a similarity metric is biased.

    “We know that data reflect the biases of processes in society. This means we have to shift our focus to designing methods that are better suited to reality,” says Ghassemi.

    The researchers defined two ways a similarity metric can be unfair. Using the example of facial recognition, the metric is unfair, first, if it is more likely to embed individuals with darker-skinned faces closer to each other, even if they are not the same person, than it would if those images were of people with lighter-skinned faces. Second, it is unfair if the features it learns for measuring similarity are better for the majority group than for the minority group.

    The researchers ran a number of experiments on models with unfair similarity metrics and were unable to overcome the bias the model had learned in its embedding space.

    “This is quite scary because it is a very common practice for companies to release these embedding models and then people finetune them for some downstream classification task. But no matter what you do downstream, you simply can’t fix the fairness problems that were induced in the embedding space,” Dullerud says.

    Even if a user retrains the model on a balanced dataset for the downstream task, which is the best-case scenario for fixing the fairness problem, there are still performance gaps of at least 20 percent, she says.

    The only way to solve this problem is to ensure the embedding space is fair to begin with.

    Learning separate metrics

    The researchers’ solution, called Partial Attribute Decorrelation (PARADE), involves training the model to learn a separate similarity metric for a sensitive attribute, like skin tone, and then decorrelating the skin tone similarity metric from the targeted similarity metric. If the model is learning the similarity metrics of different human faces, it will learn to map similar faces close together and dissimilar faces far apart using features other than skin tone.

    Any number of sensitive attributes can be decorrelated from the targeted similarity metric in this way. And because the similarity metric for the sensitive attribute is learned in a separate embedding space, it is discarded after training so only the targeted similarity metric remains in the model.
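
    A rough sketch of this decorrelation idea is below, assuming a toy PyTorch setup; the published PARADE objective differs in its details, and the penalty shown is just one simple way to push two embeddings toward being uncorrelated. As described above, the sensitive-attribute head is discarded after training.

```python
# Rough sketch of the decorrelation idea (the published PARADE objective differs in
# detail): one embedding head is trained for the target similarity metric, a second
# head for the sensitive attribute, and a penalty discourages correlation between
# the two. Only the target head is kept after training.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU())
target_head = nn.Linear(256, 64)    # embedding used for the task (kept)
attr_head = nn.Linear(256, 64)      # embedding for the sensitive attribute (discarded later)

def cross_correlation_penalty(a, b):
    """Mean squared cross-correlation between two batches of embeddings."""
    a = (a - a.mean(0)) / (a.std(0) + 1e-6)
    b = (b - b.mean(0)) / (b.std(0) + 1e-6)
    return (a.T @ b / a.size(0)).pow(2).mean()

triplet = nn.TripletMarginLoss(margin=1.0)
params = list(backbone.parameters()) + list(target_head.parameters()) + list(attr_head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

anchor, positive, negative = (torch.randn(32, 3, 32, 32) for _ in range(3))

feats = backbone(anchor)
task_loss = triplet(target_head(feats),
                    target_head(backbone(positive)),
                    target_head(backbone(negative)))
# The attribute head would also be trained on its own similarity labels (e.g., skin
# tone); only the decorrelation term that pushes the two embeddings apart is shown.
decorrelation = cross_correlation_penalty(target_head(feats), attr_head(feats))

loss = task_loss + 1.0 * decorrelation   # the weight controls how much is decorrelated
opt.zero_grad()
loss.backward()
opt.step()
```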

    Their method is applicable to many situations because the user can control the amount of decorrelation between similarity metrics. For instance, if the model will be diagnosing breast cancer from mammogram images, a clinician likely wants some information about biological sex to remain in the final embedding space because it is much more likely that women will have breast cancer than men, Dullerud explains.

    They tested their method on two tasks, facial recognition and classifying bird species, and found that it reduced performance gaps caused by bias, both in the embedding space and in the downstream task, regardless of the dataset they used.

    Moving forward, Dullerud is interested in studying how to force a deep metric learning model to learn good features in the first place.

    “How do you properly audit fairness? That is an open question right now. How can you tell that a model is going to be fair, or that it is only going to be fair in certain situations, and what are those situations? Those are questions I am really interested in moving forward,” she says.

  • Deep-learning technique predicts clinical treatment outcomes

    When it comes to treatment strategies for critically ill patients, clinicians want to be able to consider all of their options and the timing of administration, and make the optimal decision for each patient. While clinician experience and study have helped them be successful in this effort, not all patients are the same, and treatment decisions at this crucial time could mean the difference between patient improvement and quick deterioration. Therefore, it would be helpful for doctors to be able to take a patient’s previous known health status and received treatments and use them to predict that patient’s health outcome under different treatment scenarios, in order to pick the best path.

    Now, a deep-learning technique, called G-Net, from researchers at MIT and IBM provides a window into causal counterfactual prediction, affording physicians the opportunity to explore how a patient might fare under different treatment plans. The foundation of G-Net is the g-computation algorithm, a causal inference method that estimates the effect of dynamic exposures in the presence of measured confounding variables — ones that may influence both treatments and outcomes. Unlike previous implementations of the g-computation framework, which have used linear modeling approaches, G-Net uses recurrent neural networks (RNN), which have node connections that allow them to better model temporal sequences with complex and nonlinear dynamics, like those found in the physiological and clinical time series data. In this way, physicians can develop alternative plans based on patient history and test them before making a decision.
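
    A simplified sketch of the simulation step behind this kind of g-computation, assuming a toy PyTorch setup rather than the published G-Net architecture: a recurrent network predicts the next covariates from the history of covariates and treatments, and a counterfactual treatment strategy is then rolled forward many times by feeding the predictions back in and averaging the simulated trajectories. All names and dimensions below are made up.

```python
# Simplified sketch of g-computation with a recurrent model (not the published
# G-Net architecture): predict next covariates from history, then roll a new
# treatment strategy forward via Monte Carlo simulation.
import torch
import torch.nn as nn

N_COV, N_TREAT, HIDDEN = 8, 2, 32

class CovariatePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(N_COV + N_TREAT, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, N_COV)      # mean of next covariates (continuous case)

    def forward(self, cov, treat, state=None):
        h, state = self.rnn(torch.cat([cov, treat], dim=-1), state)
        return self.out(h), state

def counterfactual_rollout(model, cov_history, treat_history, policy, horizon, n_samples=100):
    """Monte Carlo rollout: continue each observed history under a new treatment `policy`."""
    trajectories = []
    for _ in range(n_samples):
        _, state = model(cov_history, treat_history)      # encode the observed history
        cov = cov_history[:, -1:, :]
        sims = []
        for _ in range(horizon):
            treat = policy(cov)                           # counterfactual treatment rule
            mean, state = model(cov, treat, state)
            cov = mean + 0.1 * torch.randn_like(mean)     # sample around the prediction
            sims.append(cov)
        trajectories.append(torch.cat(sims, dim=1))
    return torch.stack(trajectories).mean(0)              # expected future covariates

model = CovariatePredictor()                              # would be trained on observed ICU data
history_cov = torch.randn(1, 33, N_COV)                   # 33 observed timesteps
history_treat = torch.zeros(1, 33, N_TREAT)               # e.g., regime "A" so far
plan_b = lambda cov: torch.tensor([[[0.0, 1.0]]])         # switch to a different treatment
expected = counterfactual_rollout(model, history_cov, history_treat, plan_b, horizon=33)
print(expected.shape)                                     # (1, 33, N_COV)
```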

    “Our ultimate goal is to develop a machine learning technique that would allow doctors to explore various ‘What if’ scenarios and treatment options,” says Li-wei Lehman, MIT research scientist in the MIT Institute for Medical Engineering and Science and an MIT-IBM Watson AI Lab project lead. “A lot of work has been done in terms of deep learning for counterfactual prediction but [it’s] been focusing on a point exposure setting,” or a static, time-varying treatment strategy, which doesn’t allow for adjustment of treatments as patient history changes. However, her team’s new prediction approach provides for treatment plan flexibility and chances for treatment alteration over time as patient covariate history and past treatments change. “G-Net is the first deep-learning approach based on g-computation that can predict both the population-level and individual-level treatment effects under dynamic and time varying treatment strategies.”

    The research, which was recently published in the Proceedings of Machine Learning Research, was co-authored by Rui Li MEng ’20, Stephanie Hu MEng ’21, former MIT postdoc Mingyu Lu MD, graduate student Yuria Utsumi, IBM research staff member Prithwish Chakraborty, IBM Research director of Hybrid Cloud Services Daby Sow, IBM data scientist Piyush Madan, IBM research scientist Mohamed Ghalwash, and IBM research scientist Zach Shahn.

    Tracking disease progression

    To build, validate, and test G-Net’s predictive abilities, the researchers considered the circulatory system in septic patients in the ICU. During critical care, doctors need to make trade-offs and judgement calls, such as ensuring the organs are receiving adequate blood supply without overworking the heart. For this, they could give intravenous fluids to patients to increase blood pressure; however, too much can cause edema. Alternatively, physicians can administer vasopressors, which act to contract blood vessels and raise blood pressure.

    In order to mimic this and demonstrate G-Net’s proof-of-concept, the team used CVSim, a mechanistic model of a human cardiovascular system that’s governed by 28 input variables characterizing the system’s current state, such as arterial pressure, central venous pressure, total blood volume, and total peripheral resistance, and modified it to simulate various disease processes (e.g., sepsis or blood loss) and effects of interventions (e.g., fluids and vasopressors). The researchers used CVSim to generate observational patient data for training and for “ground truth” comparison against counterfactual prediction. In their G-Net architecture, the researchers ran two RNNs to handle and predict variables that are continuous, meaning they can take on a range of values, like blood pressure, and categorical variables, which have discrete values, like the presence or absence of pulmonary edema. The researchers simulated the health trajectories of thousands of “patients” exhibiting symptoms under one treatment regime, let’s say A, for 66 timesteps, and used them to train and validate their model.

    Testing G-Net’s prediction capability, the team generated two counterfactual datasets. Each contained roughly 1,000 known patient health trajectories, which were created from CVSim using the same “patient” condition as the starting point under treatment A. Then at timestep 33, treatment changed to plan B or C, depending on the dataset. The team then performed 100 prediction trajectories for each of these 1,000 patients, whose treatment and medical history was known up until timestep 33 when a new treatment was administered. In these cases, the prediction agreed well with the “ground-truth” observations for individual patients and averaged population-level trajectories.

    A cut above the rest

    Since the g-computation framework is flexible, the researchers wanted to examine G-Net’s prediction using different nonlinear models — in this case, long short-term memory (LSTM) models, which are a type of RNN that can learn from previous data patterns or sequences — against more classical linear models and a multilayer perceptron (MLP), a type of neural network that can make predictions using a nonlinear approach. Following a similar setup as before, the team found that the error between the known and predicted cases was smallest in the LSTM models compared to the others. Since G-Net is able to model the temporal patterns of the patient’s ICU history and past treatment, whereas a linear model and MLP cannot, it was better able to predict the patient’s outcome.

    The team also compared G-Net’s prediction in a static, time-varying treatment setting against two state-of-the-art deep-learning based counterfactual prediction approaches, a recurrent marginal structural network (rMSN) and a counterfactual recurrent neural network (CRN), as well as a linear model and an MLP. For this, they investigated a model for tumor growth under no treatment, radiation, chemotherapy, and both radiation and chemotherapy scenarios. “Imagine a scenario where there’s a patient with cancer, and an example of a static regime would be if you only give a fixed dosage of chemotherapy, radiation, or any kind of drug, and wait until the end of your trajectory,” comments Lu. For these investigations, the researchers generated simulated observational data using tumor volume as the primary influence dictating treatment plans and demonstrated that G-Net outperformed the other models. One potential reason is that g-computation is known to be more statistically efficient than rMSN and CRN when models are correctly specified.

    While G-Net has done well with simulated data, more needs to be done before it can be applied to real patients. Since neural networks can be thought of as “black boxes” for prediction results, the researchers are beginning to investigate the uncertainty in the model to help ensure safety. In contrast to approaches that recommend an “optimal” treatment plan without any clinician involvement, “as a decision support tool, I believe that G-Net would be more interpretable, since the clinicians would input treatment strategies themselves,” says Lehman, and “G-Net will allow them to be able to explore different hypotheses.” Further, the team has moved on to using real data from ICU patients with sepsis, bringing it one step closer to implementation in hospitals.

    “I think it is pretty important and exciting for real-world applications,” says Hu. “It’d be helpful to have some way to predict whether or not a treatment might work or what the effects might be — a quicker iteration process for developing these hypotheses for what to try, before actually trying to implement them in a years-long, potentially very involved and very invasive type of clinical trial.”

    This research was funded by the MIT-IBM Watson AI Lab.

  • Can machine-learning models overcome biased datasets?

    Artificial intelligence systems may be able to complete tasks quickly, but that doesn’t mean they always do so fairly. If the datasets used to train machine-learning models contain biased data, it is likely the system could exhibit that same bias when it makes decisions in practice.

    For instance, if a dataset contains mostly images of white men, then a facial-recognition model trained with these data may be less accurate for women or people with different skin tones.

    A group of researchers at MIT, in collaboration with researchers at Harvard University and Fujitsu Ltd., sought to understand when and how a machine-learning model is capable of overcoming this kind of dataset bias. They used an approach from neuroscience to study how training data affects whether an artificial neural network can learn to recognize objects it has not seen before. A neural network is a machine-learning model that mimics the human brain in the way it contains layers of interconnected nodes, or “neurons,” that process data.

    The new results show that diversity in training data has a major influence on whether a neural network is able to overcome bias, but at the same time dataset diversity can degrade the network’s performance. They also show that how a neural network is trained, and the specific types of neurons that emerge during the training process, can play a major role in whether it is able to overcome a biased dataset.

    “A neural network can overcome dataset bias, which is encouraging. But the main takeaway here is that we need to take into account data diversity. We need to stop thinking that if you just collect a ton of raw data, that is going to get you somewhere. We need to be very careful about how we design datasets in the first place,” says Xavier Boix, a research scientist in the Department of Brain and Cognitive Sciences (BCS) and the Center for Brains, Minds, and Machines (CBMM), and senior author of the paper.  

    Co-authors include former MIT graduate students Timothy Henry, Jamell Dozier, Helen Ho, Nishchal Bhandari, and Spandan Madan, a corresponding author who is currently pursuing a PhD at Harvard; Tomotake Sasaki, a former visiting scientist now a senior researcher at Fujitsu Research; Frédo Durand, a professor of electrical engineering and computer science at MIT and a member of the Computer Science and Artificial Intelligence Laboratory; and Hanspeter Pfister, the An Wang Professor of Computer Science at the Harvard School of Engineering and Applied Sciences. The research appears today in Nature Machine Intelligence.

    Thinking like a neuroscientist

    Boix and his colleagues approached the problem of dataset bias by thinking like neuroscientists. In neuroscience, Boix explains, it is common to use controlled datasets in experiments, meaning a dataset in which the researchers know as much as possible about the information it contains.

    The team built datasets that contained images of different objects in varied poses, and carefully controlled the combinations so some datasets had more diversity than others. In this case, a dataset had less diversity if it contained more images that showed objects from only one viewpoint. A more diverse dataset had more images showing objects from multiple viewpoints. Each dataset contained the same number of images.

    The researchers used these carefully constructed datasets to train a neural network for image classification, and then studied how well it was able to identify objects from viewpoints the network did not see during training (known as an out-of-distribution combination). 

    For example, if researchers are training a model to classify cars in images, they want the model to learn what different cars look like. But if every Ford Thunderbird in the training dataset is shown from the front, when the trained model is given an image of a Ford Thunderbird shot from the side, it may misclassify it, even if it was trained on millions of car photos.
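
    A sketch of how such a controlled split can be set up (an illustration, not the paper’s exact protocol): certain category-viewpoint combinations are withheld from training, and the model is scored only on those held-out combinations. The names and the hypothetical `model` callable below are made up.

```python
# Illustrative sketch of a controlled category-viewpoint split for testing
# out-of-distribution generalization (not the paper's exact protocol).
from collections import namedtuple

Example = namedtuple("Example", ["image", "category", "viewpoint"])

categories = ["sedan", "truck", "bus"]
viewpoints = ["front", "side", "rear", "top"]

# Toy dataset: one placeholder "image" per (category, viewpoint, index)
dataset = [Example(f"img_{c}_{v}_{i}", c, v)
           for c in categories for v in viewpoints for i in range(50)]

# Hold out specific combinations, e.g., trucks are never seen from the side in training
held_out = {("truck", "side"), ("bus", "rear")}
train = [ex for ex in dataset if (ex.category, ex.viewpoint) not in held_out]
ood_test = [ex for ex in dataset if (ex.category, ex.viewpoint) in held_out]
print(len(train), "training examples,", len(ood_test), "out-of-distribution test examples")

def evaluate(model, examples):
    """Category accuracy on combinations the model never saw during training.
    `model` is a hypothetical classifier taking an image and returning a category."""
    correct = sum(model(ex.image) == ex.category for ex in examples)
    return correct / len(examples)

# In the actual experiments, diversity is varied (fewer or more held-out viewpoints)
# while the total number of training images is kept fixed, and
# evaluate(trained_model, ood_test) measures how well the network generalizes.
```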

    The researchers found that if the dataset is more diverse — if more images show objects from different viewpoints — the network is better able to generalize to new images or viewpoints. Data diversity is key to overcoming bias, Boix says.

    “But it is not like more data diversity is always better; there is a tension here. When the neural network gets better at recognizing new things it hasn’t seen, then it will become harder for it to recognize things it has already seen,” he says.

    Testing training methods

    The researchers also studied methods for training the neural network.

    In machine learning, it is common to train a network to perform multiple tasks at the same time. The idea is that if a relationship exists between the tasks, the network will learn to perform each one better if it learns them together.

    But the researchers found the opposite to be true — a model trained separately for each task was able to overcome bias far better than a model trained for both tasks together.
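
    The comparison being described can be sketched roughly as follows, assuming a toy PyTorch setup that differs from the paper’s actual models: one network trained jointly on both tasks through two heads on a shared backbone, versus two independent networks trained on one task each.

```python
# Hedged sketch of the comparison: joint (multitask) training with two heads versus
# two separately trained single-task networks. Details differ from the paper.
import torch
import torch.nn as nn

N_CATEGORIES, N_VIEWPOINTS = 10, 8

def backbone():
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU())

# (a) multitask: shared backbone, one head per task
shared = backbone()
category_head = nn.Linear(128, N_CATEGORIES)
viewpoint_head = nn.Linear(128, N_VIEWPOINTS)

def multitask_loss(x, cat_labels, view_labels):
    feats = shared(x)
    return (nn.functional.cross_entropy(category_head(feats), cat_labels)
            + nn.functional.cross_entropy(viewpoint_head(feats), view_labels))

# (b) separate: an independent network for each task
category_net = nn.Sequential(backbone(), nn.Linear(128, N_CATEGORIES))
viewpoint_net = nn.Sequential(backbone(), nn.Linear(128, N_VIEWPOINTS))

x = torch.randn(32, 3, 32, 32)                       # stand-in batch of images
cat_labels = torch.randint(0, N_CATEGORIES, (32,))
view_labels = torch.randint(0, N_VIEWPOINTS, (32,))

joint_loss = multitask_loss(x, cat_labels, view_labels)
separate_loss = nn.functional.cross_entropy(category_net(x), cat_labels)
# The reported finding: networks like (b), trained on one task each, overcame
# dataset bias on held-out viewpoints better than the jointly trained network (a).
print(joint_loss.item(), separate_loss.item())
```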

    “The results were really striking. In fact, the first time we did this experiment, we thought it was a bug. It took us several weeks to realize it was a real result because it was so unexpected,” he says.

    They dove deeper inside the neural networks to understand why this occurs.

    They found that neuron specialization seems to play a major role. When the neural network is trained to recognize objects in images, it appears that two types of neurons emerge — one that specializes in recognizing the object category and another that specializes in recognizing the viewpoint.

    When the network is trained to perform tasks separately, those specialized neurons are more prominent, Boix explains. But if a network is trained to do both tasks simultaneously, some neurons become diluted and don’t specialize for one task. These unspecialized neurons are more likely to get confused, he says.

    “But the next question now is, how did these neurons get there? You train the neural network and they emerge from the learning process. No one told the network to include these types of neurons in its architecture. That is the fascinating thing,” he says.

    That is one area the researchers hope to explore with future work. They want to see if they can force a neural network to develop neurons with this specialization. They also want to apply their approach to more complex tasks, such as objects with complicated textures or varied illuminations.

    Boix is encouraged that a neural network can learn to overcome bias, and he is hopeful their work can inspire others to be more thoughtful about the datasets they are using in AI applications.

    This work was supported, in part, by the National Science Foundation, a Google Faculty Research Award, the Toyota Research Institute, the Center for Brains, Minds, and Machines, Fujitsu Research, and the MIT-Sensetime Alliance on Artificial Intelligence.

  • Professor Emery Brown has big plans for anesthesiology

    Emery N. Brown — the Edward Hood Taplin Professor of Medical Engineering and of Computational Neuroscience at MIT, an MIT professor of health sciences and technology, an investigator with The Picower Institute for Learning and Memory at MIT, and the Warren M. Zapol Professor of Anaesthesia at Harvard Medical School and Massachusetts General Hospital (MGH) — clearly excels at many roles. Renowned internationally for his anesthesia and neuroscience research, he embodies a unique blend of anesthesiologist, statistician, neuroscientist, educator, and mentor to both students and colleagues. Notably, Brown is one of the most decorated clinician-scientists in the country; he is one of only 25 people — and the first African-American, statistician, and anesthesiologist — to be elected to all three National Academies (Science, Engineering, and Medicine).

    Now, he is handing off one of his many key roles and responsibilities. After almost 10 years, Brown is stepping down as co-director of the Harvard-MIT Program in Health Sciences and Technology (HST). He will turn his energies toward working to develop a new joint center between MIT and MGH that uses the study of anesthesia to design novel approaches to controlling brain states. While a goal of the new center will be to improve anesthesia and intensive care unit management, according to Brown, it will also study related problems such as treating depression, insomnia, and epilepsy, as well as enhancing coma recovery.

    Founded in 1970, HST is one of the oldest interdisciplinary educational programs focused on training the next generation of clinician-scientists and engineers, who learn to translate science, engineering, and medical research into clinical practice, with the aim of improving human health. The MIT Institute for Medical Engineering and Science (IMES), where Brown is associate director, is HST’s home at MIT. Brown was the first HST co-director after the establishment of IMES in 2012; Wolfram Goessling is the Harvard University co-director of HST.

    “Emery has been an exemplary leader for HST during his tenure, and has helped it become a hub for the training of world-class scientists, engineers, and clinicians,” says Anantha Chandrakasan, dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “I am deeply grateful for his many years of service and wish him well as he moves on to new endeavors.”

    Elazer R. Edelman, director of IMES, calls Brown “a phenom who has been dedicated to our programs for years.”

    “With his thoughtful leadership and understated style, Emery made many contributions to the HST community,” Edelman continues. “On a personal note, this is bittersweet for me, as Emery has been a partner and mentor in my role as IMES director. And while I know that he will always be there for me, as he has been for all of us at IMES and HST, I will miss our late-night calls and midday conferences on matters of import for MIT, IMES, and HST.”

    Brown says “it was an honor and a privilege to co-direct HST with Wolfram.”

    “The students, staff, and faculty are simply amazing,” Brown continues. “Although now more than 50 years old, HST remains at the vanguard for training PhD and MD students to work at the intersection between engineering, science, and medicine.”

    Goessling also thanks Brown for his leadership: “I truly valued Emery’s partnership and friendship, working together to deepen ties between the MIT and Harvard sides of HST. I am particularly grateful for working with Emery on our combined diversity efforts, leading to the HST Diversity Ambassadors initiative that made HST a better and stronger program.”

    According to Edelman, Brown was instrumental in the transition to new paradigms and relationships with HMS in the context of IMES. In 2014, he led the establishment of clear criteria for HST faculty membership, thereby strengthening the community of faculty experts who train students and provide research opportunities. More recently, he provided guidance through the turmoil of the ongoing Covid-19 pandemic, including the transition to online instruction and the return to the classroom. And Brown has always been a strong supporter of student diversity efforts, serving as an advocate and advisor to HST students.

    Brown holds BA, MA, and PhD degrees from Harvard University, and an MD from Harvard Medical School. He has been recognized with many awards, including the 2020 Swartz Prize in Theoretical and Computational Neuroscience, the 2018 Dickson Prize in Science, and an NIH Director’s Pioneer Award. Brown also served on President Barack Obama’s BRAIN Initiative Working Group. Among his many accomplishments, he has been cited for developing neural signal processing algorithms to characterize how neural systems represent and transmit information, and for unlocking the neurophysiology of how anesthetics produce the states of general anesthesia.

    Edelman says the process is underway to name a successor to Brown as co-director of HST at MIT.

  • 3 Questions: What a single car can say about traffic

    Vehicle traffic has long defied description. Once measured roughly through visual inspection and traffic cameras, traffic is now quantified far more precisely by smartphone crowdsourcing tools. This popular method, however, also presents a problem: Accurate measurements require a lot of data and users.

    Meshkat Botshekan, an MIT PhD student in civil and environmental engineering and research assistant at the MIT Concrete Sustainability Hub, has sought to expand on crowdsourcing methods by looking into the physics of traffic. During his time as a doctoral candidate, he has helped develop Carbin, a smartphone-based roadway crowdsourcing tool created by MIT CSHub and the University of Massachusetts Dartmouth, and used its data to offer more insight into the physics of traffic — from the formation of traffic jams to the inference of traffic phase and driving behavior. Here, he explains how recent findings can allow smartphones to infer traffic properties from the measurements of a single vehicle.  

    Q: Numerous navigation apps already measure traffic. Why do we need alternatives?

    A: Traffic characteristics have always been tough to measure. In the past, visual inspection and cameras were used to produce traffic metrics. So, there’s no denying that today’s navigation apps offer a superior alternative. Yet even these modern tools have gaps.

    Chief among them is their dependence on spatially distributed user counts: Essentially, these apps tally up their users on road segments to estimate the density of traffic. While this approach may seem adequate, it is vulnerable to manipulation, as demonstrated in some viral videos, and requires immense quantities of data for reliable estimates. Processing these data is so time- and resource-intensive that, despite their availability, they can’t be used to quantify traffic effectively across a whole road network. As a result, this immense quantity of traffic data isn’t actually optimal for traffic management.

    Q: How could new technologies improve how we measure traffic?

    A: New alternatives have the potential to offer two improvements over existing methods: First, they can extrapolate far more about traffic with far fewer data. Second, they can cost a fraction of the price while offering a far simpler method of data collection. Just like Waze and Google Maps, they rely on crowdsourcing data from users. Yet, they are grounded in the incorporation of high-level statistical physics into data analysis.

    For instance, the Carbin app, which we are developing in collaboration with UMass Dartmouth, applies principles of statistical physics to existing traffic models to entirely forgo the need for user counts. Instead, it can infer traffic density and driver behavior using the input of a smartphone mounted in a single vehicle.

    The method at the heart of the app, which was published last fall in Physical Review E, treats vehicles like particles in a many-body system. Just as, by the ergodic theorem of statistical physics, the behavior of a closed many-body system can be understood by observing an individual particle, we can characterize traffic through the fluctuations in speed and position of a single vehicle along a road. As a result, we can infer the behavior and density of traffic on a segment of road.
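
    As a toy illustration of that time-average idea (not the published estimator), one can take a single probe vehicle’s speed record, average it over the trip, and map the mean speed to a density and flow through an assumed fundamental diagram; the free-flow speed and jam density below are made-up parameters.

```python
# Toy illustration, not the published estimator: one probe vehicle's speed record
# stands in for the whole traffic stream, and an assumed Greenshields fundamental
# diagram maps the mean speed to a density and flow.
import numpy as np

V_FREE = 25.0      # free-flow speed, m/s (assumed parameter)
K_JAM = 0.10       # jam density, vehicles per meter per lane (assumed parameter)

def traffic_from_probe(speeds_mps):
    """Estimate mean speed, density, and flow from one vehicle's speed time series."""
    v_bar = np.mean(speeds_mps)                  # time average over the probe's trip
    density = K_JAM * (1.0 - v_bar / V_FREE)     # invert Greenshields: v = v_f (1 - k / k_jam)
    flow = density * v_bar                       # vehicles per second past a point
    return v_bar, density, flow

# Simulated 5 minutes of 1 Hz smartphone speed samples in stop-and-go traffic
rng = np.random.default_rng(0)
speeds = np.clip(8.0 + 4.0 * np.sin(np.linspace(0, 20, 300)) + rng.normal(0, 1.5, 300), 0, None)

v_bar, k, q = traffic_from_probe(speeds)
print(f"mean speed {v_bar:.1f} m/s, density {k * 1000:.0f} veh/km, flow {q * 3600:.0f} veh/h")
```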

    Because far less data is required, this method is faster and makes data management more tractable. Most importantly, it also has the potential to make traffic data less expensive and more accessible to those who need it.

    Q: Who are some of the parties that would benefit from new technologies?

    A: More accessible and sophisticated traffic data would benefit more than just drivers seeking smoother, faster routes. It would also enable state and city departments of transportation (DOTs) to make local and collective interventions that advance the critical transportation objectives of equity, safety, and sustainability.

    As a safety solution, new data collection technologies could pinpoint dangerous driving conditions on a much finer scale to inform improved traffic calming measures. And since socially vulnerable communities experience traffic violence disproportionately, these interventions would have the added benefit of addressing pressing equity concerns. 

    There would also be an environmental benefit. DOTs could mitigate vehicle emissions by identifying minute deviations in traffic flow. This would present them with more opportunities to mitigate the idling and congestion that generate excess fuel consumption.  

    As we’ve seen, these three challenges have become increasingly acute, especially in urban areas. Yet, the data needed to address them exists already — and is being gathered by smartphones and telematics devices all over the world. So, to ensure a safer, more sustainable road network, it will be crucial to incorporate these data collection methods into our decision-making.

  • The downside of machine learning in health care

    While working toward her dissertation in computer science at MIT, Marzyeh Ghassemi wrote several papers on how machine-learning techniques from artificial intelligence could be applied to clinical data in order to predict patient outcomes. “It wasn’t until the end of my PhD work that one of my committee members asked: ‘Did you ever check to see how well your model worked across different groups of people?’”

    That question was eye-opening for Ghassemi, who had previously assessed the performance of models in aggregate, across all patients. Upon a closer look, she saw that models often worked differently — specifically worse — for populations including Black women, a revelation that took her by surprise. “I hadn’t made the connection beforehand that health disparities would translate directly to model disparities,” she says. “And given that I am a visible minority woman-identifying computer scientist at MIT, I am reasonably certain that many others weren’t aware of this either.”

    In a paper published Jan. 14 in the journal Patterns, Ghassemi — who earned her doctorate in 2017 and is now an assistant professor in the Department of Electrical Engineering and Computer Science and the MIT Institute for Medical Engineering and Science (IMES) — and her coauthor, Elaine Okanyene Nsoesie of Boston University, offer a cautionary note about the prospects for AI in medicine. “If used carefully, this technology could improve performance in health care and potentially reduce inequities,” Ghassemi says. “But if we’re not actually careful, technology could worsen care.”

    It all comes down to data, given that the AI tools in question train themselves by processing and analyzing vast quantities of data. But the data they are given are produced by humans, who are fallible and whose judgments may be clouded by the fact that they interact differently with patients depending on their age, gender, and race, without even knowing it.

    Furthermore, there is still great uncertainty about medical conditions themselves. “Doctors trained at the same medical school for 10 years can, and often do, disagree about a patient’s diagnosis,” Ghassemi says. That’s different from the applications where existing machine-learning algorithms excel — like object-recognition tasks — because practically everyone in the world will agree that a dog is, in fact, a dog.

    Machine-learning algorithms have also fared well in mastering games like chess and Go, where both the rules and the “win conditions” are clearly defined. Physicians, however, don’t always concur on the rules for treating patients, and even the win condition of being “healthy” is not widely agreed upon. “Doctors know what it means to be sick,” Ghassemi explains, “and we have the most data for people when they are sickest. But we don’t get much data from people when they are healthy because they’re less likely to see doctors then.”

    Even mechanical devices can contribute to flawed data and disparities in treatment. Pulse oximeters, for example, which have been calibrated predominantly on light-skinned individuals, do not accurately measure blood oxygen levels for people with darker skin. And these deficiencies are most acute when oxygen levels are low — precisely when accurate readings are most urgent. Similarly, women face increased risks during “metal-on-metal” hip replacements, Ghassemi and Nsoesie write, “due in part to anatomic differences that aren’t taken into account in implant design.” Facts like these could be buried within the data fed to computer models whose output will be undermined as a result.

    Coming from computers, the product of machine-learning algorithms offers “the sheen of objectivity,” according to Ghassemi. But that can be deceptive and dangerous, because it’s harder to ferret out the faulty data supplied en masse to a computer than it is to discount the recommendations of a single possibly inept (and maybe even racist) doctor. “The problem is not machine learning itself,” she insists. “It’s people. Human caregivers generate bad data sometimes because they are not perfect.”

    Nevertheless, she still believes that machine learning can offer benefits in health care in terms of more efficient and fairer recommendations and practices. One key to realizing the promise of machine learning in health care is to improve the quality of data, which is no easy task. “Imagine if we could take data from doctors that have the best performance and share that with other doctors that have less training and experience,” Ghassemi says. “We really need to collect this data and audit it.”

    The challenge here is that the collection of data is not incentivized or rewarded, she notes. “It’s not easy to get a grant for that, or ask students to spend time on it. And data providers might say, ‘Why should I give my data out for free when I can sell it to a company for millions?’ But researchers should be able to access data without having to deal with questions like: ‘What paper will I get my name on in exchange for giving you access to data that sits at my institution?’

    “The only way to get better health care is to get better data,” Ghassemi says, “and the only way to get better data is to incentivize its release.”

    It’s not only a question of collecting data. There’s also the matter of who will collect it and vet it. Ghassemi recommends assembling diverse groups of researchers — clinicians, statisticians, medical ethicists, and computer scientists — to first gather diverse patient data and then “focus on developing fair and equitable improvements in health care that can be deployed in not just one advanced medical setting, but in a wide range of medical settings.”

    The objective of the Patterns paper is not to discourage technologists from bringing their expertise in machine learning to the medical world, she says. “They just need to be cognizant of the gaps that appear in treatment and other complexities that ought to be considered before giving their stamp of approval to a particular computer model.”