More stories

  •

    Estimating the informativeness of data

    Not all data are created equal. But how much information is any piece of data likely to contain? This question is central to medical testing, designing scientific experiments, and even to everyday human learning and thinking. MIT researchers have developed a new way to solve this problem, opening up new applications in medicine, scientific discovery, cognitive science, and artificial intelligence.

    In theory, the 1948 paper, “A Mathematical Theory of Communication,” by the late MIT Professor Emeritus Claude Shannon answered this question definitively. One of Shannon’s breakthrough results is the idea of entropy, which lets us quantify the amount of information inherent in any random object, including random variables that model observed data. Shannon’s results created the foundations of information theory and modern telecommunications. The concept of entropy has also proven central to computer science and machine learning.

    The challenge of estimating entropy

    Unfortunately, the use of Shannon’s formula can quickly become computationally intractable. It requires precisely calculating the probability of the data, which in turn requires calculating every possible way the data could have arisen under a probabilistic model. If the data-generating process is very simple (for example, a single toss of a coin or roll of a loaded die), then calculating entropies is straightforward. But consider the problem of medical testing, where a positive test result is the outcome of hundreds of interacting variables, all unknown. With just 10 unknowns, each of which can take one of two values, there are already more than 1,000 possible explanations for the data. With a few hundred, there are more possible explanations than atoms in the known universe, which makes calculating the entropy exactly an unmanageable problem.
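    To make the scale of the problem concrete, here is a minimal Python sketch. It computes the Shannon entropy of a fair coin and of a loaded die (the loading is a made-up example), then counts the possible explanations under the assumption that every unknown is binary. The figure of roughly 10^80 atoms in the known universe is a commonly cited estimate, not a number from the paper.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with zero probability contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probs if p > 0)


def num_explanations(n_unknowns, values_per_unknown=2):
    """Number of joint settings of the unknowns (assumes discrete unknowns)."""
    return values_per_unknown ** n_unknowns


fair_coin = [0.5, 0.5]
loaded_die = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]  # hypothetical loading

print(shannon_entropy(fair_coin))    # simple case: direct calculation
print(shannon_entropy(loaded_die))
print(num_explanations(10))          # 2**10 joint settings
# Rough, commonly cited estimate of atoms in the known universe (assumption).
print(num_explanations(300) > 10**80)
```

    For simple processes the entropy is a one-line sum. The difficulty described above comes from the number of terms in that sum, which doubles with every additional binary unknown.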

    MIT researchers have developed a new method to compute good approximations to many information quantities such as Shannon entropy by using probabilistic inference. The work appears in a paper presented at AISTATS 2022 by authors Feras Saad ’16, MEng ’16, a PhD candidate in electrical engineering and computer science; Marco Cusumano-Towner PhD ’21; and Vikash Mansinghka ’05, MEng ’09, PhD ’09, a principal research scientist in the Department of Brain and Cognitive Sciences. The key insight is to use probabilistic inference algorithms, rather than enumerating all explanations, to first infer which explanations are probable and then use these probable explanations to construct high-quality entropy estimates. The paper shows that this inference-based approach can be much faster and more accurate than previous approaches.

    Estimating entropy and information in a probabilistic model is fundamentally hard because it often requires solving a high-dimensional integration problem. Many previous works have developed estimators of these quantities for certain special cases, but the new estimators of entropy via inference (EEVI) offer the first approach that can deliver sharp upper and lower bounds on a broad set of information-theoretic quantities. Having both an upper and a lower bound means that although we don’t know the true entropy, we can compute one number that lies below it and another that lies above it.

    “The upper and lower bounds on entropy delivered by our method are particularly useful for three reasons,” says Saad. “First, the difference between the upper and lower bounds gives a quantitative sense of how confident we should be about the estimates. Second, by using more computational effort we can drive the difference between the two bounds to zero, which ‘squeezes’ the true value with a high degree of accuracy. Third, we can compose these bounds to form estimates of many other quantities that tell us how informative different variables in a model are of one another.”
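    Saad's third point, composing bounds, can be illustrated with a standard identity: the mutual information between two variables is I(X;Y) = H(X) + H(Y) - H(X,Y). The sketch below is not the EEVI estimator itself. It only shows the interval arithmetic that turns bounds on three entropies into bounds on mutual information. The `Bounds` type, the function name, and the numbers are all hypothetical.

```python
from typing import NamedTuple


class Bounds(NamedTuple):
    """A lower and an upper bound on some unknown quantity (in bits)."""
    lower: float
    upper: float

    @property
    def gap(self):
        # A smaller gap means the true value is pinned down more tightly.
        return self.upper - self.lower


def mutual_information_bounds(h_x, h_y, h_xy):
    """Bound I(X;Y) = H(X) + H(Y) - H(X,Y) from bounds on each entropy."""
    # The smallest possible value pairs the lower bounds of the added terms
    # with the upper bound of the subtracted term, and vice versa.
    lower = h_x.lower + h_y.lower - h_xy.upper
    upper = h_x.upper + h_y.upper - h_xy.lower
    # Mutual information is never negative, so clamp the lower bound at zero.
    return Bounds(max(lower, 0.0), upper)


# Hypothetical entropy bounds, for illustration only.
h_x = Bounds(0.9, 1.0)
h_y = Bounds(1.4, 1.5)
h_xy = Bounds(2.0, 2.2)

mi = mutual_information_bounds(h_x, h_y, h_xy)
print(mi, mi.gap)
```

    The gap of the composed interval is the sum of the three input gaps, so tightening each entropy bound with more computation tightens the mutual information bound as well.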

    Solving fundamental problems with data-driven expert systems

    Saad says he is most excited about the possibility that this method gives for querying probabilistic models in areas like machine-assisted medical diagnoses. He says one goal of the EEVI method is to be able to solve new queries using rich generative models for things like liver disease and diabetes that have already been developed by experts in the medical domain. For example, suppose we have a patient with a set of observed attributes (height, weight, age, etc.) and observed symptoms (nausea, blood pressure, etc.). Given these attributes and symptoms, EEVI can be used to help determine which medical tests for symptoms the physician should conduct to maximize information about the absence or presence of a given liver disease (like cirrhosis or primary biliary cholangitis).
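    As a toy version of the test-selection question, the following Python sketch scores each candidate test by the mutual information between the disease status and the test result, then picks the highest-scoring test. It assumes a single binary disease and binary tests described only by sensitivity and specificity. The test names and all numbers are invented for illustration. Exact calculation is feasible here only because the model is tiny, whereas EEVI targets models where this kind of enumeration is out of reach.

```python
import math


def binary_entropy(p):
    """Entropy, in bits, of a yes/no variable that is 'yes' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(prior, sensitivity, specificity):
    """Mutual information (bits) between disease status D and test result T.

    Uses I(D;T) = H(T) - H(T|D).
    """
    p_positive = prior * sensitivity + (1 - prior) * (1 - specificity)
    h_t = binary_entropy(p_positive)
    h_t_given_d = (prior * binary_entropy(sensitivity)
                   + (1 - prior) * binary_entropy(specificity))
    return h_t - h_t_given_d


# Hypothetical tests: name -> (sensitivity, specificity).
tests = {
    "test_a": (0.95, 0.60),
    "test_b": (0.80, 0.90),
    "test_c": (0.50, 0.50),  # no better than a coin flip
}
prior = 0.2  # hypothetical prior probability of disease

gains = {name: information_gain(prior, se, sp) for name, (se, sp) in tests.items()}
best = max(gains, key=gains.get)
print(gains)
print("most informative test:", best)
```

    A test whose result is independent of the disease, such as `test_c`, scores zero bits, which matches the intuition that it tells the physician nothing.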

    For insulin diagnosis, the authors showed how to use the method for computing optimal times to take blood glucose measurements that maximize information about a patient’s insulin sensitivity, given an expert-built probabilistic model of insulin metabolism and the patient’s personalized meal and medication schedule. As routine medical tracking like glucose monitoring moves away from doctor’s offices and toward wearable devices, there are even more opportunities to improve data acquisition, if the value of the data can be estimated accurately in advance.

    Vikash Mansinghka, senior author on the paper, adds, “We’ve shown that probabilistic inference algorithms can be used to estimate rigorous bounds on information measures that AI engineers often think of as intractable to calculate. This opens up many new applications. It also shows that inference may be more computationally fundamental than we thought. It also helps to explain how human minds might be able to estimate the value of information so pervasively, as a central building block of everyday cognition, and help us engineer AI expert systems that have these capabilities.”

    The paper, “Estimators of Entropy and Information via Inference in Probabilistic Models,” was presented at AISTATS 2022.

  •

    Emery Brown earns American Institute for Medical and Biological Engineering Pierre Galletti Award

    The American Institute for Medical and Biological Engineering has awarded its highest honor this year to Emery N. Brown, the Edward Hood Taplin Professor of Computational Neuroscience and Health Sciences and Technology in The Picower Institute for Learning and Memory and the Institute for Medical Engineering and Science at MIT.

    Brown, who is also an anesthesiologist at Massachusetts General Hospital and the Warren M. Zapol Professor at Harvard Medical School, received the 2022 Pierre M. Galletti Award during the national organization’s Annual Event held on March 25.

    For decades, Brown’s lab has uniquely unified three fields: neuroscience, statistics, and anesthesiology. He is renowned for the development of statistical methods and signal-processing algorithms to enable and improve analysis of neural activity measurements. The work has had numerous applications including studies of learning and memory, brain-computer interfaces, and systems neuroscience. He has also pioneered investigations of how general anesthetic drugs work in the brain to induce and maintain simultaneous but reversible states of unconsciousness, amnesia, immobility, and analgesia. Building on these improvements in fundamental understanding, his lab engineers systems to improve monitoring of patient state and anesthetic dosing during surgery. Optimizing doses of general anesthetic drugs can improve patient care in many ways, including by minimizing side effects such as post-operative delirium and by improving post-operative pain management.

    AIMBE said Brown earned the award in recognition of his “significant contributions to neuroscience data analysis and for characterizing the neurophysiology of anesthesia-induced unconsciousness and demonstrating how it can be reliably monitored in real time using electroencephalogram recordings.”

    Brown, who is also a faculty member in MIT’s Department of Brain and Cognitive Sciences, is now working to develop a research center at MIT dedicated to taking neuroscience-based approaches to advance anesthesiology.

    “I am extremely honored and grateful to the AIMBE for choosing me to receive the 2022 Galletti Award in recognition of my research deciphering the neuroscience of how anesthetics work,” he says. “I would like to express my gratitude to my collaborators, post-doctoral fellows, students, research assistants, and clinical coordinators who have made this possible.”

  •

    Can machine-learning models overcome biased datasets?

    Artificial intelligence systems may be able to complete tasks quickly, but that doesn’t mean they always do so fairly. If the datasets used to train machine-learning models contain biased data, the system is likely to exhibit that same bias when it makes decisions in practice.

    For instance, if a dataset contains mostly images of white men, then a facial-recognition model trained with these data may be less accurate for women or people with different skin tones.

    A group of researchers at MIT, in collaboration with researchers at Harvard University and Fujitsu Ltd., sought to understand when and how a machine-learning model is capable of overcoming this kind of dataset bias. They used an approach from neuroscience to study how training data affects whether an artificial neural network can learn to recognize objects it has not seen before. A neural network is a machine-learning model that mimics the human brain in the way it contains layers of interconnected nodes, or “neurons,” that process data.

    The new results show that diversity in training data has a major influence on whether a neural network is able to overcome bias, but at the same time dataset diversity can degrade the network’s performance. They also show that how a neural network is trained, and the specific types of neurons that emerge during the training process, can play a major role in whether it is able to overcome a biased dataset.

    “A neural network can overcome dataset bias, which is encouraging. But the main takeaway here is that we need to take into account data diversity. We need to stop thinking that if you just collect a ton of raw data, that is going to get you somewhere. We need to be very careful about how we design datasets in the first place,” says Xavier Boix, a research scientist in the Department of Brain and Cognitive Sciences (BCS) and the Center for Brains, Minds, and Machines (CBMM), and senior author of the paper.  

    Co-authors include former MIT graduate students Timothy Henry, Jamell Dozier, Helen Ho, Nishchal Bhandari, and Spandan Madan, a corresponding author who is currently pursuing a PhD at Harvard; Tomotake Sasaki, a former visiting scientist now a senior researcher at Fujitsu Research; Frédo Durand, a professor of electrical engineering and computer science at MIT and a member of the Computer Science and Artificial Intelligence Laboratory; and Hanspeter Pfister, the An Wang Professor of Computer Science at the Harvard School of Engineering and Applied Sciences. The research appears today in Nature Machine Intelligence.

    Thinking like a neuroscientist

    Boix and his colleagues approached the problem of dataset bias by thinking like neuroscientists. In neuroscience, Boix explains, it is common to use controlled datasets in experiments, meaning a dataset in which the researchers know as much as possible about the information it contains.

    The team built datasets that contained images of different objects in varied poses, and carefully controlled the combinations so some datasets had more diversity than others. In this case, a dataset had less diversity if it contained more images that showed objects from only one viewpoint. A more diverse dataset had more images showing objects from multiple viewpoints. Each dataset contained the same number of images.

    The researchers used these carefully constructed datasets to train a neural network for image classification, and then studied how well it was able to identify objects from viewpoints the network did not see during training (known as an out-of-distribution combination). 

    For example, if researchers are training a model to classify cars in images, they want the model to learn what different cars look like. But if every Ford Thunderbird in the training dataset is shown from the front, when the trained model is given an image of a Ford Thunderbird shot from the side, it may misclassify it, even if it was trained on millions of car photos.
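    The idea of a controlled dataset with held-out combinations can be sketched in a few lines of Python. This is an illustrative construction, not the one used in the paper: the category and viewpoint names, the rotation scheme, and the function names are all assumptions. Diversity is controlled by how many viewpoints each category is shown from during training, and every remaining (category, viewpoint) pair becomes an out-of-distribution test combination.

```python
def split_combinations(categories, viewpoints, seen_per_category):
    """Split (category, viewpoint) pairs into training and held-out sets.

    Each category is seen from `seen_per_category` viewpoints during
    training; all other pairs are out-of-distribution combinations.
    """
    train, held_out = [], []
    for i, category in enumerate(categories):
        # Rotate the starting viewpoint so categories see different views.
        seen = {viewpoints[(i + k) % len(viewpoints)]
                for k in range(seen_per_category)}
        for viewpoint in viewpoints:
            target = train if viewpoint in seen else held_out
            target.append((category, viewpoint))
    return train, held_out


def images_per_combination(total_images, train_combinations):
    """Keep the dataset size fixed regardless of how diverse it is."""
    return total_images // len(train_combinations)


categories = ["sedan", "truck", "bus"]          # hypothetical object classes
viewpoints = ["front", "side", "rear", "top"]   # hypothetical viewpoints

low_train, low_ood = split_combinations(categories, viewpoints, 1)
high_train, high_ood = split_combinations(categories, viewpoints, 3)

print(len(low_train), len(low_ood))
print(len(high_train), len(high_ood))
print(images_per_combination(3600, low_train), images_per_combination(3600, high_train))
```

    Because the total number of images is held constant, a more diverse dataset necessarily has fewer images per seen combination.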

    The researchers found that if the dataset is more diverse — if more images show objects from different viewpoints — the network is better able to generalize to new images or viewpoints. Data diversity is key to overcoming bias, Boix says.

    “But it is not like more data diversity is always better; there is a tension here. When the neural network gets better at recognizing new things it hasn’t seen, then it will become harder for it to recognize things it has already seen,” he says.

    Testing training methods

    The researchers also studied methods for training the neural network.

    In machine learning, it is common to train a network to perform multiple tasks at the same time. The idea is that if a relationship exists between the tasks, the network will learn to perform each one better if it learns them together.

    But the researchers found the opposite to be true — a model trained separately for each task was able to overcome bias far better than a model trained for both tasks together.

    “The results were really striking. In fact, the first time we did this experiment, we thought it was a bug. It took us several weeks to realize it was a real result because it was so unexpected,” Boix says.

    They dove deeper inside the neural networks to understand why this occurs.

    They found that neuron specialization seems to play a major role. When the neural network is trained to recognize objects in images, it appears that two types of neurons emerge — one that specializes in recognizing the object category and another that specializes in recognizing the viewpoint.

    When the network is trained to perform tasks separately, those specialized neurons are more prominent, Boix explains. But if a network is trained to do both tasks simultaneously, some neurons become diluted and don’t specialize for one task. These unspecialized neurons are more likely to get confused, he says.

    “But the next question now is, how did these neurons get there? You train the neural network and they emerge from the learning process. No one told the network to include these types of neurons in its architecture. That is the fascinating thing,” he says.

    That is one area the researchers hope to explore with future work. They want to see if they can force a neural network to develop neurons with this specialization. They also want to apply their approach to more complex tasks, such as objects with complicated textures or varied illuminations.

    Boix is encouraged that a neural network can learn to overcome bias, and he is hopeful their work can inspire others to be more thoughtful about the datasets they are using in AI applications.

    This work was supported, in part, by the National Science Foundation, a Google Faculty Research Award, the Toyota Research Institute, the Center for Brains, Minds, and Machines, Fujitsu Research, and the MIT-Sensetime Alliance on Artificial Intelligence.

  •

    Professor Emery Brown has big plans for anesthesiology

    Emery N. Brown — the Edward Hood Taplin Professor of Medical Engineering and of Computational Neuroscience at MIT, an MIT professor of health sciences and technology, an investigator with The Picower Institute for Learning and Memory at MIT, and the Warren M. Zapol Professor of Anaesthesia at Harvard Medical School and Massachusetts General Hospital (MGH) — clearly excels at many roles. Renowned internationally for his anesthesia and neuroscience research, he embodies a unique blend of anesthesiologist, statistician, neuroscientist, educator, and mentor to both students and colleagues. Notably, Brown is one of the most decorated clinician-scientists in the country; he is one of only 25 people — and the first African-American, statistician, and anesthesiologist — to be elected to all three National Academies (Science, Engineering, and Medicine).

    Now, he is handing off one of his many key roles and responsibilities. After almost 10 years, Brown is stepping down as co-director of the Harvard-MIT Program in Health Sciences and Technology (HST). He will turn his energies toward working to develop a new joint center between MIT and MGH that uses the study of anesthesia to design novel approaches to controlling brain states. While a goal of the new center will be to improve anesthesia and intensive care unit management, according to Brown, it will also study related problems such as treating depression, insomnia, and epilepsy, as well as enhancing coma recovery.

    Founded in 1970, HST is one of the oldest interdisciplinary educational programs focused on training the next generation of clinician-scientists and engineers, who learn to translate science, engineering, and medical research into clinical practice, with the aim of improving human health. The MIT Institute for Medical Engineering and Science (IMES), where Brown is associate director, is HST’s home at MIT. Brown was the first HST co-director after the establishment of IMES in 2012; Wolfram Goessling is the Harvard University co-director of HST.

    “Emery has been an exemplary leader for HST during his tenure, and has helped it become a hub for the training of world-class scientists, engineers, and clinicians,” says Anantha Chandrakasan, dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “I am deeply grateful for his many years of service and wish him well as he moves on to new endeavors.”

    Elazer R. Edelman, director of IMES, calls Brown “a phenom who has been dedicated to our programs for years.”

    “With his thoughtful leadership and understated style, Emery made many contributions to the HST community,” Edelman continues. “On a personal note, this is bittersweet for me, as Emery has been a partner and mentor in my role as IMES director. And while I know that he will always be there for me, as he has been for all of us at IMES and HST, I will miss our late-night calls and midday conferences on matters of import for MIT, IMES, and HST.”

    Brown says “it was an honor and a privilege to co-direct HST with Wolfram.”

    “The students, staff, and faculty are simply amazing,” Brown continues. “Although, now more than 50 years old, HST remains at the vanguard for training PhD and MD students to work at the intersection between engineering, science, and medicine.”

    Goessling also thanks Brown for his leadership: “I truly valued Emery’s partnership and friendship, working together to deepen ties between the MIT and Harvard sides of HST. I am particularly grateful for working with Emery on our combined diversity efforts, leading to the HST Diversity Ambassadors initiative that made HST a better and stronger program.”

    According to Edelman, Brown was instrumental in the transition to new paradigms and relationships with HMS in the context of IMES. In 2014, he led the establishment of clear criteria for HST faculty membership, thereby strengthening the community of faculty experts who train students and provide research opportunities. More recently, he provided guidance through the turmoil of the ongoing Covid-19 pandemic, including the transition to online instruction and the return to the classroom. And Brown has always been a strong supporter of student diversity efforts, serving as an advocate and advisor to HST students.

    Brown holds BA, MA, and PhD degrees from Harvard University, and an MD from Harvard Medical School. He has been recognized with many awards, including the 2020 Swartz Prize in Theoretical and Computational Neuroscience, the 2018 Dickson Prize in Science, and an NIH Director’s Pioneer Award. Brown also served on President Barack Obama’s BRAIN Initiative Working Group. Among his many accomplishments, he has been cited for developing neural signal processing algorithms to characterize how neural systems represent and transmit information, and for unlocking the neurophysiology of how anesthetics produce the states of general anesthesia.

    Edelman says the process is underway to name a successor to Brown as co-director of HST at MIT.

  •

    New integrative computational neuroscience center established at MIT’s McGovern Institute

    With the tools of modern neuroscience, researchers can peer into the brain with unprecedented accuracy. Recording devices listen in on the electrical conversations between neurons, picking up the voices of hundreds of cells at a time. Genetic tools allow us to focus on specific types of neurons based on their molecular signatures. Microscopes zoom in to illuminate the brain’s circuitry, capturing thousands of images of elaborately branched dendrites. Functional MRIs detect changes in blood flow to map activity within a person’s brain, generating a complete picture by compiling hundreds of scans.

    This deluge of data provides insights into brain function and dynamics at different levels — molecules, cells, circuits, and behavior — but the insights remain compartmentalized in separate research silos for each level. An innovative new center at MIT’s McGovern Institute for Brain Research aims to leverage them into powerful revelations of the brain’s inner workings.

    The K. Lisa Yang Integrative Computational Neuroscience (ICoN) Center will create advanced mathematical models and computational tools to synthesize the deluge of data across scales and advance our understanding of the brain and mental health.

    The center, funded by a $24 million donation from philanthropist Lisa Yang and led by McGovern Institute Associate Investigator Ila Fiete, will take a collaborative approach to computational neuroscience, integrating cutting-edge modeling techniques and data from MIT labs to explain brain function at every level, from the molecular to the behavioral.

    “Our goal is that sophisticated, truly integrated computational models of the brain will make it possible to identify how ‘control knobs’ such as genes, proteins, chemicals, and environment drive thoughts and behavior, and to make inroads toward urgent unmet needs in understanding and treating brain disorders,” says Fiete, who is also a brain and cognitive sciences professor at MIT.

    “Driven by technologies that generate massive amounts of data, we are entering a new era of translational neuroscience research,” says Yang, whose philanthropic investment in MIT research now exceeds $130 million. “I am confident that the multidisciplinary expertise convened by the ICoN center will revolutionize how we synthesize this data and ultimately understand the brain in health and disease.”

    Connecting the data

    It is impossible to separate the molecules in the brain from their effects on behavior — although those aspects of neuroscience have traditionally been studied independently, by researchers with vastly different expertise. The ICoN Center will eliminate the divides, bringing together neuroscientists and software engineers to deal with all types of data about the brain.

    “The center’s highly collaborative structure, which is essential for unifying multiple levels of understanding, will enable us to recruit talented young scientists eager to revolutionize the field of computational neuroscience,” says Robert Desimone, director of the McGovern Institute. “It is our hope that the ICoN Center’s unique research environment will truly demonstrate a new academic research structure that catalyzes bold, creative research.”

    To foster interdisciplinary collaboration, every postdoc and engineer at the center will work with multiple faculty mentors. In order to attract young scientists and engineers to the field of computational neuroscience, the center will also provide four graduate fellowships to MIT students each year in perpetuity. Interacting closely with three scientific cores, engineers and fellows will develop computational models and technologies for analyzing molecular data, neural circuits, and behavior, such as tools to identify patterns in neural recordings or automate the analysis of human behavior to aid psychiatric diagnoses. These technologies and models will be instrumental in synthesizing data into knowledge and understanding.

    Center priorities

    In its first five years, the ICoN Center will prioritize four areas of investigation: episodic memory and exploration, including functions like navigation and spatial memory; complex or stereotypical behavior, such as the perseverative behaviors associated with autism and obsessive-compulsive disorder; cognition and attention; and sleep. Models of complex behavior will be created in collaboration with clinicians and researchers at Children’s Hospital of Philadelphia.

    The goal, Fiete says, is to model the neuronal interactions that underlie these functions so that researchers can predict what will happen when something changes — when certain neurons become more active or when a genetic mutation is introduced, for example. When paired with experimental data from MIT labs, the center’s models will help explain not just how these circuits work, but also how they are altered by genes, the environment, aging, and disease. These focus areas encompass circuits and behaviors often affected by psychiatric disorders and neurodegeneration, and models will give researchers new opportunities to explore their origins and potential treatment strategies.

    “Lisa Yang is focused on helping the scientific community realize its goals in translational research,” says Nergis Mavalvala, dean of the School of Science and the Curtis and Kathleen Marble Professor of Astrophysics. “With her generous support, we can accelerate the pace of research by connecting the data to the delivery of tangible results.”

  •

    MIT welcomes nine MLK Visiting Professors and Scholars for 2021-22

    In its 31st year, the Martin Luther King Jr. (MLK) Visiting Professors and Scholars Program will host nine outstanding scholars from across the Americas. The flagship program honors the life and legacy of Martin Luther King Jr. by increasing the presence and recognizing the contributions of underrepresented minority scholars at MIT. Throughout the year, the cohort will enhance their scholarship through intellectual engagement with the MIT community and enrich the cultural, academic, and professional experience of students.

    The 2021-22 scholars

    Sanford Biggers is an interdisciplinary artist hosted by the Department of Architecture. His work is an interplay of narrative, perspective, and history that speaks to current social, political, and economic happenings while examining their contexts. His diverse practice positions him as a collaborator with the past through explorations of often-overlooked cultural and political narratives from American history. Through collaboration with his faculty host, Brandon Clifford, he will spend the year contributing to projects with Architecture; Art, Culture and Technology; the Transmedia Storytelling initiatives; and community workshops and engagement with local K-12 education.

    Kristen Dorsey is an assistant professor of engineering at Smith College. She will be hosted by the Program in Media Arts and Sciences at the MIT Media Lab. Her research focuses on the fabrication and characterization of microscale sensors and microelectromechanical systems. Dorsey tries to understand “why things go wrong” by investigating device reliability and stability. At MIT, Dorsey is interested in forging collaborations to consider issues of access and equity as they apply to wearable health care devices.

    Omolola “Lola” Eniola-Adefeso is the associate dean for graduate and professional education and associate professor of chemical engineering at the University of Michigan. She will join MIT’s Department of Chemical Engineering (ChemE). Eniola-Adefeso will work with Professor Paula Hammond on developing electrostatically assembled nanoparticle coatings that enable targeting of specific immune cell types. A co-founder and chief scientific officer of Asalyxa Bio, she is interested in the interactions between blood leukocytes and endothelial cells in vessel lumen lining, and how they change during inflammation response. Eniola-Adefeso will also work with the Diversity in Chemical Engineering (DICE) graduate student group in ChemE and the National Organization of Black Chemists and Chemical Engineers.

    Robert Gilliard Jr. is an assistant professor of chemistry at the University of Virginia and will join the MIT chemistry department, working closely with faculty host Christopher Cummins. His research focuses on various aspects of group 15 element chemistry. He was a founding member of the National Organization of Black Chemists and Chemical Engineers UGA section, and he has served as an American Chemical Society (ACS) Bridge Program mentor as well as an ACS Project Seed mentor. Gilliard has also collaborated with the Cleveland Public Library to expose diverse young scholars to STEM fields.

    Valencia Joyner Koomson ’98, MNG ’99 will return for the second semester of her appointment this fall in MIT’s Department of Electrical Engineering and Computer Science. Based at Tufts University, where she is an associate professor in the Department of Electrical and Computer Engineering, Koomson has focused her research on microelectronic systems for cell analysis and biomedical applications. In the past semester, she has served as a judge for the Black Alumni/ae of MIT Research Slam and worked closely with faculty host Professor Akintunde Akinwande.

    Luis Gilberto Murillo-Urrutia will continue his appointment in MIT’s Environmental Solutions Initiative. He has 30 years of experience in public policy design, implementation, and advocacy, most notably in the areas of sustainable regional development, environmental protection and management of natural resources, social inclusion, and peace building. At MIT, he has continued his research on environmental justice, with a focus on carbon policy and its impacts on Afro-descendant communities in Colombia.

    Sonya T. Smith was the first female professor of mechanical engineering at Howard University. She will join the Department of Aeronautics and Astronautics at MIT. Her research involves computational fluid dynamics and thermal management of electronics for air and space vehicles. She is looking forward to serving as a mentor to underrepresented students across MIT and fostering new research collaborations with her home lab at Howard.

    Lawrence Udeigwe is an associate professor of mathematics at Manhattan College and will join MIT’s Department of Brain and Cognitive Sciences. He plans to co-teach a graduate seminar course with Professor James DiCarlo to explore practical and philosophical questions regarding the use of simulations to build theories in neuroscience. Udeigwe also leads the Lorens Chuno group; as a singer-songwriter, his work tackles intersectionality issues faced by contemporary Africans.

    S. Craig Watkins is an internationally recognized expert in media and a professor at the University of Texas at Austin. He will join MIT’s Institute for Data, Systems, and Society to assist in researching the role of big data in enabling deep structural changes with regard to systemic racism. He will continue to expand on his work as founding director of the Institute for Media Innovation at the University of Texas at Austin, exploring the intersections of critical AI studies, critical race studies, and design. He will also work with MIT’s Center for Advanced Virtuality to develop computational systems that support social perspective-taking.

    Community engagement

    Throughout the 2021-22 academic year, MLK professors and scholars will present their research in a monthly speaker series. Events will be held in an in-person/Zoom hybrid environment. All members of the MIT community are encouraged to attend and hear directly from this year’s cohort of outstanding scholars. To hear more about upcoming events, subscribe to their mailing list.

    On Sept. 15, all are invited to join the Institute Community and Equity Office in welcoming the scholars to campus by attending a welcome luncheon.

  • in

    Exact symbolic artificial intelligence for faster, better assessment of AI fairness

    The justice system, banks, and private companies use algorithms to make decisions that have profound impacts on people’s lives. Unfortunately, those algorithms are sometimes biased — disproportionately impacting people of color as well as individuals in lower income classes when they apply for loans or jobs, or even when courts decide what bail should be set while a person awaits trial.

    MIT researchers have developed a new artificial intelligence programming language that can assess the fairness of algorithms more exactly, and more quickly, than available alternatives.

    Their Sum-Product Probabilistic Language (SPPL) is a probabilistic programming system. Probabilistic programming is an emerging field at the intersection of programming languages and artificial intelligence that aims to make AI systems much easier to develop, with early successes in computer vision, common-sense data cleaning, and automated data modeling. Probabilistic programming languages make it much easier for programmers to define probabilistic models and carry out probabilistic inference — that is, work backward to infer probable explanations for observed data.

    “There are previous systems that can solve various fairness questions. Our system is not the first; but because our system is specialized and optimized for a certain class of models, it can deliver solutions thousands of times faster,” says Feras Saad, a PhD student in electrical engineering and computer science (EECS) and first author on a recent paper describing the work. Saad adds that the speedups are substantial: the system can be up to 3,000 times faster than previous approaches.

    SPPL gives fast, exact solutions to probabilistic inference questions such as “How likely is the model to recommend a loan to someone over age 40?” or “Generate 1,000 synthetic loan applicants, all under age 30, whose loans will be approved.” These inference results are based on SPPL programs that encode probabilistic models of what kinds of applicants are likely, a priori, and also how to classify them. Fairness questions that SPPL can answer include “Is there a difference between the probability of recommending a loan to an immigrant and nonimmigrant applicant with the same socioeconomic status?” or “What’s the probability of a hire, given that the candidate is qualified for the job and from an underrepresented group?”
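
    To make the idea of an exact answer concrete, the sketch below poses a question like the loan query above against a toy model in plain Python. It does not use SPPL or its API. The age and income bands, their probabilities, and the `approve` decision tree are all hypothetical, and the brute-force enumeration stands in for SPPL's symbolic engine only to show what an exact, error-free result looks like.

```python
from fractions import Fraction
from itertools import product

# Hypothetical prior over applicants: age band and income band are independent.
AGE = {
    "under_30": Fraction(3, 10),
    "30_to_40": Fraction(3, 10),
    "over_40": Fraction(4, 10),
}
INCOME = {
    "low": Fraction(1, 4),
    "medium": Fraction(1, 2),
    "high": Fraction(1, 4),
}


def approve(age, income):
    """Hypothetical decision-tree classifier for loan approval."""
    if income == "high":
        return True
    if income == "medium":
        return age != "under_30"
    return False


def probability(event, given=lambda age, income: True):
    """Exact P(event | given), summing over every (age, income) pair."""
    numerator = Fraction(0)
    denominator = Fraction(0)
    for (age, p_age), (income, p_income) in product(AGE.items(), INCOME.items()):
        weight = p_age * p_income
        if given(age, income):
            denominator += weight
            if event(age, income):
                numerator += weight
    return numerator / denominator


# "How likely is the model to recommend a loan to someone over age 40?"
p_over_40 = probability(approve, given=lambda age, income: age == "over_40")
# The same question for applicants under 30, and the gap between the two.
p_under_30 = probability(approve, given=lambda age, income: age == "under_30")
p_gap = p_over_40 - p_under_30
```

    Because every quantity is a `Fraction`, the results carry no sampling or rounding error. Enumeration only works for tiny discrete models like this one, so it illustrates the kind of answer SPPL returns rather than how SPPL computes it.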

    SPPL is different from most probabilistic programming languages, as SPPL only allows users to write probabilistic programs for which it can automatically deliver exact probabilistic inference results. SPPL also makes it possible for users to check how fast inference will be, and therefore avoid writing slow programs. In contrast, other probabilistic programming languages such as Gen and Pyro allow users to write down probabilistic programs where the only known ways to do inference are approximate — that is, the results include errors whose nature and magnitude can be hard to characterize.

    Error from approximate probabilistic inference is tolerable in many AI applications. But it is undesirable to have inference errors corrupting results in socially impactful applications of AI, such as automated decision-making, and especially in fairness analysis.
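
    A small simulation can show why approximation error matters for fairness questions in particular. The sketch below is an assumption-laden illustration, not taken from the paper: the two groups, their approval rates, and the sample size are hypothetical. It compares an exact approval gap with Monte Carlo estimates of the same gap.

```python
import random
from fractions import Fraction

# Hypothetical approval probabilities for two groups of applicants.
P_APPROVE = {"A": Fraction(50, 100), "B": Fraction(52, 100)}

# The exact gap is known in closed form for this toy model.
exact_gap = P_APPROVE["B"] - P_APPROVE["A"]


def estimate_gap(n, rng):
    """Monte Carlo estimate of the approval gap using n samples per group."""
    rates = {}
    for group, p in P_APPROVE.items():
        hits = sum(rng.random() < float(p) for _ in range(n))
        rates[group] = hits / n
    return rates["B"] - rates["A"]


rng = random.Random(0)  # fixed seed so the run is repeatable
estimates = [estimate_gap(200, rng) for _ in range(5)]

# With 200 samples per group, the standard error of the estimated gap is
# roughly sqrt(2 * 0.25 / 200) = 0.05, which is larger than the true gap.
for estimate in estimates:
    print(f"estimate={estimate:+.3f}  exact={float(exact_gap):+.3f}")
```

    In this setup the sampling noise is larger than the effect being measured, so individual estimates can even have the wrong sign. An exact method avoids that ambiguity.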

    Jean-Baptiste Tristan, associate professor at Boston College and former research scientist at Oracle Labs, who was not involved in the new research, says, “I’ve worked on fairness analysis in academia and in real-world, large-scale industry settings. SPPL offers improved flexibility and trustworthiness over other PPLs on this challenging and important class of problems due to the expressiveness of the language, its precise and simple semantics, and the speed and soundness of the exact symbolic inference engine.”

    SPPL avoids errors by restricting to a carefully designed class of models that still includes a broad class of AI algorithms, including the decision tree classifiers that are widely used for algorithmic decision-making. SPPL works by compiling probabilistic programs into a specialized data structure called a “sum-product expression.” SPPL further builds on the emerging theme of using probabilistic circuits as a representation that enables efficient probabilistic inference. This approach extends prior work on sum-product networks to models and queries expressed via a probabilistic programming language. However, Saad notes that this approach comes with limitations: “SPPL is substantially faster for analyzing the fairness of a decision tree, for example, but it can’t analyze models like neural networks. Other systems can analyze both neural networks and decision trees, but they tend to be slower and give inexact answers.”
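
    The following is a minimal sketch of the general idea behind a sum-product representation, in the style of the sum-product networks mentioned above. It is not SPPL's actual data structure or API. The class names `Leaf`, `Product`, and `Sum`, the event encoding, and the example model are all hypothetical. Leaves hold a distribution over one variable, product nodes combine independent parts, and sum nodes form weighted mixtures, so the probability of an event is computed exactly in one pass over the tree.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Dict, List, Set, Tuple

Event = Dict[str, Set[str]]  # variable name -> set of allowed values


@dataclass
class Leaf:
    """Distribution over a single discrete variable."""
    var: str
    dist: Dict[str, Fraction]

    def prob(self, event: Event) -> Fraction:
        allowed = event.get(self.var)
        if allowed is None:  # variable unconstrained: it is marginalized out
            return Fraction(1)
        return sum((p for v, p in self.dist.items() if v in allowed), Fraction(0))


@dataclass
class Product:
    """Independent children defined over disjoint variables."""
    children: List[object]

    def prob(self, event: Event) -> Fraction:
        result = Fraction(1)
        for child in self.children:
            result *= child.prob(event)
        return result


@dataclass
class Sum:
    """Weighted mixture of children; the weights should sum to 1."""
    weighted: List[Tuple[Fraction, object]]

    def prob(self, event: Event) -> Fraction:
        return sum((w * c.prob(event) for w, c in self.weighted), Fraction(0))


# Hypothetical model: a mixture of two applicant populations.
model = Sum([
    (Fraction(3, 5), Product([
        Leaf("age", {"young": Fraction(1, 2), "old": Fraction(1, 2)}),
        Leaf("decision", {"approve": Fraction(4, 5), "reject": Fraction(1, 5)}),
    ])),
    (Fraction(2, 5), Product([
        Leaf("age", {"young": Fraction(1, 5), "old": Fraction(4, 5)}),
        Leaf("decision", {"approve": Fraction(1, 2), "reject": Fraction(1, 2)}),
    ])),
])

p_approve = model.prob({"decision": {"approve"}})
p_old = model.prob({"age": {"old"}})
p_approve_and_old = model.prob({"decision": {"approve"}, "age": {"old"}})
p_approve_given_old = p_approve_and_old / p_old  # exact conditional probability
```

    Each query costs one traversal of the tree, which hints at why restricting models to this kind of structure can make exact inference fast.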

    “SPPL shows that exact probabilistic inference is practical, not just theoretically possible, for a broad class of probabilistic programs,” says Vikash Mansinghka, an MIT principal research scientist and senior author on the paper. “In my lab, we’ve seen symbolic inference driving speed and accuracy improvements in other inference tasks that we previously approached via approximate Monte Carlo and deep learning algorithms. We’ve also been applying SPPL to probabilistic programs learned from real-world databases, to quantify the probability of rare events, generate synthetic proxy data given constraints, and automatically screen data for probable anomalies.”

    The new SPPL probabilistic programming language was presented in June at the ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI), in a paper that Saad co-authored with MIT EECS Professor Martin Rinard and Mansinghka. SPPL is implemented in Python and is available open source.