More stories

  • in

    How artificial intelligence can help combat systemic racism

    In 2020, Detroit police arrested a Black man for shoplifting almost $4,000 worth of watches from an upscale boutique. He was handcuffed in front of his family and spent a night in lockup. After some questioning, however, it became clear that they had the wrong man. So why did they arrest him in the first place?

    The reason: a facial recognition algorithm had matched the photo on his driver’s license to grainy security camera footage.

    Facial recognition algorithms — which have repeatedly been demonstrated to be less accurate for people with darker skin — are just one example of how racial bias gets replicated within and perpetuated by emerging technologies.

    “There’s an urgency as AI is used to make really high-stakes decisions,” says MLK Visiting Professor S. Craig Watkins, whose academic home for his time at MIT is the Institute for Data, Systems, and Society (IDSS). “The stakes are higher because new systems can replicate historical biases at scale.”

    Watkins, a professor at the University of Texas at Austin and the founding director of the Institute for Media Innovation​, researches the impacts of media and data-based systems on human behavior, with a specific concentration on issues related to systemic racism. “One of the fundamental questions of the work is: how do we build AI models that deal with systemic inequality more effectively?”

    Play video

    Artificial Intelligence and the Future of Racial Justice | S. Craig Watkins | TEDxMIT

    Ethical AI

    Inequality is perpetuated by technology in many ways across many sectors. One broad domain is health care, where Watkins says inequity shows up in both quality of and access to care. The demand for mental health care, for example, far outstrips the capacity for services in the United States. That demand has been exacerbated by the pandemic, and access to care is harder for communities of color.

    For Watkins, taking the bias out of the algorithm is just one component of building more ethical AI. He works also to develop tools and platforms that can address inequality outside of tech head-on. In the case of mental health access, this entails developing a tool to help mental health providers deliver care more efficiently.

    “We are building a real-time data collection platform that looks at activities and behaviors and tries to identify patterns and contexts in which certain mental states emerge,” says Watkins. “The goal is to provide data-informed insights to care providers in order to deliver higher-impact services.”

    Watkins is no stranger to the privacy concerns such an app would raise. He takes a user-centered approach to the development that is grounded in data ethics. “Data rights are a significant component,” he argues. “You have to give the user complete control over how their data is shared and used and what data a care provider sees. No one else has access.”

    Combating systemic racism

    Here at MIT, Watkins has joined the newly launched Initiative on Combatting Systemic Racism (ICSR), an IDSS research collaboration that brings together faculty and researchers from the MIT Stephen A. Schwarzman College of Computing and beyond. The aim of the ICSR is to develop and harness computational tools that can help effect structural and normative change toward racial equity.

    The ICSR collaboration has separate project teams researching systemic racism in different sectors of society, including health care. Each of these “verticals” addresses different but interconnected issues, from sustainability to employment to gaming. Watkins is a part of two ICSR groups, policing and housing, that aim to better understand the processes that lead to discriminatory practices in both sectors. “Discrimination in housing contributes significantly to the racial wealth gap in the U.S.,” says Watkins.

    The policing team examines patterns in how different populations get policed. “There is obviously a significant and charged history to policing and race in America,” says Watkins. “This is an attempt to understand, to identify patterns, and note regional differences.”

    Watkins and the policing team are building models using data that details police interventions, responses, and race, among other variables. The ICSR is a good fit for this kind of research, says Watkins, who notes the interdisciplinary focus of both IDSS and the SCC. 

    “Systemic change requires a collaborative model and different expertise,” says Watkins. “We are trying to maximize influence and potential on the computational side, but we won’t get there with computation alone.”

    Opportunities for change

    Models can also predict outcomes, but Watkins is careful to point out that no algorithm alone will solve racial challenges.

    “Models in my view can inform policy and strategy that we as humans have to create. Computational models can inform and generate knowledge, but that doesn’t equate with change.” It takes additional work — and additional expertise in policy and advocacy — to use knowledge and insights to strive toward progress.

    One important lever of change, he argues, will be building a more AI-literate society through access to information and opportunities to understand AI and its impact in a more dynamic way. He hopes to see greater data rights and greater understanding of how societal systems impact our lives.

    “I was inspired by the response of younger people to the murders of George Floyd and Breonna Taylor,” he says. “Their tragic deaths shine a bright light on the real-world implications of structural racism and has forced the broader society to pay more attention to this issue, which creates more opportunities for change.” More

  • in

    Can machine-learning models overcome biased datasets?

    Artificial intelligence systems may be able to complete tasks quickly, but that doesn’t mean they always do so fairly. If the datasets used to train machine-learning models contain biased data, it is likely the system could exhibit that same bias when it makes decisions in practice.

    For instance, if a dataset contains mostly images of white men, then a facial-recognition model trained with these data may be less accurate for women or people with different skin tones.

    A group of researchers at MIT, in collaboration with researchers at Harvard University and Fujitsu Ltd., sought to understand when and how a machine-learning model is capable of overcoming this kind of dataset bias. They used an approach from neuroscience to study how training data affects whether an artificial neural network can learn to recognize objects it has not seen before. A neural network is a machine-learning model that mimics the human brain in the way it contains layers of interconnected nodes, or “neurons,” that process data.

    The new results show that diversity in training data has a major influence on whether a neural network is able to overcome bias, but at the same time dataset diversity can degrade the network’s performance. They also show that how a neural network is trained, and the specific types of neurons that emerge during the training process, can play a major role in whether it is able to overcome a biased dataset.

    “A neural network can overcome dataset bias, which is encouraging. But the main takeaway here is that we need to take into account data diversity. We need to stop thinking that if you just collect a ton of raw data, that is going to get you somewhere. We need to be very careful about how we design datasets in the first place,” says Xavier Boix, a research scientist in the Department of Brain and Cognitive Sciences (BCS) and the Center for Brains, Minds, and Machines (CBMM), and senior author of the paper.  

    Co-authors include former MIT graduate students Timothy Henry, Jamell Dozier, Helen Ho, Nishchal Bhandari, and Spandan Madan, a corresponding author who is currently pursuing a PhD at Harvard; Tomotake Sasaki, a former visiting scientist now a senior researcher at Fujitsu Research; Frédo Durand, a professor of electrical engineering and computer science at MIT and a member of the Computer Science and Artificial Intelligence Laboratory; and Hanspeter Pfister, the An Wang Professor of Computer Science at the Harvard School of Enginering and Applied Sciences. The research appears today in Nature Machine Intelligence.

    Thinking like a neuroscientist

    Boix and his colleagues approached the problem of dataset bias by thinking like neuroscientists. In neuroscience, Boix explains, it is common to use controlled datasets in experiments, meaning a dataset in which the researchers know as much as possible about the information it contains.

    The team built datasets that contained images of different objects in varied poses, and carefully controlled the combinations so some datasets had more diversity than others. In this case, a dataset had less diversity if it contains more images that show objects from only one viewpoint. A more diverse dataset had more images showing objects from multiple viewpoints. Each dataset contained the same number of images.

    The researchers used these carefully constructed datasets to train a neural network for image classification, and then studied how well it was able to identify objects from viewpoints the network did not see during training (known as an out-of-distribution combination). 

    For example, if researchers are training a model to classify cars in images, they want the model to learn what different cars look like. But if every Ford Thunderbird in the training dataset is shown from the front, when the trained model is given an image of a Ford Thunderbird shot from the side, it may misclassify it, even if it was trained on millions of car photos.

    The researchers found that if the dataset is more diverse — if more images show objects from different viewpoints — the network is better able to generalize to new images or viewpoints. Data diversity is key to overcoming bias, Boix says.

    “But it is not like more data diversity is always better; there is a tension here. When the neural network gets better at recognizing new things it hasn’t seen, then it will become harder for it to recognize things it has already seen,” he says.

    Testing training methods

    The researchers also studied methods for training the neural network.

    In machine learning, it is common to train a network to perform multiple tasks at the same time. The idea is that if a relationship exists between the tasks, the network will learn to perform each one better if it learns them together.

    But the researchers found the opposite to be true — a model trained separately for each task was able to overcome bias far better than a model trained for both tasks together.

    “The results were really striking. In fact, the first time we did this experiment, we thought it was a bug. It took us several weeks to realize it was a real result because it was so unexpected,” he says.

    They dove deeper inside the neural networks to understand why this occurs.

    They found that neuron specialization seems to play a major role. When the neural network is trained to recognize objects in images, it appears that two types of neurons emerge — one that specializes in recognizing the object category and another that specializes in recognizing the viewpoint.

    When the network is trained to perform tasks separately, those specialized neurons are more prominent, Boix explains. But if a network is trained to do both tasks simultaneously, some neurons become diluted and don’t specialize for one task. These unspecialized neurons are more likely to get confused, he says.

    “But the next question now is, how did these neurons get there? You train the neural network and they emerge from the learning process. No one told the network to include these types of neurons in its architecture. That is the fascinating thing,” he says.

    That is one area the researchers hope to explore with future work. They want to see if they can force a neural network to develop neurons with this specialization. They also want to apply their approach to more complex tasks, such as objects with complicated textures or varied illuminations.

    Boix is encouraged that a neural network can learn to overcome bias, and he is hopeful their work can inspire others to be more thoughtful about the datasets they are using in AI applications.

    This work was supported, in part, by the National Science Foundation, a Google Faculty Research Award, the Toyota Research Institute, the Center for Brains, Minds, and Machines, Fujitsu Research, and the MIT-Sensetime Alliance on Artificial Intelligence. More

  • in

    The downside of machine learning in health care

    While working toward her dissertation in computer science at MIT, Marzyeh Ghassemi wrote several papers on how machine-learning techniques from artificial intelligence could be applied to clinical data in order to predict patient outcomes. “It wasn’t until the end of my PhD work that one of my committee members asked: ‘Did you ever check to see how well your model worked across different groups of people?’”

    That question was eye-opening for Ghassemi, who had previously assessed the performance of models in aggregate, across all patients. Upon a closer look, she saw that models often worked differently — specifically worse — for populations including Black women, a revelation that took her by surprise. “I hadn’t made the connection beforehand that health disparities would translate directly to model disparities,” she says. “And given that I am a visible minority woman-identifying computer scientist at MIT, I am reasonably certain that many others weren’t aware of this either.”

    In a paper published Jan. 14 in the journal Patterns, Ghassemi — who earned her doctorate in 2017 and is now an assistant professor in the Department of Electrical Engineering and Computer Science and the MIT Institute for Medical Engineering and Science (IMES) — and her coauthor, Elaine Okanyene Nsoesie of Boston University, offer a cautionary note about the prospects for AI in medicine. “If used carefully, this technology could improve performance in health care and potentially reduce inequities,” Ghassemi says. “But if we’re not actually careful, technology could worsen care.”

    It all comes down to data, given that the AI tools in question train themselves by processing and analyzing vast quantities of data. But the data they are given are produced by humans, who are fallible and whose judgments may be clouded by the fact that they interact differently with patients depending on their age, gender, and race, without even knowing it.

    Furthermore, there is still great uncertainty about medical conditions themselves. “Doctors trained at the same medical school for 10 years can, and often do, disagree about a patient’s diagnosis,” Ghassemi says. That’s different from the applications where existing machine-learning algorithms excel — like object-recognition tasks — because practically everyone in the world will agree that a dog is, in fact, a dog.

    Machine-learning algorithms have also fared well in mastering games like chess and Go, where both the rules and the “win conditions” are clearly defined. Physicians, however, don’t always concur on the rules for treating patients, and even the win condition of being “healthy” is not widely agreed upon. “Doctors know what it means to be sick,” Ghassemi explains, “and we have the most data for people when they are sickest. But we don’t get much data from people when they are healthy because they’re less likely to see doctors then.”

    Even mechanical devices can contribute to flawed data and disparities in treatment. Pulse oximeters, for example, which have been calibrated predominately on light-skinned individuals, do not accurately measure blood oxygen levels for people with darker skin. And these deficiencies are most acute when oxygen levels are low — precisely when accurate readings are most urgent. Similarly, women face increased risks during “metal-on-metal” hip replacements, Ghassemi and Nsoesie write, “due in part to anatomic differences that aren’t taken into account in implant design.” Facts like these could be buried within the data fed to computer models whose output will be undermined as a result.

    Coming from computers, the product of machine-learning algorithms offers “the sheen of objectivity,” according to Ghassemi. But that can be deceptive and dangerous, because it’s harder to ferret out the faulty data supplied en masse to a computer than it is to discount the recommendations of a single possibly inept (and maybe even racist) doctor. “The problem is not machine learning itself,” she insists. “It’s people. Human caregivers generate bad data sometimes because they are not perfect.”

    Nevertheless, she still believes that machine learning can offer benefits in health care in terms of more efficient and fairer recommendations and practices. One key to realizing the promise of machine learning in health care is to improve the quality of data, which is no easy task. “Imagine if we could take data from doctors that have the best performance and share that with other doctors that have less training and experience,” Ghassemi says. “We really need to collect this data and audit it.”

    The challenge here is that the collection of data is not incentivized or rewarded, she notes. “It’s not easy to get a grant for that, or ask students to spend time on it. And data providers might say, ‘Why should I give my data out for free when I can sell it to a company for millions?’ But researchers should be able to access data without having to deal with questions like: ‘What paper will I get my name on in exchange for giving you access to data that sits at my institution?’

    “The only way to get better health care is to get better data,” Ghassemi says, “and the only way to get better data is to incentivize its release.”

    It’s not only a question of collecting data. There’s also the matter of who will collect it and vet it. Ghassemi recommends assembling diverse groups of researchers — clinicians, statisticians, medical ethicists, and computer scientists — to first gather diverse patient data and then “focus on developing fair and equitable improvements in health care that can be deployed in not just one advanced medical setting, but in a wide range of medical settings.”

    The objective of the Patterns paper is not to discourage technologists from bringing their expertise in machine learning to the medical world, she says. “They just need to be cognizant of the gaps that appear in treatment and other complexities that ought to be considered before giving their stamp of approval to a particular computer model.” More

  • in

    The promise and pitfalls of artificial intelligence explored at TEDxMIT event

    Scientists, students, and community members came together last month to discuss the promise and pitfalls of artificial intelligence at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) for the fourth TEDxMIT event held at MIT. 

    Attendees were entertained and challenged as they explored “the good and bad of computing,” explained CSAIL Director Professor Daniela Rus, who organized the event with John Werner, an MIT fellow and managing director of Link Ventures; MIT sophomore Lucy Zhao; and grad student Jessica Karaguesian. “As you listen to the talks today,” Rus told the audience, “consider how our world is made better by AI, and also our intrinsic responsibilities for ensuring that the technology is deployed for the greater good.”

    Rus mentioned some new capabilities that could be enabled by AI: an automated personal assistant that could monitor your sleep phases and wake you at the optimal time, as well as on-body sensors that monitor everything from your posture to your digestive system. “Intelligent assistance can help empower and augment our lives. But these intriguing possibilities should only be pursued if we can simultaneously resolve the challenges that these technologies bring,” said Rus. 

    The next speaker, CSAIL principal investigator and professor of electrical engineering and computer science Manolis Kellis, started off by suggesting what sounded like an unattainable goal — using AI to “put an end to evolution as we know it.” Looking at it from a computer science perspective, he said, what we call evolution is basically a brute force search. “You’re just exploring all of the search space, creating billions of copies of every one of your programs, and just letting them fight against each other. This is just brutal. And it’s also completely slow. It took us billions of years to get here.” Might it be possible, he asked, to speed up evolution and make it less messy?

    The answer, Kellis said, is that we can do better, and that we’re already doing better: “We’re not killing people like Sparta used to, throwing the weaklings off the mountain. We are truly saving diversity.”

    Knowledge, moreover, is now being widely shared, passed on “horizontally” through accessible information sources, he noted, rather than “vertically,” from parent to offspring. “I would like to argue that competition in the human species has been replaced by collaboration. Despite having a fixed cognitive hardware, we have software upgrades that are enabled by culture, by the 20 years that our children spend in school to fill their brains with everything that humanity has learned, regardless of which family came up with it. This is the secret of our great acceleration” — the fact that human advancement in recent centuries has vastly out-clipped evolution’s sluggish pace.

    The next step, Kellis said, is to harness insights about evolution in order to combat an individual’s genetic susceptibility to disease. “Our current approach is simply insufficient,” he added. “We’re treating manifestations of disease, not the causes of disease.” A key element in his lab’s ambitious strategy to transform medicine is to identify “the causal pathways through which genetic predisposition manifests. It’s only by understanding these pathways that we can truly manipulate disease causation and reverse the disease circuitry.” 

    Kellis was followed by Aleksander Madry, MIT professor of electrical engineering and computer science and CSAIL principal investigator, who told the crowd, “progress in AI is happening, and it’s happening fast.” Computer programs can routinely beat humans in games like chess, poker, and Go. So should we be worried about AI surpassing humans? 

    Madry, for one, is not afraid — or at least not yet. And some of that reassurance stems from research that has led him to the following conclusion: Despite its considerable success, AI, especially in the form of machine learning, is lazy. “Think about being lazy as this kind of smart student who doesn’t really want to study for an exam. Instead, what he does is just study all the past years’ exams and just look for patterns. Instead of trying to actually learn, he just tries to pass the test. And this is exactly the same way in which current AI is lazy.”

    A machine-learning model might recognize grazing sheep, for instance, simply by picking out pictures that have green grass in them. If a model is trained to identify fish from photos of anglers proudly displaying their catches, Madry explained, “the model figures out that if there’s a human holding something in the picture, I will just classify it as a fish.” The consequences can be more serious for an AI model intended to pick out malignant tumors. If the model is trained on images containing rulers that indicate the size of tumors, the model may end up selecting only those photos that have rulers in them.

    This leads to Madry’s biggest concerns about AI in its present form. “AI is beating us now,” he noted. “But the way it does it [involves] a little bit of cheating.” He fears that we will apply AI “in some way in which this mismatch between what the model actually does versus what we think it does will have some catastrophic consequences.” People relying on AI, especially in potentially life-or-death situations, need to be much more mindful of its current limitations, Madry cautioned.

    There were 10 speakers altogether, and the last to take the stage was MIT associate professor of electrical engineering and computer science and CSAIL principal investigator Marzyeh Ghassemi, who laid out her vision for how AI could best contribute to general health and well-being. But in order for that to happen, its models must be trained on accurate, diverse, and unbiased medical data.

    It’s important to focus on the data, Ghassemi stressed, because these models are learning from us. “Since our data is human-generated … a neural network is learning how to practice from a doctor. But doctors are human, and humans make mistakes. And if a human makes a mistake, and we train an AI from that, the AI will, too. Garbage in, garbage out. But it’s not like the garbage is distributed equally.”

    She pointed out that many subgroups receive worse care from medical practitioners, and members of these subgroups die from certain conditions at disproportionately high rates. This is an area, Ghassemi said, “where AI can actually help. This is something we can fix.” Her group is developing machine-learning models that are robust, private, and fair. What’s holding them back is neither algorithms nor GPUs. It’s data. Once we collect reliable data from diverse sources, Ghassemi added, we might start reaping the benefits that AI can bring to the realm of health care.

    In addition to CSAIL speakers, there were talks from members across MIT’s Institute for Data, Systems, and Society; the MIT Mobility Initiative; the MIT Media Lab; and the SENSEable City Lab.

    The proceedings concluded on that hopeful note. Rus and Werner then thanked everyone for coming. “Please continue to reflect about the good and bad of computing,” Rus urged. “And we look forward to seeing you back here in May for the next TEDxMIT event.”

    The exact theme of the spring 2022 gathering will have something to do with “superpowers.” But — if December’s mind-bending presentations were any indication — the May offering is almost certain to give its attendees plenty to think about. And maybe provide the inspiration for a startup or two. More

  • in

    Physics and the machine-learning “black box”

    Machine-learning algorithms are often referred to as a “black box.” Once data are put into an algorithm, it’s not always known exactly how the algorithm arrives at its prediction. This can be particularly frustrating when things go wrong. A new mechanical engineering (MechE) course at MIT teaches students how to tackle the “black box” problem, through a combination of data science and physics-based engineering.

    In class 2.C161 (Physical Systems Modeling and Design Using Machine Learning), Professor George Barbastathis demonstrates how mechanical engineers can use their unique knowledge of physical systems to keep algorithms in check and develop more accurate predictions.

    “I wanted to take 2.C161 because machine-learning models are usually a “black box,” but this class taught us how to construct a system model that is informed by physics so we can peek inside,” explains Crystal Owens, a mechanical engineering graduate student who took the course in spring 2021.

    As chair of the Committee on the Strategic Integration of Data Science into Mechanical Engineering, Barbastathis has had many conversations with mechanical engineering students, researchers, and faculty to better understand the challenges and successes they’ve had using machine learning in their work.

    “One comment we heard frequently was that these colleagues can see the value of data science methods for problems they are facing in their mechanical engineering-centric research; yet they are lacking the tools to make the most out of it,” says Barbastathis. “Mechanical, civil, electrical, and other types of engineers want a fundamental understanding of data principles without having to convert themselves to being full-time data scientists or AI researchers.”

    Additionally, as mechanical engineering students move on from MIT to their careers, many will need to manage data scientists on their teams someday. Barbastathis hopes to set these students up for success with class 2.C161.

    Bridging MechE and the MIT Schwartzman College of Computing

    Class 2.C161 is part of the MIT Schwartzman College of Computing “Computing Core.” The goal of these classes is to connect data science and physics-based engineering disciplines, like mechanical engineering. Students take the course alongside 6.C402 (Modeling with Machine Learning: from Algorithms to Applications), taught by professors of electrical engineering and computer science Regina Barzilay and Tommi Jaakkola.

    The two classes are taught concurrently during the semester, exposing students to both fundamentals in machine learning and domain-specific applications in mechanical engineering.

    In 2.C161, Barbastathis highlights how complementary physics-based engineering and data science are. Physical laws present a number of ambiguities and unknowns, ranging from temperature and humidity to electromagnetic forces. Data science can be used to predict these physical phenomena. Meanwhile, having an understanding of physical systems helps ensure the resulting output of an algorithm is accurate and explainable.

    “What’s needed is a deeper combined understanding of the associated physical phenomena and the principles of data science, machine learning in particular, to close the gap,” adds Barbastathis. “By combining data with physical principles, the new revolution in physics-based engineering is relatively immune to the “black box” problem facing other types of machine learning.”

    Equipped with a working knowledge of machine-learning topics covered in class 6.C402 and a deeper understanding of how to pair data science with physics, students are charged with developing a final project that solves for an actual physical system.

    Developing solutions for real-world physical systems

    For their final project, students in 2.C161 are asked to identify a real-world problem that requires data science to address the ambiguity inherent in physical systems. After obtaining all relevant data, students are asked to select a machine-learning method, implement their chosen solution, and present and critique the results.

    Topics this past semester ranged from weather forecasting to the flow of gas in combustion engines, with two student teams drawing inspiration from the ongoing Covid-19 pandemic.

    Owens and her teammates, fellow graduate students Arun Krishnadas and Joshua David John Rathinaraj, set out to develop a model for the Covid-19 vaccine rollout.

    “We developed a method of combining a neural network with a susceptible-infected-recovered (SIR) epidemiological model to create a physics-informed prediction system for the spread of Covid-19 after vaccinations started,” explains Owens.

    The team accounted for various unknowns including population mobility, weather, and political climate. This combined approach resulted in a prediction of Covid-19’s spread during the vaccine rollout that was more reliable than using either the SIR model or a neural network alone.

    Another team, including graduate student Yiwen Hu, developed a model to predict mutation rates in Covid-19, a topic that became all too pertinent as the delta variant began its global spread.

    “We used machine learning to predict the time-series-based mutation rate of Covid-19, and then incorporated that as an independent parameter into the prediction of pandemic dynamics to see if it could help us better predict the trend of the Covid-19 pandemic,” says Hu.

    Hu, who had previously conducted research into how vibrations on coronavirus protein spikes affect infection rates, hopes to apply the physics-based machine-learning approaches he learned in 2.C161 to his research on de novo protein design.

    Whatever the physical system students addressed in their final projects, Barbastathis was careful to stress one unifying goal: the need to assess ethical implications in data science. While more traditional computing methods like face or voice recognition have proven to be rife with ethical issues, there is an opportunity to combine physical systems with machine learning in a fair, ethical way.

    “We must ensure that collection and use of data are carried out equitably and inclusively, respecting the diversity in our society and avoiding well-known problems that computer scientists in the past have run into,” says Barbastathis.

    Barbastathis hopes that by encouraging mechanical engineering students to be both ethics-literate and well-versed in data science, they can move on to develop reliable, ethically sound solutions and predictions for physical-based engineering challenges. More

  • in

    Exact symbolic artificial intelligence for faster, better assessment of AI fairness

    The justice system, banks, and private companies use algorithms to make decisions that have profound impacts on people’s lives. Unfortunately, those algorithms are sometimes biased — disproportionately impacting people of color as well as individuals in lower income classes when they apply for loans or jobs, or even when courts decide what bail should be set while a person awaits trial.

    MIT researchers have developed a new artificial intelligence programming language that can assess the fairness of algorithms more exactly, and more quickly, than available alternatives.

    Their Sum-Product Probabilistic Language (SPPL) is a probabilistic programming system. Probabilistic programming is an emerging field at the intersection of programming languages and artificial intelligence that aims to make AI systems much easier to develop, with early successes in computer vision, common-sense data cleaning, and automated data modeling. Probabilistic programming languages make it much easier for programmers to define probabilistic models and carry out probabilistic inference — that is, work backward to infer probable explanations for observed data.

    “There are previous systems that can solve various fairness questions. Our system is not the first; but because our system is specialized and optimized for a certain class of models, it can deliver solutions thousands of times faster,” says Feras Saad, a PhD student in electrical engineering and computer science (EECS) and first author on a recent paper describing the work. Saad adds that the speedups are not insignificant: The system can be up to 3,000 times faster than previous approaches.

    SPPL gives fast, exact solutions to probabilistic inference questions such as “How likely is the model to recommend a loan to someone over age 40?” or “Generate 1,000 synthetic loan applicants, all under age 30, whose loans will be approved.” These inference results are based on SPPL programs that encode probabilistic models of what kinds of applicants are likely, a priori, and also how to classify them. Fairness questions that SPPL can answer include “Is there a difference between the probability of recommending a loan to an immigrant and nonimmigrant applicant with the same socioeconomic status?” or “What’s the probability of a hire, given that the candidate is qualified for the job and from an underrepresented group?”

    SPPL is different from most probabilistic programming languages, as SPPL only allows users to write probabilistic programs for which it can automatically deliver exact probabilistic inference results. SPPL also makes it possible for users to check how fast inference will be, and therefore avoid writing slow programs. In contrast, other probabilistic programming languages such as Gen and Pyro allow users to write down probabilistic programs where the only known ways to do inference are approximate — that is, the results include errors whose nature and magnitude can be hard to characterize.

    Error from approximate probabilistic inference is tolerable in many AI applications. But it is undesirable to have inference errors corrupting results in socially impactful applications of AI, such as automated decision-making, and especially in fairness analysis.

    Jean-Baptiste Tristan, associate professor at Boston College and former research scientist at Oracle Labs, who was not involved in the new research, says, “I’ve worked on fairness analysis in academia and in real-world, large-scale industry settings. SPPL offers improved flexibility and trustworthiness over other PPLs on this challenging and important class of problems due to the expressiveness of the language, its precise and simple semantics, and the speed and soundness of the exact symbolic inference engine.”

    SPPL avoids errors by restricting to a carefully designed class of models that still includes a broad class of AI algorithms, including the decision tree classifiers that are widely used for algorithmic decision-making. SPPL works by compiling probabilistic programs into a specialized data structure called a “sum-product expression.” SPPL further builds on the emerging theme of using probabilistic circuits as a representation that enables efficient probabilistic inference. This approach extends prior work on sum-product networks to models and queries expressed via a probabilistic programming language. However, Saad notes that this approach comes with limitations: “SPPL is substantially faster for analyzing the fairness of a decision tree, for example, but it can’t analyze models like neural networks. Other systems can analyze both neural networks and decision trees, but they tend to be slower and give inexact answers.”

    “SPPL shows that exact probabilistic inference is practical, not just theoretically possible, for a broad class of probabilistic programs,” says Vikash Mansinghka, an MIT principal research scientist and senior author on the paper. “In my lab, we’ve seen symbolic inference driving speed and accuracy improvements in other inference tasks that we previously approached via approximate Monte Carlo and deep learning algorithms. We’ve also been applying SPPL to probabilistic programs learned from real-world databases, to quantify the probability of rare events, generate synthetic proxy data given constraints, and automatically screen data for probable anomalies.”

    The new SPPL probabilistic programming language was presented in June at the ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI), in a paper that Saad co-authored with MIT EECS Professor Martin Rinard and Mansinghka. SPPL is implemented in Python and is available open source. More