More stories

  • in

    Living better with algorithms

    Laboratory for Information and Decision Systems (LIDS) student Sarah Cen remembers the lecture that sent her down the track to an upstream question.

    At a talk on ethical artificial intelligence, the speaker brought up a variation on the famous trolley problem, which outlines a philosophical choice between two undesirable outcomes.

    The speaker’s scenario: Say a self-driving car is traveling down a narrow alley with an elderly woman walking on one side and a small child on the other, and no way to thread between both without a fatality. Who should the car hit?

    Then the speaker said: Let’s take a step back. Is this the question we should even be asking?

    That’s when things clicked for Cen. Instead of considering the point of impact, a self-driving car could have avoided choosing between two bad outcomes by making a decision earlier on — the speaker pointed out that, when entering the alley, the car could have determined that the space was narrow and slowed to a speed that would keep everyone safe.

    Recognizing that today’s AI safety approaches often resemble the trolley problem, focusing on downstream regulation such as liability after someone is left with no good choices, Cen wondered: What if we could design better upstream and downstream safeguards to such problems? This question has informed much of Cen’s work.

    “Engineering systems are not divorced from the social systems on which they intervene,” Cen says. Ignoring this fact risks creating tools that fail to be useful when deployed or, more worryingly, that are harmful.

    Cen arrived at LIDS in 2018 via a slightly roundabout route. She first got a taste for research during her undergraduate degree at Princeton University, where she majored in mechanical engineering. For her master’s degree, she changed course, working on radar solutions in mobile robotics (primarily for self-driving cars) at Oxford University. There, she developed an interest in AI algorithms, curious about when and why they misbehave. So, she came to MIT and LIDS for her doctoral research, working with Professor Devavrat Shah in the Department of Electrical Engineering and Computer Science, for a stronger theoretical grounding in information systems.

    Auditing social media algorithms

    Together with Shah and other collaborators, Cen has worked on a wide range of projects during her time at LIDS, many of which tie directly to her interest in the interactions between humans and computational systems. In one such project, Cen studies options for regulating social media. Her recent work provides a method for translating human-readable regulations into implementable audits.

    To get a sense of what this means, suppose that regulators require that any public health content — for example, on vaccines — not be vastly different for politically left- and right-leaning users. How should auditors check that a social media platform complies with this regulation? Can a platform be made to comply with the regulation without damaging its bottom line? And how does compliance affect the actual content that users do see?

    Designing an auditing procedure is difficult in large part because there are so many stakeholders when it comes to social media. Auditors have to inspect the algorithm without accessing sensitive user data. They also have to work around tricky trade secrets, which can prevent them from getting a close look at the very algorithm that they are auditing because these algorithms are legally protected. Other considerations come into play as well, such as balancing the removal of misinformation with the protection of free speech.

    To meet these challenges, Cen and Shah developed an auditing procedure that does not need more than black-box access to the social media algorithm (which respects trade secrets), does not remove content (which avoids issues of censorship), and does not require access to users (which preserves users’ privacy).

    In their design process, the team also analyzed the properties of their auditing procedure, finding that it ensures a desirable property they call decision robustness. As good news for the platform, they show that a platform can pass the audit without sacrificing profits. Interestingly, they also found the audit naturally incentivizes the platform to show users diverse content, which is known to help reduce the spread of misinformation, counteract echo chambers, and more.

    Who gets good outcomes and who gets bad ones?

    In another line of research, Cen looks at whether people can receive good long-term outcomes when they not only compete for resources, but also don’t know upfront what resources are best for them.

    Some platforms, such as job-search platforms or ride-sharing apps, are part of what is called a matching market, which uses an algorithm to match one set of individuals (such as workers or riders) with another (such as employers or drivers). In many cases, individuals have matching preferences that they learn through trial and error. In labor markets, for example, workers learn their preferences about what kinds of jobs they want, and employers learn their preferences about the qualifications they seek from workers.

    But learning can be disrupted by competition. If workers with a particular background are repeatedly denied jobs in tech because of high competition for tech jobs, for instance, they may never get the knowledge they need to make an informed decision about whether they want to work in tech. Similarly, tech employers may never see and learn what these workers could do if they were hired.

    Cen’s work examines this interaction between learning and competition, studying whether it is possible for individuals on both sides of the matching market to walk away happy.

    Modeling such matching markets, Cen and Shah found that it is indeed possible to get to a stable outcome (workers aren’t incentivized to leave the matching market), with low regret (workers are happy with their long-term outcomes), fairness (happiness is evenly distributed), and high social welfare.

    Interestingly, it’s not obvious that it’s possible to get stability, low regret, fairness, and high social welfare simultaneously.  So another important aspect of the research was uncovering when it is possible to achieve all four criteria at once and exploring the implications of those conditions.

    What is the effect of X on Y?

    For the next few years, though, Cen plans to work on a new project, studying how to quantify the effect of an action X on an outcome Y when it’s expensive — or impossible — to measure this effect, focusing in particular on systems that have complex social behaviors.

    For instance, when Covid-19 cases surged in the pandemic, many cities had to decide what restrictions to adopt, such as mask mandates, business closures, or stay-home orders. They had to act fast and balance public health with community and business needs, public spending, and a host of other considerations.

    Typically, in order to estimate the effect of restrictions on the rate of infection, one might compare the rates of infection in areas that underwent different interventions. If one county has a mask mandate while its neighboring county does not, one might think comparing the counties’ infection rates would reveal the effectiveness of mask mandates. 

    But of course, no county exists in a vacuum. If, for instance, people from both counties gather to watch a football game in the maskless county every week, people from both counties mix. These complex interactions matter, and Sarah plans to study questions of cause and effect in such settings.

    “We’re interested in how decisions or interventions affect an outcome of interest, such as how criminal justice reform affects incarceration rates or how an ad campaign might change the public’s behaviors,” Cen says.

    Cen has also applied the principles of promoting inclusivity to her work in the MIT community.

    As one of three co-presidents of the Graduate Women in MIT EECS student group, she helped organize the inaugural GW6 research summit featuring the research of women graduate students — not only to showcase positive role models to students, but also to highlight the many successful graduate women at MIT who are not to be underestimated.

    Whether in computing or in the community, a system taking steps to address bias is one that enjoys legitimacy and trust, Cen says. “Accountability, legitimacy, trust — these principles play crucial roles in society and, ultimately, will determine which systems endure with time.”  More

  • in

    On the road to cleaner, greener, and faster driving

    No one likes sitting at a red light. But signalized intersections aren’t just a minor nuisance for drivers; vehicles consume fuel and emit greenhouse gases while waiting for the light to change.

    What if motorists could time their trips so they arrive at the intersection when the light is green? While that might be just a lucky break for a human driver, it could be achieved more consistently by an autonomous vehicle that uses artificial intelligence to control its speed.

    In a new study, MIT researchers demonstrate a machine-learning approach that can learn to control a fleet of autonomous vehicles as they approach and travel through a signalized intersection in a way that keeps traffic flowing smoothly.

    Using simulations, they found that their approach reduces fuel consumption and emissions while improving average vehicle speed. The technique gets the best results if all cars on the road are autonomous, but even if only 25 percent use their control algorithm, it still leads to substantial fuel and emissions benefits.

    “This is a really interesting place to intervene. No one’s life is better because they were stuck at an intersection. With a lot of other climate change interventions, there is a quality-of-life difference that is expected, so there is a barrier to entry there. Here, the barrier is much lower,” says senior author Cathy Wu, the Gilbert W. Winslow Career Development Assistant Professor in the Department of Civil and Environmental Engineering and a member of the Institute for Data, Systems, and Society (IDSS) and the Laboratory for Information and Decision Systems (LIDS).

    The lead author of the study is Vindula Jayawardana, a graduate student in LIDS and the Department of Electrical Engineering and Computer Science. The research will be presented at the European Control Conference.

    Intersection intricacies

    While humans may drive past a green light without giving it much thought, intersections can present billions of different scenarios depending on the number of lanes, how the signals operate, the number of vehicles and their speeds, the presence of pedestrians and cyclists, etc.

    Typical approaches for tackling intersection control problems use mathematical models to solve one simple, ideal intersection. That looks good on paper, but likely won’t hold up in the real world, where traffic patterns are often about as messy as they come.

    Wu and Jayawardana shifted gears and approached the problem using a model-free technique known as deep reinforcement learning. Reinforcement learning is a trial-and-error method where the control algorithm learns to make a sequence of decisions. It is rewarded when it finds a good sequence. With deep reinforcement learning, the algorithm leverages assumptions learned by a neural network to find shortcuts to good sequences, even if there are billions of possibilities.

    This is useful for solving a long-horizon problem like this; the control algorithm must issue upwards of 500 acceleration instructions to a vehicle over an extended time period, Wu explains.

    “And we have to get the sequence right before we know that we have done a good job of mitigating emissions and getting to the intersection at a good speed,” she adds.

    But there’s an additional wrinkle. The researchers want the system to learn a strategy that reduces fuel consumption and limits the impact on travel time. These goals can be conflicting.

    “To reduce travel time, we want the car to go fast, but to reduce emissions, we want the car to slow down or not move at all. Those competing rewards can be very confusing to the learning agent,” Wu says.

    While it is challenging to solve this problem in its full generality, the researchers employed a workaround using a technique known as reward shaping. With reward shaping, they give the system some domain knowledge it is unable to learn on its own. In this case, they penalized the system whenever the vehicle came to a complete stop, so it would learn to avoid that action.

    Traffic tests

    Once they developed an effective control algorithm, they evaluated it using a traffic simulation platform with a single intersection. The control algorithm is applied to a fleet of connected autonomous vehicles, which can communicate with upcoming traffic lights to receive signal phase and timing information and observe their immediate surroundings. The control algorithm tells each vehicle how to accelerate and decelerate.

    Their system didn’t create any stop-and-go traffic as vehicles approached the intersection. (Stop-and-go traffic occurs when cars are forced to come to a complete stop due to stopped traffic ahead). In simulations, more cars made it through in a single green phase, which outperformed a model that simulates human drivers. When compared to other optimization methods also designed to avoid stop-and-go traffic, their technique resulted in larger fuel consumption and emissions reductions. If every vehicle on the road is autonomous, their control system can reduce fuel consumption by 18 percent and carbon dioxide emissions by 25 percent, while boosting travel speeds by 20 percent.

    “A single intervention having 20 to 25 percent reduction in fuel or emissions is really incredible. But what I find interesting, and was really hoping to see, is this non-linear scaling. If we only control 25 percent of vehicles, that gives us 50 percent of the benefits in terms of fuel and emissions reduction. That means we don’t have to wait until we get to 100 percent autonomous vehicles to get benefits from this approach,” she says.

    Down the road, the researchers want to study interaction effects between multiple intersections. They also plan to explore how different intersection set-ups (number of lanes, signals, timings, etc.) can influence travel time, emissions, and fuel consumption. In addition, they intend to study how their control system could impact safety when autonomous vehicles and human drivers share the road. For instance, even though autonomous vehicles may drive differently than human drivers, slower roadways and roadways with more consistent speeds could improve safety, Wu says.

    While this work is still in its early stages, Wu sees this approach as one that could be more feasibly implemented in the near-term.

    “The aim in this work is to move the needle in sustainable mobility. We want to dream, as well, but these systems are big monsters of inertia. Identifying points of intervention that are small changes to the system but have significant impact is something that gets me up in the morning,” she says.  

    This work was supported, in part, by the MIT-IBM Watson AI Lab. More

  • in

    System helps severely motor-impaired individuals type more quickly and accurately

    In 1995, French fashion magazine editor Jean-Dominique Bauby suffered a seizure while driving a car, which left him with a condition known as locked-in syndrome, a neurological disease in which the patient is completely paralyzed and can only move muscles that control the eyes.

    Bauby, who had signed a book contract shortly before his accident, wrote the memoir “The Diving Bell and the Butterfly” using a dictation system in which his speech therapist recited the alphabet and he would blink when she said the correct letter. They wrote the 130-page book one blink at a time.

    Technology has come a long way since Bauby’s accident. Many individuals with severe motor impairments caused by locked-in syndrome, cerebral palsy, amyotrophic lateral sclerosis, or other conditions can communicate using computer interfaces where they select letters or words in an onscreen grid by activating a single switch, often by pressing a button, releasing a puff of air, or blinking.

    But these row-column scanning systems are very rigid, and, similar to the technique used by Bauby’s speech therapist, they highlight each option one at a time, making them frustratingly slow for some users. And they are not suitable for tasks where options can’t be arranged in a grid, like drawing, browsing the web, or gaming.

    A more flexible system being developed by researchers at MIT places individual selection indicators next to each option on a computer screen. The indicators can be placed anywhere — next to anything someone might click with a mouse — so a user does not need to cycle through a grid of choices to make selections. The system, called Nomon, incorporates probabilistic reasoning to learn how users make selections, and then adjusts the interface to improve their speed and accuracy.

    Participants in a user study were able to type faster using Nomon than with a row-column scanning system. The users also performed better on a picture selection task, demonstrating how Nomon could be used for more than typing.

    “It is so cool and exciting to be able to develop software that has the potential to really help people. Being able to find those signals and turn them into communication as we are used to it is a really interesting problem,” says senior author Tamara Broderick, an associate professor in the MIT Department of Electrical Engineering and Computer Science (EECS) and a member of the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society.

    Joining Broderick on the paper are lead author Nicholas Bonaker, an EECS graduate student; Emli-Mari Nel, head of innovation and machine learning at Averly and a visiting lecturer at the University of Witwatersrand in South Africa; and Keith Vertanen, an associate professor at Michigan Tech. The research is being presented at the ACM Conference on Human Factors in Computing Systems.

    On the clock

    In the Nomon interface, a small analog clock is placed next to every option the user can select. (A gnomon is the part of a sundial that casts a shadow.) The user looks at one option and then clicks their switch when that clock’s hand passes a red “noon” line. After each click, the system changes the phases of the clocks to separate the most probable next targets. The user clicks repeatedly until their target is selected.

    When used as a keyboard, Nomon’s machine-learning algorithms try to guess the next word based on previous words and each new letter as the user makes selections.

    Broderick developed a simplified version of Nomon several years ago but decided to revisit it to make the system easier for motor-impaired individuals to use. She enlisted the help of then-undergraduate Bonaker to redesign the interface.

    They first consulted nonprofit organizations that work with motor-impaired individuals, as well as a motor-impaired switch user, to gather feedback on the Nomon design.

    Then they designed a user study that would better represent the abilities of motor-impaired individuals. They wanted to make sure to thoroughly vet the system before using much of the valuable time of motor-impaired users, so they first tested on non-switch users, Broderick explains.

    Switching up the switch

    To gather more representative data, Bonaker devised a webcam-based switch that was harder to use than simply clicking a key. The non-switch users had to lean their bodies to one side of the screen and then back to the other side to register a click.

    “And they have to do this at precisely the right time, so it really slows them down. We did some empirical studies which showed that they were much closer to the response times of motor-impaired individuals,” Broderick says.

    They ran a 10-session user study with 13 non-switch participants and one single-switch user with an advanced form of spinal muscular dystrophy. In the first nine sessions, participants used Nomon and a row-column scanning interface for 20 minutes each to perform text entry, and in the 10th session they used the two systems for a picture selection task.

    Non-switch users typed 15 percent faster using Nomon, while the motor-impaired user typed even faster than the non-switch users. When typing unfamiliar words, the users were 20 percent faster overall and made half as many errors. In their final session, they were able to complete the picture selection task 36 percent faster using Nomon.

    “Nomon is much more forgiving than row-column scanning. With row-column scanning, even if you are just slightly off, now you’ve chosen B instead of A and that’s an error,” Broderick says.

    Adapting to noisy clicks

    With its probabilistic reasoning, Nomon incorporates everything it knows about where a user is likely to click to make the process faster, easier, and less error-prone. For instance, if the user selects “Q,” Nomon will make it as easy as possible for the user to select “U” next.

    Nomon also learns how a user clicks. So, if the user always clicks a little after the clock’s hand strikes noon, the system adapts to that in real time. It also adapts to noisiness. If a user’s click is often off the mark, the system requires extra clicks to ensure accuracy.

    This probabilistic reasoning makes Nomon powerful but also requires a higher click-load than row-column scanning systems. Clicking multiple times can be a trying task for severely motor-impaired users.

    Broderick hopes to reduce the click-load by incorporating gaze tracking into Nomon, which would give the system more robust information about what a user might choose next based on which part of the screen they are looking at. The researchers also want to find a better way to automatically adjust the clock speeds to help users be more accurate and efficient.

    They are working on a new series of studies in which they plan to partner with more motor-impaired users.

    “So far, the feedback from motor-impaired users has been invaluable to us; we’re very grateful to the motor-impaired user who commented on our initial interface and the separate motor-impaired user who participated in our study. We’re currently extending our study to work with a bigger and more diverse group of our target population. With their help, we’re already making further improvements to our interface and working to better understand the performance of Nomon,” she says.

    “Nonspeaking individuals with motor disabilities are currently not provided with efficient communication solutions for interacting with either speaking partners or computer systems. This ‘communication gap’ is a known unresolved problem in human-computer interaction, and so far there are no good solutions. This paper demonstrates that a highly creative approach underpinned by a statistical model can provide tangible performance gains to the users who need it the most: nonspeaking individuals reliant on a single switch to communicate,” says Per Ola Kristensson, professor of interactive systems engineering at Cambridge University, who was not involved with this research. “The paper also demonstrates the value of complementing insights from computational experiments with the involvement of end-users and other stakeholders in the design process. I find this a highly creative and important paper in an area where it is notoriously difficult to make significant progress.”

    This research was supported, in part, by the Seth Teller Memorial Fund to Advanced Technology for People with Disabilities, a Peter J. Eloranta Summer Undergraduate Research Fellowship, the MIT Quest for Intelligence, and the National Science Foundation. More

  • in

    A tool for predicting the future

    Whether someone is trying to predict tomorrow’s weather, forecast future stock prices, identify missed opportunities for sales in retail, or estimate a patient’s risk of developing a disease, they will likely need to interpret time-series data, which are a collection of observations recorded over time.

    Making predictions using time-series data typically requires several data-processing steps and the use of complex machine-learning algorithms, which have such a steep learning curve they aren’t readily accessible to nonexperts.

    To make these powerful tools more user-friendly, MIT researchers developed a system that directly integrates prediction functionality on top of an existing time-series database. Their simplified interface, which they call tspDB (time series predict database), does all the complex modeling behind the scenes so a nonexpert can easily generate a prediction in only a few seconds.

    The new system is more accurate and more efficient than state-of-the-art deep learning methods when performing two tasks: predicting future values and filling in missing data points.

    One reason tspDB is so successful is that it incorporates a novel time-series-prediction algorithm, explains electrical engineering and computer science (EECS) graduate student Abdullah Alomar, an author of a recent research paper in which he and his co-authors describe the algorithm. This algorithm is especially effective at making predictions on multivariate time-series data, which are data that have more than one time-dependent variable. In a weather database, for instance, temperature, dew point, and cloud cover each depend on their past values.

    The algorithm also estimates the volatility of a multivariate time series to provide the user with a confidence level for its predictions.

    “Even as the time-series data becomes more and more complex, this algorithm can effectively capture any time-series structure out there. It feels like we have found the right lens to look at the model complexity of time-series data,” says senior author Devavrat Shah, the Andrew and Erna Viterbi Professor in EECS and a member of the Institute for Data, Systems, and Society and of the Laboratory for Information and Decision Systems.

    Joining Alomar and Shah on the paper is lead author Anish Agrawal, a former EECS graduate student who is currently a postdoc at the Simons Institute at the University of California at Berkeley. The research will be presented at the ACM SIGMETRICS conference.

    Adapting a new algorithm

    Shah and his collaborators have been working on the problem of interpreting time-series data for years, adapting different algorithms and integrating them into tspDB as they built the interface.

    About four years ago, they learned about a particularly powerful classical algorithm, called singular spectrum analysis (SSA), that imputes and forecasts single time series. Imputation is the process of replacing missing values or correcting past values. While this algorithm required manual parameter selection, the researchers suspected it could enable their interface to make effective predictions using time series data. In earlier work, they removed this need to manually intervene for algorithmic implementation.  

    The algorithm for single time series transformed it into a matrix and utilized matrix estimation procedures. The key intellectual challenge was how to adapt it to utilize multiple time series.  After a few years of struggle, they realized the answer was something very simple: “Stack” the matrices for each individual time series, treat it as a one big matrix, and then apply the single time-series algorithm on it.

    This utilizes information across multiple time series naturally — both across the time series and across time, which they describe in their new paper.

    This recent publication also discusses interesting alternatives, where instead of transforming the multivariate time series into a big matrix, it is viewed as a three-dimensional tensor. A tensor is a multi-dimensional array, or grid, of numbers. This established a promising connection between the classical field of time series analysis and the growing field of tensor estimation, Alomar says.

    “The variant of mSSA that we introduced actually captures all of that beautifully. So, not only does it provide the most likely estimation, but a time-varying confidence interval, as well,” Shah says.

    The simpler, the better

    They tested the adapted mSSA against other state-of-the-art algorithms, including deep-learning methods, on real-world time-series datasets with inputs drawn from the electricity grid, traffic patterns, and financial markets.

    Their algorithm outperformed all the others on imputation and it outperformed all but one of the other algorithms when it came to forecasting future values. The researchers also demonstrated that their tweaked version of mSSA can be applied to any kind of time-series data.

    “One reason I think this works so well is that the model captures a lot of time series dynamics, but at the end of the day, it is still a simple model. When you are working with something simple like this, instead of a neural network that can easily overfit the data, you can actually perform better,” Alomar says.

    The impressive performance of mSSA is what makes tspDB so effective, Shah explains. Now, their goal is to make this algorithm accessible to everyone.

    One a user installs tspDB on top of an existing database, they can run a prediction query with just a few keystrokes in about 0.9 milliseconds, as compared to 0.5 milliseconds for a standard search query. The confidence intervals are also designed to help nonexperts to make a more informed decision by incorporating the degree of uncertainty of the predictions into their decision making.

    For instance, the system could enable a nonexpert to predict future stock prices with high accuracy in just a few minutes, even if the time-series dataset contains missing values.

    Now that the researchers have shown why mSSA works so well, they are targeting new algorithms that can be incorporated into tspDB. One of these algorithms utilizes the same model to automatically enable change point detection, so if the user believes their time series will change its behavior at some point, the system will automatically detect that change and incorporate that into its predictions.

    They also want to continue gathering feedback from current tspDB users to see how they can improve the system’s functionality and user-friendliness, Shah says.

    “Our interest at the highest level is to make tspDB a success in the form of a broadly utilizable, open-source system. Time-series data are very important, and this is a beautiful concept of actually building prediction functionalities directly into the database. It has never been done before, and so we want to make sure the world uses it,” he says.

    “This work is very interesting for a number of reasons. It provides a practical variant of mSSA which requires no hand tuning, they provide the first known analysis of mSSA, and the authors demonstrate the real-world value of their algorithm by being competitive with or out-performing several known algorithms for imputations and predictions in (multivariate) time series for several real-world data sets,” says Vishal Misra, a professor of computer science at Columbia University who was not involved with this research. “At the heart of it all is the beautiful modeling work where they cleverly exploit correlations across time (within a time series) and space (across time series) to create a low-rank spatiotemporal factor representation of a multivariate time series. Importantly this model connects the field of time series analysis to that of the rapidly evolving topic of tensor completion, and I expect a lot of follow-on research spurred by this paper.” More

  • in

    Systems scientists find clues to why false news snowballs on social media

    The spread of misinformation on social media is a pressing societal problem that tech companies and policymakers continue to grapple with, yet those who study this issue still don’t have a deep understanding of why and how false news spreads.

    To shed some light on this murky topic, researchers at MIT developed a theoretical model of a Twitter-like social network to study how news is shared and explore situations where a non-credible news item will spread more widely than the truth. Agents in the model are driven by a desire to persuade others to take on their point of view: The key assumption in the model is that people bother to share something with their followers if they think it is persuasive and likely to move others closer to their mindset. Otherwise they won’t share.

    The researchers found that in such a setting, when a network is highly connected or the views of its members are sharply polarized, news that is likely to be false will spread more widely and travel deeper into the network than news with higher credibility.

    This theoretical work could inform empirical studies of the relationship between news credibility and the size of its spread, which might help social media companies adapt networks to limit the spread of false information.

    “We show that, even if people are rational in how they decide to share the news, this could still lead to the amplification of information with low credibility. With this persuasion motive, no matter how extreme my beliefs are — given that the more extreme they are the more I gain by moving others’ opinions — there is always someone who would amplify [the information],” says senior author Ali Jadbabaie, professor and head of the Department of Civil and Environmental Engineering and a core faculty member of the Institute for Data, Systems, and Society (IDSS) and a principal investigator in the Laboratory for Information and Decision Systems (LIDS).

    Joining Jadbabaie on the paper are first author Chin-Chia Hsu, a graduate student in the Social and Engineering Systems program in IDSS, and Amir Ajorlou, a LIDS research scientist. The research will be presented this week at the IEEE Conference on Decision and Control.

    Pondering persuasion

    This research draws on a 2018 study by Sinan Aral, the David Austin Professor of Management at the MIT Sloan School of Management; Deb Roy, an associate professor of media arts and sciences at the Media Lab; and former postdoc Soroush Vosoughi (now an assistant professor of computer science at Dartmouth University). Their empirical study of data from Twitter found that false news spreads wider, faster, and deeper than real news.

    Jadbabaie and his collaborators wanted to drill down on why this occurs.

    They hypothesized that persuasion might be a strong motive for sharing news — perhaps agents in the network want to persuade others to take on their point of view — and decided to build a theoretical model that would let them explore this possibility.

    In their model, agents have some prior belief about a policy, and their goal is to persuade followers to move their beliefs closer to the agent’s side of the spectrum.

    A news item is initially released to a small, random subgroup of agents, which must decide whether to share this news with their followers. An agent weighs the newsworthiness of the item and its credibility, and updates its belief based on how surprising or convincing the news is. 

    “They will make a cost-benefit analysis to see if, on average, this piece of news will move people closer to what they think or move them away. And we include a nominal cost for sharing. For instance, taking some action, if you are scrolling on social media, you have to stop to do that. Think of that as a cost. Or a reputation cost might come if I share something that is embarrassing. Everyone has this cost, so the more extreme and the more interesting the news is, the more you want to share it,” Jadbabaie says.

    If the news affirms the agent’s perspective and has persuasive power that outweighs the nominal cost, the agent will always share the news. But if an agent thinks the news item is something others may have already seen, the agent is disincentivized to share it.

    Since an agent’s willingness to share news is a product of its perspective and how persuasive the news is, the more extreme an agent’s perspective or the more surprising the news, the more likely the agent will share it.

    The researchers used this model to study how information spreads during a news cascade, which is an unbroken sharing chain that rapidly permeates the network.

    Connectivity and polarization

    The team found that when a network has high connectivity and the news is surprising, the credibility threshold for starting a news cascade is lower. High connectivity means that there are multiple connections between many users in the network.

    Likewise, when the network is largely polarized, there are plenty of agents with extreme views who want to share the news item, starting a news cascade. In both these instances, news with low credibility creates the largest cascades.

    “For any piece of news, there is a natural network speed limit, a range of connectivity, that facilitates good transmission of information where the size of the cascade is maximized by true news. But if you exceed that speed limit, you will get into situations where inaccurate news or news with low credibility has a larger cascade size,” Jadbabaie says.

    If the views of users in the network become more diverse, it is less likely that a poorly credible piece of news will spread more widely than the truth.

    Jadbabaie and his colleagues designed the agents in the network to behave rationally, so the model would better capture actions real humans might take if they want to persuade others.

    “Someone might say that is not why people share, and that is valid. Why people do certain things is a subject of intense debate in cognitive science, social psychology, neuroscience, economics, and political science,” he says. “Depending on your assumptions, you end up getting different results. But I feel like this assumption of persuasion being the motive is a natural assumption.”

    Their model also shows how costs can be manipulated to reduce the spread of false information. Agents make a cost-benefit analysis and won’t share news if the cost to do so outweighs the benefit of sharing.

    “We don’t make any policy prescriptions, but one thing this work suggests is that, perhaps, having some cost associated with sharing news is not a bad idea. The reason you get lots of these cascades is because the cost of sharing the news is actually very low,” he says.

    This work was supported by an Army Research Office Multidisciplinary University Research Initiative grant and a Vannevar Bush Fellowship from the Office of the Secretary of Defense. More

  • in

    Machine learning speeds up vehicle routing

    Waiting for a holiday package to be delivered? There’s a tricky math problem that needs to be solved before the delivery truck pulls up to your door, and MIT researchers have a strategy that could speed up the solution.

    The approach applies to vehicle routing problems such as last-mile delivery, where the goal is to deliver goods from a central depot to multiple cities while keeping travel costs down. While there are algorithms designed to solve this problem for a few hundred cities, these solutions become too slow when applied to a larger set of cities.

    To remedy this, Cathy Wu, the Gilbert W. Winslow Career Development Assistant Professor in Civil and Environmental Engineering and the Institute for Data, Systems, and Society, and her students have come up with a machine-learning strategy that accelerates some of the strongest algorithmic solvers by 10 to 100 times.

    The solver algorithms work by breaking up the problem of delivery into smaller subproblems to solve — say, 200 subproblems for routing vehicles between 2,000 cities. Wu and her colleagues augment this process with a new machine-learning algorithm that identifies the most useful subproblems to solve, instead of solving all the subproblems, to increase the quality of the solution while using orders of magnitude less compute.

    Their approach, which they call “learning-to-delegate,” can be used across a variety of solvers and a variety of similar problems, including scheduling and pathfinding for warehouse robots, the researchers say.

    The work pushes the boundaries on rapidly solving large-scale vehicle routing problems, says Marc Kuo, founder and CEO of Routific, a smart logistics platform for optimizing delivery routes. Some of Routific’s recent algorithmic advances were inspired by Wu’s work, he notes.

    “Most of the academic body of research tends to focus on specialized algorithms for small problems, trying to find better solutions at the cost of processing times. But in the real-world, businesses don’t care about finding better solutions, especially if they take too long for compute,” Kuo explains. “In the world of last-mile logistics, time is money, and you cannot have your entire warehouse operations wait for a slow algorithm to return the routes. An algorithm needs to be hyper-fast for it to be practical.”

    Wu, social and engineering systems doctoral student Sirui Li, and electrical engineering and computer science doctoral student Zhongxia Yan presented their research this week at the 2021 NeurIPS conference.

    Selecting good problems

    Vehicle routing problems are a class of combinatorial problems, which involve using heuristic algorithms to find “good-enough solutions” to the problem. It’s typically not possible to come up with the one “best” answer to these problems, because the number of possible solutions is far too huge.

    “The name of the game for these types of problems is to design efficient algorithms … that are optimal within some factor,” Wu explains. “But the goal is not to find optimal solutions. That’s too hard. Rather, we want to find as good of solutions as possible. Even a 0.5% improvement in solutions can translate to a huge revenue increase for a company.”

    Over the past several decades, researchers have developed a variety of heuristics to yield quick solutions to combinatorial problems. They usually do this by starting with a poor but valid initial solution and then gradually improving the solution — by trying small tweaks to improve the routing between nearby cities, for example. For a large problem like a 2,000-plus city routing challenge, however, this approach just takes too much time.

    More recently, machine-learning methods have been developed to solve the problem, but while faster, they tend to be more inaccurate, even at the scale of a few dozen cities. Wu and her colleagues decided to see if there was a beneficial way to combine the two methods to find speedy but high-quality solutions.

    “For us, this is where machine learning comes in,” Wu says. “Can we predict which of these subproblems, that if we were to solve them, would lead to more improvement in the solution, saving computing time and expense?”

    Traditionally, a large-scale vehicle routing problem heuristic might choose the subproblems to solve in which order either randomly or by applying yet another carefully devised heuristic. In this case, the MIT researchers ran sets of subproblems through a neural network they created to automatically find the subproblems that, when solved, would lead to the greatest gain in quality of the solutions. This process sped up subproblem selection process by 1.5 to 2 times, Wu and colleagues found.

    “We don’t know why these subproblems are better than other subproblems,” Wu notes. “It’s actually an interesting line of future work. If we did have some insights here, these could lead to designing even better algorithms.”

    Surprising speed-up

    Wu and colleagues were surprised by how well the approach worked. In machine learning, the idea of garbage-in, garbage-out applies — that is, the quality of a machine-learning approach relies heavily on the quality of the data. A combinatorial problem is so difficult that even its subproblems can’t be optimally solved. A neural network trained on the “medium-quality” subproblem solutions available as the input data “would typically give medium-quality results,” says Wu. In this case, however, the researchers were able to leverage the medium-quality solutions to achieve high-quality results, significantly faster than state-of-the-art methods.

    For vehicle routing and similar problems, users often must design very specialized algorithms to solve their specific problem. Some of these heuristics have been in development for decades.

    The learning-to-delegate method offers an automatic way to accelerate these heuristics for large problems, no matter what the heuristic or — potentially — what the problem.

    Since the method can work with a variety of solvers, it may be useful for a variety of resource allocation problems, says Wu. “We may unlock new applications that now will be possible because the cost of solving the problem is 10 to 100 times less.”

    The research was supported by MIT Indonesia Seed Fund, U.S. Department of Transportation Dwight David Eisenhower Transportation Fellowship Program, and the MIT-IBM Watson AI Lab. More

  • in

    Avoiding shortcut solutions in artificial intelligence

    If your Uber driver takes a shortcut, you might get to your destination faster. But if a machine learning model takes a shortcut, it might fail in unexpected ways.

    In machine learning, a shortcut solution occurs when the model relies on a simple characteristic of a dataset to make a decision, rather than learning the true essence of the data, which can lead to inaccurate predictions. For example, a model might learn to identify images of cows by focusing on the green grass that appears in the photos, rather than the more complex shapes and patterns of the cows.  

    A new study by researchers at MIT explores the problem of shortcuts in a popular machine-learning method and proposes a solution that can prevent shortcuts by forcing the model to use more data in its decision-making.

    By removing the simpler characteristics the model is focusing on, the researchers force it to focus on more complex features of the data that it hadn’t been considering. Then, by asking the model to solve the same task two ways — once using those simpler features, and then also using the complex features it has now learned to identify — they reduce the tendency for shortcut solutions and boost the performance of the model.

    One potential application of this work is to enhance the effectiveness of machine learning models that are used to identify disease in medical images. Shortcut solutions in this context could lead to false diagnoses and have dangerous implications for patients.

    “It is still difficult to tell why deep networks make the decisions that they do, and in particular, which parts of the data these networks choose to focus upon when making a decision. If we can understand how shortcuts work in further detail, we can go even farther to answer some of the fundamental but very practical questions that are really important to people who are trying to deploy these networks,” says Joshua Robinson, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author of the paper.

    Robinson wrote the paper with his advisors, senior author Suvrit Sra, the Esther and Harold E. Edgerton Career Development Associate Professor in the Department of Electrical Engineering and Computer Science (EECS) and a core member of the Institute for Data, Systems, and Society (IDSS) and the Laboratory for Information and Decision Systems; and Stefanie Jegelka, the X-Consortium Career Development Associate Professor in EECS and a member of CSAIL and IDSS; as well as University of Pittsburgh assistant professor Kayhan Batmanghelich and PhD students Li Sun and Ke Yu. The research will be presented at the Conference on Neural Information Processing Systems in December. 

    The long road to understanding shortcuts

    The researchers focused their study on contrastive learning, which is a powerful form of self-supervised machine learning. In self-supervised machine learning, a model is trained using raw data that do not have label descriptions from humans. It can therefore be used successfully for a larger variety of data.

    A self-supervised learning model learns useful representations of data, which are used as inputs for different tasks, like image classification. But if the model takes shortcuts and fails to capture important information, these tasks won’t be able to use that information either.

    For example, if a self-supervised learning model is trained to classify pneumonia in X-rays from a number of hospitals, but it learns to make predictions based on a tag that identifies the hospital the scan came from (because some hospitals have more pneumonia cases than others), the model won’t perform well when it is given data from a new hospital.     

    For contrastive learning models, an encoder algorithm is trained to discriminate between pairs of similar inputs and pairs of dissimilar inputs. This process encodes rich and complex data, like images, in a way that the contrastive learning model can interpret.

    The researchers tested contrastive learning encoders with a series of images and found that, during this training procedure, they also fall prey to shortcut solutions. The encoders tend to focus on the simplest features of an image to decide which pairs of inputs are similar and which are dissimilar. Ideally, the encoder should focus on all the useful characteristics of the data when making a decision, Jegelka says.

    So, the team made it harder to tell the difference between the similar and dissimilar pairs, and found that this changes which features the encoder will look at to make a decision.

    “If you make the task of discriminating between similar and dissimilar items harder and harder, then your system is forced to learn more meaningful information in the data, because without learning that it cannot solve the task,” she says.

    But increasing this difficulty resulted in a tradeoff — the encoder got better at focusing on some features of the data but became worse at focusing on others. It almost seemed to forget the simpler features, Robinson says.

    To avoid this tradeoff, the researchers asked the encoder to discriminate between the pairs the same way it had originally, using the simpler features, and also after the researchers removed the information it had already learned. Solving the task both ways simultaneously caused the encoder to improve across all features.

    Their method, called implicit feature modification, adaptively modifies samples to remove the simpler features the encoder is using to discriminate between the pairs. The technique does not rely on human input, which is important because real-world data sets can have hundreds of different features that could combine in complex ways, Sra explains.

    From cars to COPD

    The researchers ran one test of this method using images of vehicles. They used implicit feature modification to adjust the color, orientation, and vehicle type to make it harder for the encoder to discriminate between similar and dissimilar pairs of images. The encoder improved its accuracy across all three features — texture, shape, and color — simultaneously.

    To see if the method would stand up to more complex data, the researchers also tested it with samples from a medical image database of chronic obstructive pulmonary disease (COPD). Again, the method led to simultaneous improvements across all features they evaluated.

    While this work takes some important steps forward in understanding the causes of shortcut solutions and working to solve them, the researchers say that continuing to refine these methods and applying them to other types of self-supervised learning will be key to future advancements.

    “This ties into some of the biggest questions about deep learning systems, like ‘Why do they fail?’ and ‘Can we know in advance the situations where your model will fail?’ There is still a lot farther to go if you want to understand shortcut learning in its full generality,” Robinson says.

    This research is supported by the National Science Foundation, National Institutes of Health, and the Pennsylvania Department of Health’s SAP SE Commonwealth Universal Research Enhancement (CURE) program. More

  • in

    Making machine learning more useful to high-stakes decision makers

    The U.S. Centers for Disease Control and Prevention estimates that one in seven children in the United States experienced abuse or neglect in the past year. Child protective services agencies around the nation receive a high number of reports each year (about 4.4 million in 2019) of alleged neglect or abuse. With so many cases, some agencies are implementing machine learning models to help child welfare specialists screen cases and determine which to recommend for further investigation.

    But these models don’t do any good if the humans they are intended to help don’t understand or trust their outputs.

    Researchers at MIT and elsewhere launched a research project to identify and tackle machine learning usability challenges in child welfare screening. In collaboration with a child welfare department in Colorado, the researchers studied how call screeners assess cases, with and without the help of machine learning predictions. Based on feedback from the call screeners, they designed a visual analytics tool that uses bar graphs to show how specific factors of a case contribute to the predicted risk that a child will be removed from their home within two years.

    The researchers found that screeners are more interested in seeing how each factor, like the child’s age, influences a prediction, rather than understanding the computational basis of how the model works. Their results also show that even a simple model can cause confusion if its features are not described with straightforward language.

    These findings could be applied to other high-risk fields where humans use machine learning models to help them make decisions, but lack data science experience, says senior author Kalyan Veeramachaneni, principal research scientist in the Laboratory for Information and Decision Systems (LIDS) and senior author of the paper.

    “Researchers who study explainable AI, they often try to dig deeper into the model itself to explain what the model did. But a big takeaway from this project is that these domain experts don’t necessarily want to learn what machine learning actually does. They are more interested in understanding why the model is making a different prediction than what their intuition is saying, or what factors it is using to make this prediction. They want information that helps them reconcile their agreements or disagreements with the model, or confirms their intuition,” he says.

    Co-authors include electrical engineering and computer science PhD student Alexandra Zytek, who is the lead author; postdoc Dongyu Liu; and Rhema Vaithianathan, professor of economics and director of the Center for Social Data Analytics at the Auckland University of Technology and professor of social data analytics at the University of Queensland. The research will be presented later this month at the IEEE Visualization Conference.

    Real-world research

    The researchers began the study more than two years ago by identifying seven factors that make a machine learning model less usable, including lack of trust in where predictions come from and disagreements between user opinions and the model’s output.

    With these factors in mind, Zytek and Liu flew to Colorado in the winter of 2019 to learn firsthand from call screeners in a child welfare department. This department is implementing a machine learning system developed by Vaithianathan that generates a risk score for each report, predicting the likelihood the child will be removed from their home. That risk score is based on more than 100 demographic and historic factors, such as the parents’ ages and past court involvements.

    “As you can imagine, just getting a number between one and 20 and being told to integrate this into your workflow can be a bit challenging,” Zytek says.

    They observed how teams of screeners process cases in about 10 minutes and spend most of that time discussing the risk factors associated with the case. That inspired the researchers to develop a case-specific details interface, which shows how each factor influenced the overall risk score using color-coded, horizontal bar graphs that indicate the magnitude of the contribution in a positive or negative direction.

    Based on observations and detailed interviews, the researchers built four additional interfaces that provide explanations of the model, including one that compares a current case to past cases with similar risk scores. Then they ran a series of user studies.

    The studies revealed that more than 90 percent of the screeners found the case-specific details interface to be useful, and it generally increased their trust in the model’s predictions. On the other hand, the screeners did not like the case comparison interface. While the researchers thought this interface would increase trust in the model, screeners were concerned it could lead to decisions based on past cases rather than the current report.   

    “The most interesting result to me was that, the features we showed them — the information that the model uses — had to be really interpretable to start. The model uses more than 100 different features in order to make its prediction, and a lot of those were a bit confusing,” Zytek says.

    Keeping the screeners in the loop throughout the iterative process helped the researchers make decisions about what elements to include in the machine learning explanation tool, called Sibyl.

    As they refined the Sibyl interfaces, the researchers were careful to consider how providing explanations could contribute to some cognitive biases, and even undermine screeners’ trust in the model.

    For instance, since explanations are based on averages in a database of child abuse and neglect cases, having three past abuse referrals may actually decrease the risk score of a child, since averages in this database may be far higher. A screener may see that explanation and decide not to trust the model, even though it is working correctly, Zytek explains. And because humans tend to put more emphasis on recent information, the order in which the factors are listed could also influence decisions.

    Improving interpretability

    Based on feedback from call screeners, the researchers are working to tweak the explanation model so the features that it uses are easier to explain.

    Moving forward, they plan to enhance the interfaces they’ve created based on additional feedback and then run a quantitative user study to track the effects on decision making with real cases. Once those evaluations are complete, they can prepare to deploy Sibyl, Zytek says.

    “It was especially valuable to be able to work so actively with these screeners. We got to really understand the problems they faced. While we saw some reservations on their part, what we saw more of was excitement about how useful these explanations were in certain cases. That was really rewarding,” she says.

    This work is supported, in part, by the National Science Foundation. More