More stories

  • in

    3 Questions: Designing software for research ethics

    Data are arguably the world’s hottest form of currency, clocking in zeros and ones that hold ever more weight than before. But with all of our personal information being crunched into dynamite for enterprise solutions and the like, with a lack of consumer data protection, are we all getting left behind? 

    Jonathan Zong, a PhD candidate in electrical engineering and computer science at MIT, and an affiliate of the Computer Science and Artificial Intelligence Laboratory, thinks consent can be baked into the design of the software that gathers our data for online research. He created Bartleby, a system for debriefing research participants and eliciting their views about social media research that involved them. Using Bartleby, he says, researchers can automatically direct each of their study participants to a website where they can learn about their involvement in research, view what data researchers collected about them, and give feedback. Most importantly, participants can use the website to opt out and request to delete their data.  

    Zong and his co-author, Nathan Matias SM ’13, PhD ’17, evaluated Bartleby by debriefing thousands of participants in observational and experimental studies on Twitter and Reddit. They found that Bartleby addresses procedural concerns by creating opportunities for participants to exercise autonomy, and the tool enabled substantive, value-driven conversations about participant voice and power. Here, Zong discusses the implications of their recent work as well as the future of social, ethical, and responsible computing.

    Q: Many leading tech ethicists and policymakers believe it’s impossible to keep people informed about their involvement in research and how their data are used. How has your work changed that?

    A: When Congress asked Mark Zuckerberg in 2018 about Facebook’s obligations to keep users informed about how their data is used, his answer was effectively that all users had the opportunity to read the privacy policy, and that being any clearer would be too difficult. Tech elites often blanket-statement that ethics is complicated, and proceed with their objective anyway. Many have claimed it’s impossible to fulfill ethical responsibilities to users at scale, so why try? But by creating Bartleby, a system for debriefing participants and eliciting their views about studies that involved them, we built something that shows that it’s not only very possible, but actually pretty easy to do. In a lot of situations, letting people know we want their data and explaining why we think it’s worth it is the bare minimum we could be doing.

    Q: Can ethical challenges be solved with a software tool?

    A: Off-the-shelf software actually can make a meaningful difference in respecting people’s autonomy. Ethics regulations almost never require a debriefing process for online studies. But because we used Bartleby, people had a chance to make an informed decision. It’s a chance they otherwise wouldn’t have had.

    At the same time, we realized that using Bartleby shined a light on deeper ethics questions that required substantive reflection. For example, most people are just trying to go about their lives and ignore the messages we send them, while others reply with concerns that aren’t even always about the research. Even if indirectly, these instances help signal nuances that research participants care about.

    Where might our values as researchers differ from participants’ values? How do the power structures that shape researchers’ interaction with users and communities affect our ability to see those differences? Using software to deliver ethics procedures helps bring these questions to light. But rather than expecting definitive answers that work in every situation, we should be thinking about how using software to create opportunities for participant voice and power challenges and invites us to reflect on how we address conflicting values.

    Q: How does your approach to design help suggest a way forward for social, ethical, and responsible computing?

    A: In addition to presenting the software tool, our peer-reviewed article on Bartleby also demonstrates a theoretical framework for data ethics, inspired by ideas in feminist philosophy. Because my work spans software design, empirical social science, and philosophy, I often think about the things I want people to take away in terms of interdisciplinary bridges I want to build. 

    I hope people look at Bartleby and see that ethics is an exciting area for technical innovation that can be tested empirically — guided by a clear-headed understanding of values. Umberto Eco, a philosopher, wrote that “form must not be a vehicle for thought, it must be a way of thinking.” In other words, designing software isn’t just about putting ideas we’ve already had into a computational form. Design is also a way we can think new ideas into existence, produce new ways of knowing and doing, and imagine alternative futures. More

  • in

    Estimating the informativeness of data

    Not all data are created equal. But how much information is any piece of data likely to contain? This question is central to medical testing, designing scientific experiments, and even to everyday human learning and thinking. MIT researchers have developed a new way to solve this problem, opening up new applications in medicine, scientific discovery, cognitive science, and artificial intelligence.

    In theory, the 1948 paper, “A Mathematical Theory of Communication,” by the late MIT Professor Emeritus Claude Shannon answered this question definitively. One of Shannon’s breakthrough results is the idea of entropy, which lets us quantify the amount of information inherent in any random object, including random variables that model observed data. Shannon’s results created the foundations of information theory and modern telecommunications. The concept of entropy has also proven central to computer science and machine learning.

    The challenge of estimating entropy

    Unfortunately, the use of Shannon’s formula can quickly become computationally intractable. It requires precisely calculating the probability of the data, which in turn requires calculating every possible way the data could have arisen under a probabilistic model. If the data-generating process is very simple — for example, a single toss of a coin or roll of a loaded die — then calculating entropies is straightforward. But consider the problem of medical testing, where a positive test result is the result of hundreds of interacting variables, all unknown. With just 10 unknowns, there are already 1,000 possible explanations for the data. With a few hundred, there are more possible explanations than atoms in the known universe, which makes calculating the entropy exactly an unmanageable problem.

    MIT researchers have developed a new method to estimate good approximations to many information quantities such as Shannon entropy by using probabilistic inference. The work appears in a paper presented at AISTATS 2022 by authors Feras Saad ’16, MEng ’16, a PhD candidate in electrical engineering and computer science; Marco-Cusumano Towner PhD ’21; and Vikash Mansinghka ’05, MEng ’09, PhD ’09, a principal research scientist in the Department of Brain and Cognitive Sciences. The key insight is, rather than enumerate all explanations, to instead use probabilistic inference algorithms to first infer which explanations are probable and then use these probable explanations to construct high-quality entropy estimates. The paper shows that this inference-based approach can be much faster and more accurate than previous approaches.

    Estimating entropy and information in a probabilistic model is fundamentally hard because it often requires solving a high-dimensional integration problem. Many previous works have developed estimators of these quantities for certain special cases, but the new estimators of entropy via inference (EEVI) offer the first approach that can deliver sharp upper and lower bounds on a broad set of information-theoretic quantities. An upper and lower bound means that although we don’t know the true entropy, we can get a number that is smaller than it and a number that is higher than it.

    “The upper and lower bounds on entropy delivered by our method are particularly useful for three reasons,” says Saad. “First, the difference between the upper and lower bounds gives a quantitative sense of how confident we should be about the estimates. Second, by using more computational effort we can drive the difference between the two bounds to zero, which ‘squeezes’ the true value with a high degree of accuracy. Third, we can compose these bounds to form estimates of many other quantities that tell us how informative different variables in a model are of one another.”

    Solving fundamental problems with data-driven expert systems

    Saad says he is most excited about the possibility that this method gives for querying probabilistic models in areas like machine-assisted medical diagnoses. He says one goal of the EEVI method is to be able to solve new queries using rich generative models for things like liver disease and diabetes that have already been developed by experts in the medical domain. For example, suppose we have a patient with a set of observed attributes (height, weight, age, etc.) and observed symptoms (nausea, blood pressure, etc.). Given these attributes and symptoms, EEVI can be used to help determine which medical tests for symptoms the physician should conduct to maximize information about the absence or presence of a given liver disease (like cirrhosis or primary biliary cholangitis).

    For insulin diagnosis, the authors showed how to use the method for computing optimal times to take blood glucose measurements that maximize information about a patient’s insulin sensitivity, given an expert-built probabilistic model of insulin metabolism and the patient’s personalized meal and medication schedule. As routine medical tracking like glucose monitoring moves away from doctor’s offices and toward wearable devices, there are even more opportunities to improve data acquisition, if the value of the data can be estimated accurately in advance.

    Vikash Mansinghka, senior author on the paper, adds, “We’ve shown that probabilistic inference algorithms can be used to estimate rigorous bounds on information measures that AI engineers often think of as intractable to calculate. This opens up many new applications. It also shows that inference may be more computationally fundamental than we thought. It also helps to explain how human minds might be able to estimate the value of information so pervasively, as a central building block of everyday cognition, and help us engineer AI expert systems that have these capabilities.”

    The paper, “Estimators of Entropy and Information via Inference in Probabilistic Models,” was presented at AISTATS 2022. More

  • in

    MIT Schwarzman College of Computing unveils Break Through Tech AI

    Aimed at driving diversity and inclusion in artificial intelligence, the MIT Stephen A. Schwarzman College of Computing is launching Break Through Tech AI, a new program to bridge the talent gap for women and underrepresented genders in AI positions in industry.

    Break Through Tech AI will provide skills-based training, industry-relevant portfolios, and mentoring to qualified undergraduate students in the Greater Boston area in order to position them more competitively for careers in data science, machine learning, and artificial intelligence. The free, 18-month program will also provide each student with a stipend for participation to lower the barrier for those typically unable to engage in an unpaid, extra-curricular educational opportunity.

    “Helping position students from diverse backgrounds to succeed in fields such as data science, machine learning, and artificial intelligence is critical for our society’s future,” says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing and Henry Ellis Warren Professor of Electrical Engineering and Computer Science. “We look forward to working with students from across the Greater Boston area to provide them with skills and mentorship to help them find careers in this competitive and growing industry.”

    The college is collaborating with Break Through Tech — a national initiative launched by Cornell Tech in 2016 to increase the number of women and underrepresented groups graduating with degrees in computing — to host and administer the program locally. In addition to Boston, the inaugural artificial intelligence and machine learning program will be offered in two other metropolitan areas — one based in New York hosted by Cornell Tech and another in Los Angeles hosted by the University of California at Los Angeles Samueli School of Engineering.

    “Break Through Tech’s success at diversifying who is pursuing computer science degrees and careers has transformed lives and the industry,” says Judith Spitz, executive director of Break Through Tech. “With our new collaborators, we can apply our impactful model to drive inclusion and diversity in artificial intelligence.”

    The new program will kick off this summer at MIT with an eight-week, skills-based online course and in-person lab experience that teaches industry-relevant tools to build real-world AI solutions. Students will learn how to analyze datasets and use several common machine learning libraries to build, train, and implement their own ML models in a business context.

    Following the summer course, students will be matched with machine-learning challenge projects for which they will convene monthly at MIT and work in teams to build solutions and collaborate with an industry advisor or mentor throughout the academic year, resulting in a portfolio of resume-quality work. The participants will also be paired with young professionals in the field to help build their network, prepare their portfolio, practice for interviews, and cultivate workplace skills.

    “Leveraging the college’s strong partnership with industry, Break Through AI will offer unique opportunities to students that will enhance their portfolio in machine learning and AI,” says Asu Ozdaglar, deputy dean of academics of the MIT Schwarzman College of Computing and head of the Department of Electrical Engineering and Computer Science. Ozdaglar, who will be the MIT faculty director of Break Through Tech AI, adds: “The college is committed to making computing inclusive and accessible for all. We’re thrilled to host this program at MIT for the Greater Boston area and to do what we can to help increase diversity in computing fields.”

    Break Through Tech AI is part of the MIT Schwarzman College of Computing’s focus to advance diversity, equity, and inclusion in computing. The college aims to improve and create programs and activities that broaden participation in computing classes and degree programs, increase the diversity of top faculty candidates in computing fields, and ensure that faculty search and graduate admissions processes have diverse slates of candidates and interviews.

    “By engaging in activities like Break Through Tech AI that work to improve the climate for underrepresented groups, we’re taking an important step toward creating more welcoming environments where all members can innovate and thrive,” says Alana Anderson, assistant dean for diversity, equity and inclusion for the Schwarzman College of Computing. More

  • in

    Unlocking new doors to artificial intelligence

    Artificial intelligence research is constantly developing new hypotheses that have the potential to benefit society and industry; however, sometimes these benefits are not fully realized due to a lack of engineering tools. To help bridge this gap, graduate students in the MIT Department of Electrical Engineering and Computer Science’s 6-A Master of Engineering (MEng) Thesis Program work with some of the most innovative companies in the world and collaborate on cutting-edge projects, while contributing to and completing their MEng thesis.

    During a portion of the last year, four 6-A MEng students teamed up and completed an internship with IBM Research’s advanced prototyping team through the MIT-IBM Watson AI Lab on AI projects, often developing web applications to solve a real-world issue or business use cases. Here, the students worked alongside AI engineers, user experience engineers, full-stack researchers, and generalists to accommodate project requests and receive thesis advice, says Lee Martie, IBM research staff member and 6-A manager. The students’ projects ranged from generating synthetic data to allow for privacy-sensitive data analysis to using computer vision to identify actions in video that allows for monitoring human safety and tracking build progress on a construction site.

    “I appreciated all of the expertise from the team and the feedback,” says 6-A graduate Violetta Jusiega ’21, who participated in the program. “I think that working in industry gives the lens of making sure that the project’s needs are satisfied and [provides the opportunity] to ground research and make sure that it is helpful for some use case in the future.”

    Jusiega’s research intersected the fields of computer vision and design to focus on data visualization and user interfaces for the medical field. Working with IBM, she built an application programming interface (API) that let clinicians interact with a medical treatment strategy AI model, which was deployed in the cloud. Her interface provided a medical decision tree, as well as some prescribed treatment plans. After receiving feedback on her design from physicians at a local hospital, Jusiega developed iterations of the API and how the results where displayed, visually, so that it would be user-friendly and understandable for clinicians, who don’t usually code. She says that, “these tools are often not acquired into the field because they lack some of these API principles which become more important in an industry where everything is already very fast paced, so there’s little time to incorporate a new technology.” But this project might eventually allow for industry deployment. “I think this application has a bunch of potential, whether it does get picked up by clinicians or whether it’s simply used in research. It’s very promising and very exciting to see how technology can help us modify, or I can improve, the health-care field to be even more custom-tailored towards patients and giving them the best care possible,” she says.

    Another 6-A graduate student, Spencer Compton, was also considering aiding professionals to make more informed decisions, for use in settings including health care, but he was tackling it from a causal perspective. When given a set of related variables, Compton was investigating if there was a way to determine not just correlation, but the cause-and-effect relationship between them (the direction of the interaction) from the data alone. For this, he and his collaborators from IBM Research and Purdue University turned to a field of math called information theory. With the goal of designing an algorithm to learn complex networks of causal relationships, Compton used ideas relating to entropy, the randomness in a system, to help determine if a causal relationship is present and how variables might be interacting. “When judging an explanation, people often default to Occam’s razor” says Compton. “We’re more inclined to believe a simpler explanation than a more complex one.” In many cases, he says, it seemed to perform well. For instance, they were able to consider variables such as lung cancer, pollution, and X-ray findings. He was pleased that his research allowed him to help create a framework of “entropic causal inference” that could aid in safe and smart decisions in the future, in a satisfying way. “The math is really surprisingly deep, interesting, and complex,” says Compton. “We’re basically asking, ‘when is the simplest explanation correct?’ but as a math question.”

    Determining relationships within data can sometimes require large volumes of it to suss out patterns, but for data that may contain sensitive information, this may not be available. For her master’s work, Ivy Huang worked with IBM Research to generate synthetic tabular data using a natural language processing tool called a transformer model, which can learn and predict future values from past values. Trained on real data, the model can produce new data with similar patterns, properties, and relationships without restrictions like privacy, availability, and access that might come with real data in financial transactions and electronic medical records. Further, she created an API and deployed the model in an IBM cluster, which allowed users increased access to the model and abilities to query it without compromising the original data.

    Working with the advanced prototyping team, MEng candidate Brandon Perez also considered how to gather and investigate data with restrictions, but in his case it was to use computer vision frameworks, centered on an action recognition model, to identify construction site happenings. The team based their work on the Moments in Time dataset, which contains over a million three-second video clips with about 300 attached classification labels, and has performed well during AI training. However, the group needed more construction-based video data. For this, they used YouTube-8M. Perez built a framework for testing and fine-tuning existing object detection models and action recognition models that could plug into an automatic spatial and temporal localization tool — how they would identify and label particular actions in a video timeline. “I was satisfied that I was able to explore what made me curious, and I was grateful for the autonomy that I was given with this project,” says Perez. “I felt like I was always supported, and my mentor was a great support to the project.”

    “The kind of collaborations that we have seen between our MEng students and IBM researchers are exactly what the 6-A MEng Thesis program at MIT is all about,” says Tomas Palacios, professor of electrical engineering and faculty director of the MIT 6-A MEng Thesis program. “For more than 100 years, 6-A has been connecting MIT students with industry to solve together some of the most important problems in the world.” More

  • in

    Professors Elchanan Mossel and Rosalind Picard named 2021 ACM Fellows

    The Association for Computing Machinery (ACM) has named MIT professors Elchanan Mossel and Rosalind Picard as fellows for outstanding accomplishments in computing and information technology.

    The ACM Fellows program recognizes wide-ranging and fundamental contributions in areas including algorithms, computer science education, cryptography, data security and privacy, medical informatics, and mobile and networked systems, among many other areas. The accomplishments of the 2021 ACM Fellows underpin important innovations that shape the technologies we use every day.

    Elchanan Mossel

    Mossel is a professor of mathematics and a member at the Statistics and Data Science Center of the MIT Institute for Data, Systems and Society. His research in discrete functional inequalities, isoperimetry, and hypercontractivity led to the proof that Majority is Stablest and confirmed the optimality of the Goemans-Williamson MAX-CUT algorithm under the unique games conjecture from computational complexity. His work on the reconstruction problem on trees provides optimal algorithms and bounds for phylogenetic reconstruction in molecular biology and has led to sharp results in the analysis of Gibbs samplers from statistical physics and inference problems on graphs. His research has resolved open problems in computational biology, machine learning, social choice theory, and economics.Mossel received a BS from the Open University in Israel in 1992, and MS (1997) and PhD (2000) degrees in mathematics from the Hebrew University of Jerusalem. He was a postdoc at the Microsoft Research Theory Group and a Miller Fellow at University of California at Berkeley. He joined the UC Berkeley faculty in 2003 as a professor of statistics and computer science, and spent leaves as a professor at the Weizmann Institute and at the Wharton School before joining MIT in 2016 as a full professor.

    In 2020, he received the Vannevar Bush Faculty Fellowship of the U.S. Department of Defense. Other distinctions include being named a Simons Investigator in Mathematics in 2019, being selected as a fellow of the AMS, and receiving a Sloan Research Fellowship, NSF CAREER Award, and the Bergmann Memorial Award from the U.S.-Israel Binational Science Foundation.

    “I am honored by this award,” says Mossel. “It makes me realize how fortunate I’ve been, working with creative and generous colleagues, and mentoring brilliant young minds.”

    Rosalind Picard

    Picard is a scientist, engineer, author, and professor of media arts and sciences at the MIT Media Lab. She is recognized as the founder of the field of affective computing, and has carried this research forward as head of the Media Lab’s Affective Computing research group. She is also a founding faculty chair of MIT’s MindHandHeart Initiative, and a faculty member of the MIT Center for Neurobiological Engineering. Picard is an IEEE fellow, and a member of the National Academy of Engineering. 

    Picard’s inventions are in use by thousands of research teams worldwide as well as in numerous products and services. She has co-founded two companies: Affectiva (now part of Smart Eye), providing emotion AI technologies now used by more than 25 percent of the Global Fortune 500, and Empatica, providing wearable sensors and analytics to improve health. Starting from inventions by Picard and her team, Empatica created the first AI-based smart watch cleared by the FDA (in neurology for monitoring seizures), which is now helping to bring potentially lifesaving help for people with epilepsy. 

    “This award makes me think of how blessed I am to work with so many amazing people here at MIT, especially at the Media Lab,” Picard notes. “Whenever any one of us has our contributions recognized, it is also a recognition of how special a place this is.” More

  • in

    The promise and pitfalls of artificial intelligence explored at TEDxMIT event

    Scientists, students, and community members came together last month to discuss the promise and pitfalls of artificial intelligence at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) for the fourth TEDxMIT event held at MIT. 

    Attendees were entertained and challenged as they explored “the good and bad of computing,” explained CSAIL Director Professor Daniela Rus, who organized the event with John Werner, an MIT fellow and managing director of Link Ventures; MIT sophomore Lucy Zhao; and grad student Jessica Karaguesian. “As you listen to the talks today,” Rus told the audience, “consider how our world is made better by AI, and also our intrinsic responsibilities for ensuring that the technology is deployed for the greater good.”

    Rus mentioned some new capabilities that could be enabled by AI: an automated personal assistant that could monitor your sleep phases and wake you at the optimal time, as well as on-body sensors that monitor everything from your posture to your digestive system. “Intelligent assistance can help empower and augment our lives. But these intriguing possibilities should only be pursued if we can simultaneously resolve the challenges that these technologies bring,” said Rus. 

    The next speaker, CSAIL principal investigator and professor of electrical engineering and computer science Manolis Kellis, started off by suggesting what sounded like an unattainable goal — using AI to “put an end to evolution as we know it.” Looking at it from a computer science perspective, he said, what we call evolution is basically a brute force search. “You’re just exploring all of the search space, creating billions of copies of every one of your programs, and just letting them fight against each other. This is just brutal. And it’s also completely slow. It took us billions of years to get here.” Might it be possible, he asked, to speed up evolution and make it less messy?

    The answer, Kellis said, is that we can do better, and that we’re already doing better: “We’re not killing people like Sparta used to, throwing the weaklings off the mountain. We are truly saving diversity.”

    Knowledge, moreover, is now being widely shared, passed on “horizontally” through accessible information sources, he noted, rather than “vertically,” from parent to offspring. “I would like to argue that competition in the human species has been replaced by collaboration. Despite having a fixed cognitive hardware, we have software upgrades that are enabled by culture, by the 20 years that our children spend in school to fill their brains with everything that humanity has learned, regardless of which family came up with it. This is the secret of our great acceleration” — the fact that human advancement in recent centuries has vastly out-clipped evolution’s sluggish pace.

    The next step, Kellis said, is to harness insights about evolution in order to combat an individual’s genetic susceptibility to disease. “Our current approach is simply insufficient,” he added. “We’re treating manifestations of disease, not the causes of disease.” A key element in his lab’s ambitious strategy to transform medicine is to identify “the causal pathways through which genetic predisposition manifests. It’s only by understanding these pathways that we can truly manipulate disease causation and reverse the disease circuitry.” 

    Kellis was followed by Aleksander Madry, MIT professor of electrical engineering and computer science and CSAIL principal investigator, who told the crowd, “progress in AI is happening, and it’s happening fast.” Computer programs can routinely beat humans in games like chess, poker, and Go. So should we be worried about AI surpassing humans? 

    Madry, for one, is not afraid — or at least not yet. And some of that reassurance stems from research that has led him to the following conclusion: Despite its considerable success, AI, especially in the form of machine learning, is lazy. “Think about being lazy as this kind of smart student who doesn’t really want to study for an exam. Instead, what he does is just study all the past years’ exams and just look for patterns. Instead of trying to actually learn, he just tries to pass the test. And this is exactly the same way in which current AI is lazy.”

    A machine-learning model might recognize grazing sheep, for instance, simply by picking out pictures that have green grass in them. If a model is trained to identify fish from photos of anglers proudly displaying their catches, Madry explained, “the model figures out that if there’s a human holding something in the picture, I will just classify it as a fish.” The consequences can be more serious for an AI model intended to pick out malignant tumors. If the model is trained on images containing rulers that indicate the size of tumors, the model may end up selecting only those photos that have rulers in them.

    This leads to Madry’s biggest concerns about AI in its present form. “AI is beating us now,” he noted. “But the way it does it [involves] a little bit of cheating.” He fears that we will apply AI “in some way in which this mismatch between what the model actually does versus what we think it does will have some catastrophic consequences.” People relying on AI, especially in potentially life-or-death situations, need to be much more mindful of its current limitations, Madry cautioned.

    There were 10 speakers altogether, and the last to take the stage was MIT associate professor of electrical engineering and computer science and CSAIL principal investigator Marzyeh Ghassemi, who laid out her vision for how AI could best contribute to general health and well-being. But in order for that to happen, its models must be trained on accurate, diverse, and unbiased medical data.

    It’s important to focus on the data, Ghassemi stressed, because these models are learning from us. “Since our data is human-generated … a neural network is learning how to practice from a doctor. But doctors are human, and humans make mistakes. And if a human makes a mistake, and we train an AI from that, the AI will, too. Garbage in, garbage out. But it’s not like the garbage is distributed equally.”

    She pointed out that many subgroups receive worse care from medical practitioners, and members of these subgroups die from certain conditions at disproportionately high rates. This is an area, Ghassemi said, “where AI can actually help. This is something we can fix.” Her group is developing machine-learning models that are robust, private, and fair. What’s holding them back is neither algorithms nor GPUs. It’s data. Once we collect reliable data from diverse sources, Ghassemi added, we might start reaping the benefits that AI can bring to the realm of health care.

    In addition to CSAIL speakers, there were talks from members across MIT’s Institute for Data, Systems, and Society; the MIT Mobility Initiative; the MIT Media Lab; and the SENSEable City Lab.

    The proceedings concluded on that hopeful note. Rus and Werner then thanked everyone for coming. “Please continue to reflect about the good and bad of computing,” Rus urged. “And we look forward to seeing you back here in May for the next TEDxMIT event.”

    The exact theme of the spring 2022 gathering will have something to do with “superpowers.” But — if December’s mind-bending presentations were any indication — the May offering is almost certain to give its attendees plenty to think about. And maybe provide the inspiration for a startup or two. More

  • in

    Nonsense can make sense to machine-learning models

    For all that neural networks can accomplish, we still don’t really understand how they operate. Sure, we can program them to learn, but making sense of a machine’s decision-making process remains much like a fancy puzzle with a dizzying, complex pattern where plenty of integral pieces have yet to be fitted. 

    If a model was trying to classify an image of said puzzle, for example, it could encounter well-known, but annoying adversarial attacks, or even more run-of-the-mill data or processing issues. But a new, more subtle type of failure recently identified by MIT scientists is another cause for concern: “overinterpretation,” where algorithms make confident predictions based on details that don’t make sense to humans, like random patterns or image borders. 

    This could be particularly worrisome for high-stakes environments, like split-second decisions for self-driving cars, and medical diagnostics for diseases that need more immediate attention. Autonomous vehicles in particular rely heavily on systems that can accurately understand surroundings and then make quick, safe decisions. The network used specific backgrounds, edges, or particular patterns of the sky to classify traffic lights and street signs — irrespective of what else was in the image. 

    The team found that neural networks trained on popular datasets like CIFAR-10 and ImageNet suffered from overinterpretation. Models trained on CIFAR-10, for example, made confident predictions even when 95 percent of input images were missing, and the remainder is senseless to humans. 

    “Overinterpretation is a dataset problem that’s caused by these nonsensical signals in datasets. Not only are these high-confidence images unrecognizable, but they contain less than 10 percent of the original image in unimportant areas, such as borders. We found that these images were meaningless to humans, yet models can still classify them with high confidence,” says Brandon Carter, MIT Computer Science and Artificial Intelligence Laboratory PhD student and lead author on a paper about the research. 

    Deep-image classifiers are widely used. In addition to medical diagnosis and boosting autonomous vehicle technology, there are use cases in security, gaming, and even an app that tells you if something is or isn’t a hot dog, because sometimes we need reassurance. The tech in discussion works by processing individual pixels from tons of pre-labeled images for the network to “learn.” 

    Image classification is hard, because machine-learning models have the ability to latch onto these nonsensical subtle signals. Then, when image classifiers are trained on datasets such as ImageNet, they can make seemingly reliable predictions based on those signals. 

    Although these nonsensical signals can lead to model fragility in the real world, the signals are actually valid in the datasets, meaning overinterpretation can’t be diagnosed using typical evaluation methods based on that accuracy. 

    To find the rationale for the model’s prediction on a particular input, the methods in the present study start with the full image and repeatedly ask, what can I remove from this image? Essentially, it keeps covering up the image, until you’re left with the smallest piece that still makes a confident decision. 

    To that end, it could also be possible to use these methods as a type of validation criteria. For example, if you have an autonomously driving car that uses a trained machine-learning method for recognizing stop signs, you could test that method by identifying the smallest input subset that constitutes a stop sign. If that consists of a tree branch, a particular time of day, or something that’s not a stop sign, you could be concerned that the car might come to a stop at a place it’s not supposed to.

    While it may seem that the model is the likely culprit here, the datasets are more likely to blame. “There’s the question of how we can modify the datasets in a way that would enable models to be trained to more closely mimic how a human would think about classifying images and therefore, hopefully, generalize better in these real-world scenarios, like autonomous driving and medical diagnosis, so that the models don’t have this nonsensical behavior,” says Carter. 

    This may mean creating datasets in more controlled environments. Currently, it’s just pictures that are extracted from public domains that are then classified. But if you want to do object identification, for example, it might be necessary to train models with objects with an uninformative background. 

    This work was supported by Schmidt Futures and the National Institutes of Health. Carter wrote the paper alongside Siddhartha Jain and Jonas Mueller, scientists at Amazon, and MIT Professor David Gifford. They are presenting the work at the 2021 Conference on Neural Information Processing Systems. More

  • in

    Systems scientists find clues to why false news snowballs on social media

    The spread of misinformation on social media is a pressing societal problem that tech companies and policymakers continue to grapple with, yet those who study this issue still don’t have a deep understanding of why and how false news spreads.

    To shed some light on this murky topic, researchers at MIT developed a theoretical model of a Twitter-like social network to study how news is shared and explore situations where a non-credible news item will spread more widely than the truth. Agents in the model are driven by a desire to persuade others to take on their point of view: The key assumption in the model is that people bother to share something with their followers if they think it is persuasive and likely to move others closer to their mindset. Otherwise they won’t share.

    The researchers found that in such a setting, when a network is highly connected or the views of its members are sharply polarized, news that is likely to be false will spread more widely and travel deeper into the network than news with higher credibility.

    This theoretical work could inform empirical studies of the relationship between news credibility and the size of its spread, which might help social media companies adapt networks to limit the spread of false information.

    “We show that, even if people are rational in how they decide to share the news, this could still lead to the amplification of information with low credibility. With this persuasion motive, no matter how extreme my beliefs are — given that the more extreme they are the more I gain by moving others’ opinions — there is always someone who would amplify [the information],” says senior author Ali Jadbabaie, professor and head of the Department of Civil and Environmental Engineering and a core faculty member of the Institute for Data, Systems, and Society (IDSS) and a principal investigator in the Laboratory for Information and Decision Systems (LIDS).

    Joining Jadbabaie on the paper are first author Chin-Chia Hsu, a graduate student in the Social and Engineering Systems program in IDSS, and Amir Ajorlou, a LIDS research scientist. The research will be presented this week at the IEEE Conference on Decision and Control.

    Pondering persuasion

    This research draws on a 2018 study by Sinan Aral, the David Austin Professor of Management at the MIT Sloan School of Management; Deb Roy, an associate professor of media arts and sciences at the Media Lab; and former postdoc Soroush Vosoughi (now an assistant professor of computer science at Dartmouth University). Their empirical study of data from Twitter found that false news spreads wider, faster, and deeper than real news.

    Jadbabaie and his collaborators wanted to drill down on why this occurs.

    They hypothesized that persuasion might be a strong motive for sharing news — perhaps agents in the network want to persuade others to take on their point of view — and decided to build a theoretical model that would let them explore this possibility.

    In their model, agents have some prior belief about a policy, and their goal is to persuade followers to move their beliefs closer to the agent’s side of the spectrum.

    A news item is initially released to a small, random subgroup of agents, which must decide whether to share this news with their followers. An agent weighs the newsworthiness of the item and its credibility, and updates its belief based on how surprising or convincing the news is. 

    “They will make a cost-benefit analysis to see if, on average, this piece of news will move people closer to what they think or move them away. And we include a nominal cost for sharing. For instance, taking some action, if you are scrolling on social media, you have to stop to do that. Think of that as a cost. Or a reputation cost might come if I share something that is embarrassing. Everyone has this cost, so the more extreme and the more interesting the news is, the more you want to share it,” Jadbabaie says.

    If the news affirms the agent’s perspective and has persuasive power that outweighs the nominal cost, the agent will always share the news. But if an agent thinks the news item is something others may have already seen, the agent is disincentivized to share it.

    Since an agent’s willingness to share news is a product of its perspective and how persuasive the news is, the more extreme an agent’s perspective or the more surprising the news, the more likely the agent will share it.

    The researchers used this model to study how information spreads during a news cascade, which is an unbroken sharing chain that rapidly permeates the network.

    Connectivity and polarization

    The team found that when a network has high connectivity and the news is surprising, the credibility threshold for starting a news cascade is lower. High connectivity means that there are multiple connections between many users in the network.

    Likewise, when the network is largely polarized, there are plenty of agents with extreme views who want to share the news item, starting a news cascade. In both these instances, news with low credibility creates the largest cascades.

    “For any piece of news, there is a natural network speed limit, a range of connectivity, that facilitates good transmission of information where the size of the cascade is maximized by true news. But if you exceed that speed limit, you will get into situations where inaccurate news or news with low credibility has a larger cascade size,” Jadbabaie says.

    If the views of users in the network become more diverse, it is less likely that a poorly credible piece of news will spread more widely than the truth.

    Jadbabaie and his colleagues designed the agents in the network to behave rationally, so the model would better capture actions real humans might take if they want to persuade others.

    “Someone might say that is not why people share, and that is valid. Why people do certain things is a subject of intense debate in cognitive science, social psychology, neuroscience, economics, and political science,” he says. “Depending on your assumptions, you end up getting different results. But I feel like this assumption of persuasion being the motive is a natural assumption.”

    Their model also shows how costs can be manipulated to reduce the spread of false information. Agents make a cost-benefit analysis and won’t share news if the cost to do so outweighs the benefit of sharing.

    “We don’t make any policy prescriptions, but one thing this work suggests is that, perhaps, having some cost associated with sharing news is not a bad idea. The reason you get lots of these cascades is because the cost of sharing the news is actually very low,” he says.

    This work was supported by an Army Research Office Multidisciplinary University Research Initiative grant and a Vannevar Bush Fellowship from the Office of the Secretary of Defense. More