More stories

  • in

    Q&A: A fresh look at data science

    As the leaders of a developing field, data scientists must often deal with a frustratingly slippery question: What is data science, precisely, and what is it good for?

    Alfred Spector is a visiting scholar in the MIT Department of Electrical Engineering and Computer Science (EECS), an influential developer of distributed computing systems and applications, and a successful tech executive with companies including IBM and Google. Along with three co-authors — Peter Norvig at Stanford University and Google, Chris Wiggins at Columbia University and The New York Times, and Jeannette M. Wing at Columbia — Spector recently published “Data Science in Context: Foundations, Challenges, Opportunities” (Cambridge University Press), which provides a broad, conversational overview of the wide-ranging field driving change in sectors ranging from health care to transportation to commerce to entertainment. 

    Here, Spector talks about data-driven life, what makes a good data scientist, and how his book came together during the height of the Covid-19 pandemic.

    Q: One of the most common buzzwords Americans hear is “data-driven,” but many might not know what that term is supposed to mean. Can you unpack it for us?

    A: Data-driven broadly refers to techniques or algorithms powered by data — they either provide insight or reach conclusions, say, a recommendation or a prediction. The algorithms power models which are increasingly woven into the fabric of science, commerce, and life, and they often provide excellent results. The list of their successes is really too long to even begin to list. However, one concern is that the proliferation of data makes it easy for us as students, scientists, or just members of the public to jump to erroneous conclusions. As just one example, our own confirmation biases make us prone to believing some data elements or insights “prove” something we already believe to be true. Additionally, we often tend to see causal relationships where the data only shows correlation. It might seem paradoxical, but data science makes critical reading and analysis of data all the more important.

    Q: What, to your mind, makes a good data scientist?

    A: [In talking to students and colleagues] I optimistically emphasize the power of data science and the importance of gaining the computational, statistical, and machine learning skills to apply it. But, I also remind students that we are obligated to solve problems well. In our book, Chris [Wiggins] paraphrases danah boyd, who says that a successful application of data science is not one that merely meets some technical goal, but one that actually improves lives. More specifically, I exhort practitioners to provide a real solution to problems, or else clearly identify what we are not solving so that people see the limitations of our work. We should be extremely clear so that we do not generate harmful results or lead others to erroneous conclusions. I also remind people that all of us, including scientists and engineers, are human and subject to the same human foibles as everyone else, such as various biases. 

    Q: You discuss Covid-19 in your book. While some short-range models for mortality were very accurate during the heart of the pandemic, you note the failure of long-range models to predict any of 2020’s four major geotemporal Covid waves in the United States. Do you feel Covid was a uniquely hard situation to model? 

    A: Covid was particularly difficult to predict over the long term because of many factors — the virus was changing, human behavior was changing, political entities changed their minds. Also, we didn’t have fine-grained mobility data (perhaps, for good reasons), and we lacked sufficient scientific understanding of the virus, particularly in the first year.

    I think there are many other domains which are similarly difficult. Our book teases out many reasons why data-driven models may not be applicable. Perhaps it’s too difficult to get or hold the necessary data. Perhaps the past doesn’t predict the future. If data models are being used in life-and-death situations, we may not be able to make them sufficiently dependable; this is particularly true as we’ve seen all the motivations that bad actors have to find vulnerabilities. So, as we continue to apply data science, we need to think through all the requirements we have, and the capability of the field to meet them. They often align, but not always. And, as data science seeks to solve problems in ever more important areas such as human health, education, transportation safety, etc., there will be many challenges.

    Q: Let’s talk about the power of good visualization. You mention the popular early-2000s Baby Name Voyager website as one that changed your view on the importance of data visualization. Tell us how that happened.

    A: That website, recently reborn as the Name Grapher, had two characteristics that I thought were brilliant. First, it had a really natural interface, where you type the initial characters of a name and it shows a frequency graph of all the names beginning with those letters, and their popularity over time. Second, it’s so much better than a spreadsheet with 140 columns representing years and rows representing names, despite the fact it contains no extra information. It also provided instantaneous feedback with its display graph dynamically changing as you type. To me, this showed the power of a very simple transformation that is done correctly.
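    The interaction Spector describes, typing a prefix and seeing the combined popularity of matching names over time, can be sketched in a few lines of Python. This is only an illustrative sketch, not the website's actual implementation: the counts below are invented toy numbers, and the names `NAME_COUNTS`, `names_with_prefix`, `popularity_over_time`, and `text_chart` are hypothetical helpers introduced here.

```python
from collections import defaultdict

# Hypothetical toy counts (births per year), invented purely for illustration;
# they are not taken from the Baby Name Voyager or any real dataset.
NAME_COUNTS = {
    "Alfred": {1900: 120, 1950: 60, 2000: 10},
    "Alice": {1900: 300, 1950: 150, 2000: 90},
    "Peter": {1900: 80, 1950: 200, 2000: 70},
}


def names_with_prefix(counts, prefix):
    """Return the names that start with `prefix`, ignoring case."""
    prefix = prefix.lower()
    return sorted(name for name in counts if name.lower().startswith(prefix))


def popularity_over_time(counts, prefix):
    """Sum the yearly counts of every name that matches `prefix`."""
    totals = defaultdict(int)
    for name in names_with_prefix(counts, prefix):
        for year, count in counts[name].items():
            totals[year] += count
    return dict(sorted(totals.items()))


def text_chart(series, scale=20):
    """Render one row of '#' marks per year, a crude stand-in for the graph."""
    return "\n".join(
        f"{year} {'#' * (count // scale)}" for year, count in series.items()
    )


# Simulate the display updating as the user types "a", then "al", then "alf".
for typed in ("a", "al", "alf"):
    print(f"prefix {typed!r}: {names_with_prefix(NAME_COUNTS, typed)}")
    print(text_chart(popularity_over_time(NAME_COUNTS, typed)))
```

    The table holds exactly the information a names-by-years spreadsheet would; the only additions are the prefix filter and the per-year rendering, which is the "very simple transformation" the answer refers to.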

    Q: When you and your co-authors began planning “Data Science in Context,” what did you hope to offer?

    A: We portray present data science as a field that’s already had enormous benefits, that provides even more future opportunities, but one that requires equally enormous care in its use. Referencing the word “context” in the title, we explain that the proper use of data science must consider the specifics of the application, the laws and norms of the society in which the application is used, and even the time period of its deployment. And, importantly for an MIT audience, the practice of data science must go beyond just the data and the model to the careful consideration of an application’s objectives, its security, privacy, abuse, and resilience risks, and even the understandability it conveys to humans. Within this expansive notion of context, we finally explain that data scientists must also carefully consider ethical trade-offs and societal implications.

    Q: How did you keep focus throughout the process?

    A: Much like in open-source projects, I played both the coordinating author role and also the role of overall librarian of all the material, but we all made significant contributions. Chris Wiggins is very knowledgeable on the Belmont principles and applied ethics; he was the major contributor of those sections. Peter Norvig, as the coauthor of a bestselling AI textbook, was particularly involved in the sections on building models and causality. Jeannette Wing worked with me very closely on our seven-element Analysis Rubric and recognized that a checklist for data science practitioners would end up being one of our book’s most important contributions. 

    From a nuts-and-bolts perspective, we wrote the book during Covid, using one large shared Google doc with weekly video conferences. Amazingly enough, Chris, Jeannette, and I didn’t meet in person at all, and Peter and I met only once — sitting outdoors on a wooden bench on the Stanford campus.

    Q: That is an unusual way to write a book! Do you recommend it?

    A: It would be nice to have had more social interaction, but a shared document, at least with a coordinating author, worked pretty well for something up to this size. The benefit is that we always had a single, coherent textual base, not dissimilar to how a programming team works together.

    This is a condensed, edited version of a longer interview that originally appeared on the MIT EECS website.

  • in

    3 Questions: Why cybersecurity is on the agenda for corporate boards of directors

    Organizations of every size and in every industry are vulnerable to cybersecurity risks — a dynamic landscape of threats and vulnerabilities and a corresponding overload of possible mitigating controls. MIT Senior Lecturer Keri Pearlson, who is also the executive director of the research consortium Cybersecurity at MIT Sloan (CAMS) and an instructor for the new MIT Sloan Executive Education course Cybersecurity Governance for the Board of Directors, knows how business can get ahead of this risk. Here, she describes the current threat and explores how boards can mitigate their risk against cybercrime.

    Q: What does the current state of cyberattacks mean for businesses in 2023?

    A: Last year we were discussing how the pandemic heightened fear, uncertainty, doubt and chaos, opening new doors for malicious actors to do their cyber mischief in our organizations and our families. We saw an increase in ransomware and other cyber attacks, and we saw an increase in concern from operating executives and boards of directors wondering how to keep the organization secure. Since then, we have seen a continued escalation of cyber incidents, many of which no longer make the headlines unless they are wildly unique, damaging, or different than previous incidents. For every new technology that cybersecurity professionals invent, it’s only a matter of time until malicious actors find a way around it. New leadership approaches are needed for 2023 as we move into the next phase of securing our organizations.

    In great part, this means ensuring deep cybersecurity competencies on our boards of directors. Cyber risk is so significant that a responsible board can no longer ignore it or just delegate it to risk management experts. In fact, an organization’s board of directors holds a uniquely vital role in safeguarding data and systems for the future because of their fiduciary responsibility to shareholders and their responsibility to oversee and mitigate business risk.

    As these cyber threats increase, and as companies bolster their cybersecurity budgets accordingly, the regulatory community is also advancing new requirements of companies. In March of this year, the SEC issued a proposed rule titled Cybersecurity Risk Management, Strategy, Governance, and Incident Disclosure. In it, the SEC describes its intention to require public companies to disclose whether their boards have members with cybersecurity expertise. Specifically, registrants will be required to disclose whether the entire board, a specific board member, or a board committee is responsible for the oversight of cyber risks; the processes by which the board is informed about cyber risks, and the frequency of its discussions on this topic; and whether and how the board or specified board committee considers cyber risks as part of its business strategy, risk management, and financial oversight.

    Q: How can boards help their organizations mitigate cyber risk?

    A: According to the studies I’ve conducted with my CAMS colleagues, most organizations focus on cyber protection rather than cyber resilience, and we believe that is a mistake. A company that invests only in protection is not managing the risk associated with getting up and running again in the event of a cyber incident, and they are not going to be able to respond appropriately to new regulations, either. Resiliency means having a practical plan for recovery and business continuation.

    Certainly, protection is part of the resilience equation, but if the pandemic taught us anything, it taught us that resilience is the ability to weather an attack and recover quickly with minimal impact to our operations. The ultimate goal of a cyber-resilient organization would be zero disruption from a cyber breach — no impact on operations, finances, technologies, supply chain or reputation. Board members should ask, What would it take for this to be the case? And they should ensure that executives and managers have made proper and appropriate preparations to respond and recover.

    Being a knowledgeable board member does not mean becoming a cybersecurity expert, but it does mean understanding basic concepts, risks, frameworks, and approaches. And it means having the ability to assess whether management appropriately comprehends related threats, has an appropriate cyber strategy, and can measure its effectiveness. Board members today require focused training on these critical areas to carry out their mission. Unfortunately, many enterprises fail to leverage their boards of directors in this capacity or prepare board members to actively contribute to strategy, protocols, and emergency action plans.

    Alongside my CAMS colleagues Stuart Madnick and Kevin Powers, I’m teaching a new MIT Sloan Executive Education course, Cybersecurity Governance for the Board of Directors, designed to help organizations and their boards get up to speed. Participants will explore the board’s role in cybersecurity, as well as breach planning, response, and mitigation. And we will discuss the impact and requirements of the many new regulations coming forward, not just from the SEC, but also from the White House, Congress, and most states and countries around the world, which are imposing more high-level responsibilities on companies.

    Q: What are some examples of how companies, and specifically boards of directors, have successfully upped their cybersecurity game?

    A: To ensure boardroom skills reflect the patterns of the marketplace, companies such as FedEx, Hasbro, PNC, and UPS have transformed their approach to governing cyber risk, starting with board cyber expertise. In companies like these, building resiliency started with a clear plan — from the boardroom — built on business and economic analysis.

    In one company we looked at, the CEO realized his board was not well versed in the business context or financial exposure risk from a cyber attack, so he hired a third-party consulting firm to conduct a cybersecurity maturity assessment. The company CISO presented the results of the report to the enterprise risk management subcommittee, creating a productive dialogue around the business and financial impact of different investments in cybersecurity.  

    Another organization focused their board on the alignment of their cybersecurity program and operational risk. The CISO, chief risk officer, and board collaborated to understand the exposure of the organization from a risk perspective, resulting in optimizing their cyber insurance policy to mitigate the newly understood risk.

    One important takeaway from these examples is the importance of using the language of risk, resiliency, and reputation to bridge the gaps between technical cybersecurity needs and the oversight responsibilities executed by boards. Boards need to understand the financial exposure resulting from cyber risk, not just the technical components typically found in cyber presentations.

    Cyber risk is not going away. It’s escalating and becoming more sophisticated every day. Getting your board “on board” is key to meeting new guidelines, providing sufficient oversight to cybersecurity plans, and making organizations more resilient.

  • in

    Q&A: Global challenges surrounding the deployment of AI

    The AI Policy Forum (AIPF) is an initiative of the MIT Schwarzman College of Computing to move the global conversation about the impact of artificial intelligence from principles to practical policy implementation. Formed in late 2020, AIPF brings together leaders in government, business, and academia to develop approaches to address the societal challenges posed by the rapid advances and increasing applicability of AI.

    The co-chairs of the AI Policy Forum are Aleksander Madry, the Cadence Design Systems Professor; Asu Ozdaglar, deputy dean of academics for the MIT Schwarzman College of Computing and head of the Department of Electrical Engineering and Computer Science; and Luis Videgaray, senior lecturer at MIT Sloan School of Management and director of MIT AI Policy for the World Project. Here, they discuss some of the key issues facing the AI policy landscape today and the challenges surrounding the deployment of AI. The three are co-organizers of the upcoming AI Policy Forum Summit on Sept. 28, which will further explore the issues discussed here.

    Q: Can you talk about the ongoing work of the AI Policy Forum and the AI policy landscape generally?

    Ozdaglar: There is no shortage of discussion about AI at different venues, but conversations are often high-level, focused on questions of ethics and principles, or on policy problems alone. The approach the AIPF takes to its work is to target specific questions with actionable policy solutions and engage with the stakeholders working directly in these areas. We work “behind the scenes” with smaller focus groups to tackle these challenges and aim to bring visibility to some potential solutions alongside the players working directly on them through larger gatherings.

    Q: AI impacts many sectors, which makes us naturally worry about its trustworthiness. Are there any emerging best practices for development and deployment of trustworthy AI?

    Madry: The most important thing to understand regarding deploying trustworthy AI is that AI technology isn’t some natural, preordained phenomenon. It is something built by people. People who are making certain design decisions.

    We thus need to advance research that can guide these decisions as well as provide more desirable solutions. But we also need to be deliberate and think carefully about the incentives that drive these decisions. 

    Now, these incentives stem largely from business considerations, but not exclusively so. That is, we should also recognize that proper laws and regulations, as well as thoughtful industry standards, have a big role to play here too.

    Indeed, governments can put in place rules that prioritize the value of deploying AI while being keenly aware of the corresponding downsides, pitfalls, and impossibilities. The design of such rules will be an ongoing and evolving process as the technology continues to improve and change, and we need to adapt to socio-political realities as well.

    Q: Perhaps one of the most rapidly evolving domains in AI deployment is in the financial sector. From a policy perspective, how should governments, regulators, and lawmakers make AI work best for consumers in finance?

    Videgaray: The financial sector is seeing a number of trends that present policy challenges at the intersection of AI systems. For one, there is the issue of explainability. By law (in the U.S. and in many other countries), lenders need to provide explanations to customers when they take actions deleterious in whatever way, like denial of a loan, to a customer’s interest. However, as financial services increasingly rely on automated systems and machine learning models, the capacity of banks to unpack the “black box” of machine learning to provide that level of mandated explanation becomes tenuous. So how should the finance industry and its regulators adapt to this advance in technology? Perhaps we need new standards and expectations, as well as tools to meet these legal requirements.

    Meanwhile, economies of scale and data network effects are leading to a proliferation of AI outsourcing, and more broadly, AI-as-a-service is becoming increasingly common in the finance industry. In particular, we are seeing fintech companies provide the tools for underwriting to other financial institutions — be it large banks or small, local credit unions. What does this segmentation of the supply chain mean for the industry? Who is accountable for the potential problems in AI systems deployed through several layers of outsourcing? How can regulators adapt to guarantee their mandates of financial stability, fairness, and other societal standards?

    Q: Social media is one of the most controversial sectors of the economy, resulting in many societal shifts and disruptions around the world. What policies or reforms might be needed to best ensure social media is a force for public good and not public harm?

    Ozdaglar: The role of social media in society is of growing concern to many, but the nature of these concerns can vary quite a bit, with some seeing social media as not doing enough to prevent, for example, misinformation and extremism, and others seeing it as unduly silencing certain viewpoints. This lack of a unified view on what the problem is impacts the capacity to enact any change. All of that is additionally coupled with the complexities of the legal framework in the U.S. spanning the First Amendment, Section 230 of the Communications Decency Act, and trade laws.

    However, these difficulties in regulating social media do not mean that there is nothing to be done. Indeed, regulators have begun to tighten their control over social media companies, both in the United States and abroad, be it through antitrust procedures or other means. In particular, Ofcom in the U.K. and the European Union are already introducing new layers of oversight to platforms. Additionally, some have proposed taxes on online advertising to address the negative externalities caused by the current social media business model. So, the policy tools are there, if the political will and proper guidance exist to implement them.

  • in

    3 Questions: Marking the 10th anniversary of the Higgs boson discovery

    This July 4 marks 10 years since the discovery of the Higgs boson, the long-sought particle that imparts mass to all elementary particles. The elusive particle was the last missing piece in the Standard Model of particle physics, which is our most complete model of the universe.

    In early summer of 2012, signs of the Higgs particle were detected in the Large Hadron Collider (LHC), the world’s largest particle accelerator, which is operated by CERN, the European Organization for Nuclear Research. The LHC is engineered to smash together billions upon billions of protons for the chance at producing the Higgs boson and other particles that are predicted to have been created in the early universe.

    In analyzing the products of countless proton-on-proton collisions, scientists registered a Higgs-like signal in the accelerator’s two independent detectors, ATLAS and CMS (the Compact Muon Solenoid). Specifically, the teams observed signs that a new particle had been created and then decayed to two photons, two Z bosons or two W bosons, and that this new particle was likely the Higgs boson.

    The discovery was revealed within the CMS collaboration, including over 3,000 scientists, on June 15, and ATLAS and CMS announced their respective observations to the world on July 4. More than 50 MIT physicists and students contributed to the CMS experiment, including Christoph Paus, professor of physics, who was one of the experiment’s two lead investigators to organize the search for the Higgs boson.

    As the LHC prepares to start back up on July 5 with “Run 3,” MIT News spoke with Paus about what physicists have learned about the Higgs particle in the last 10 years, and what they hope to discover with this next deluge of particle data.

    Q: Looking back, what do you remember as the key moments leading up to the Higgs boson’s discovery?

    A: I remember that by the end of 2011, we had taken a significant amount of data, and there were some first hints that there could be something, but nothing that was conclusive enough. It was clear to everybody that we were entering the critical phase of a potential discovery. We still wanted to improve our searches, and so we decided, which I felt was one of the most important decisions we took, that we had to remove the bias — that is, remove our knowledge about where the signal could appear. Because it’s dangerous as a scientist to say, “I know the solution,” which can influence the result unconsciously. So, we made that decision together in the coordination group and said, we are going to get rid of this bias by doing what people refer to as a “blind” analysis. This allowed the analyzers to focus on the technical aspects, making sure everything was correct without having to worry about being influenced by what they saw.

    Then, of course, there had to be the moment where we unblind the data and really look to see, is the Higgs there or not. And about two weeks before the scheduled presentations on July 4 where we eventually announced the discovery, there was a meeting on June 15 to show the analysis with its results to the collaboration. The most significant analysis turned out to be the two-photon analysis. One of my students, Joshua Bendavid PhD ’13, was leading that analysis, and the night before the meeting, only he and another person on the team were allowed to unblind the data. They were working until 2 in the morning, when they finally pushed a button to see what it looks like. And they were the first in CMS to have that moment of seeing that [the Higgs boson] was there. Another student of mine who was working on this analysis, Mingming Yang PhD ’15, presented the results of that search to the Collaboration at CERN that following afternoon. It was a very exciting moment for all of us. The room was hot and filled with electricity.

    The scientific process of the discovery was very well-designed and executed, and I think it can serve as a blueprint for how people should do such searches.

    Q: What more have scientists learned of the Higgs boson since the particle’s detection?

    A: At the time of the discovery, something interesting happened I did not really expect. While we were always talking about the Higgs boson before, we became very careful once we saw that “narrow peak.” How could we be sure that it was the Higgs boson and not something else? It certainly looked like the Higgs boson, but our vision was quite blurry. It could have turned out in the following years that it was not the Higgs boson. But as we now know, with so much more data, everything is completely consistent with what the Higgs boson is predicted to look like, so we became comfortable with calling the narrow resonance not just a Higgs-like particle but rather simply the Higgs boson. And there were a few milestones that made sure this is really the Higgs as we know it.

    The initial discovery was based on Higgs bosons decaying to two photons, two Z bosons or two W bosons. That was only a small fraction of decays that the Higgs could undergo. There are many more. The number of decays of the Higgs boson into a particular set of particles depends critically on their masses. This characteristic feature is essential to confirm that we are really dealing with the Higgs boson.

    What we found since then is that the Higgs boson does not only decay to bosons, but also to fermions, which is not obvious because bosons are force carrier particles while fermions are matter particles. The first new decay was the decay to tau leptons, the heavier sibling of the electron. The next step was the observation of the Higgs boson decaying to b quarks, the heaviest quark that the Higgs can decay to. The b quark is the heaviest sibling of the down quark, which is a building block of protons and neutrons and thus all atomic nuclei around us. These two fermions are part of the heaviest generation of fermions in the Standard Model. Only recently the Higgs boson was observed to decay to muons, the charged lepton of the second and thus lighter generation, at the expected rate. Also, the direct coupling to the top quark, the heaviest quark, was established, which together with the muons spans four orders of magnitude in terms of their masses, and the Higgs coupling behaves as expected over this wide range.

    Q: As the Large Hadron Collider gears up for its new “Run 3,” what do you hope to discover next?

    A: One very interesting question that Run 3 might give us some first hints on is the self-coupling of the Higgs boson. As the Higgs couples to any massive particle, it can also couple to itself. It is unlikely that there is enough data to make a discovery, but first hints of this coupling would be very exciting to see, and this constitutes a fundamentally different test than what has been done so far.

    Another interesting aspect that more data will help to elucidate is the question of whether the Higgs boson might be a portal and decay to invisible particles that could be candidates for explaining the mystery of dark matter in the universe. This is not predicted in our Standard Model and thus would unveil the Higgs boson as an imposter.

    Of course, we want to double down on all the measurements we have made so far and see whether they continue to line up with our expectations.

    This is true also for the upcoming major upgrade of the LHC (runs starting in 2029) for what we refer to as the High Luminosity LHC (HL-LHC). Another factor of 10 more events will be accumulated during this program, which for the Higgs boson means we will be able to observe its self-coupling. For the far future, there are plans for a Future Circular Collider, which could ultimately measure the total decay width of the Higgs boson independent of its decay mode, which would be another important and very precise test whether the Higgs boson is an imposter.

    Like any other good physicist, I hope though that we can find a crack in the armor of the Standard Model, which is so far holding up all too well. There are a number of very important observations, for example the nature of dark matter, that cannot be explained using the Standard Model. All of our future studies, from Run 3 starting on July 5 to the far-future FCC, will give us access to entirely uncharted territory. New phenomena can pop up, and I like to be optimistic.

  • in

    3 Questions: Designing software for research ethics

    Data are arguably the world’s hottest form of currency, clocking in zeros and ones that hold more weight than ever before. But with all of our personal information being crunched into dynamite for enterprise solutions and the like, with a lack of consumer data protection, are we all getting left behind?

    Jonathan Zong, a PhD candidate in electrical engineering and computer science at MIT, and an affiliate of the Computer Science and Artificial Intelligence Laboratory, thinks consent can be baked into the design of the software that gathers our data for online research. He created Bartleby, a system for debriefing research participants and eliciting their views about social media research that involved them. Using Bartleby, he says, researchers can automatically direct each of their study participants to a website where they can learn about their involvement in research, view what data researchers collected about them, and give feedback. Most importantly, participants can use the website to opt out and request to delete their data.  

    Zong and his co-author, Nathan Matias SM ’13, PhD ’17, evaluated Bartleby by debriefing thousands of participants in observational and experimental studies on Twitter and Reddit. They found that Bartleby addresses procedural concerns by creating opportunities for participants to exercise autonomy, and the tool enabled substantive, value-driven conversations about participant voice and power. Here, Zong discusses the implications of their recent work as well as the future of social, ethical, and responsible computing.

    Q: Many leading tech ethicists and policymakers believe it’s impossible to keep people informed about their involvement in research and how their data are used. How has your work changed that?

    A: When Congress asked Mark Zuckerberg in 2018 about Facebook’s obligations to keep users informed about how their data is used, his answer was effectively that all users had the opportunity to read the privacy policy, and that being any clearer would be too difficult. Tech elites often blanket-statement that ethics is complicated, and proceed with their objective anyway. Many have claimed it’s impossible to fulfill ethical responsibilities to users at scale, so why try? But by creating Bartleby, a system for debriefing participants and eliciting their views about studies that involved them, we built something that shows that it’s not only very possible, but actually pretty easy to do. In a lot of situations, letting people know we want their data and explaining why we think it’s worth it is the bare minimum we could be doing.

    Q: Can ethical challenges be solved with a software tool?

    A: Off-the-shelf software actually can make a meaningful difference in respecting people’s autonomy. Ethics regulations almost never require a debriefing process for online studies. But because we used Bartleby, people had a chance to make an informed decision. It’s a chance they otherwise wouldn’t have had.

    At the same time, we realized that using Bartleby shined a light on deeper ethics questions that required substantive reflection. For example, most people are just trying to go about their lives and ignore the messages we send them, while others reply with concerns that aren’t even always about the research. Even if indirectly, these instances help signal nuances that research participants care about.

    Where might our values as researchers differ from participants’ values? How do the power structures that shape researchers’ interaction with users and communities affect our ability to see those differences? Using software to deliver ethics procedures helps bring these questions to light. But rather than expecting definitive answers that work in every situation, we should be thinking about how using software to create opportunities for participant voice and power challenges and invites us to reflect on how we address conflicting values.

    Q: How does your approach to design help suggest a way forward for social, ethical, and responsible computing?

    A: In addition to presenting the software tool, our peer-reviewed article on Bartleby also demonstrates a theoretical framework for data ethics, inspired by ideas in feminist philosophy. Because my work spans software design, empirical social science, and philosophy, I often think about the things I want people to take away in terms of interdisciplinary bridges I want to build. 

    I hope people look at Bartleby and see that ethics is an exciting area for technical innovation that can be tested empirically, guided by a clear-headed understanding of values. Umberto Eco, a philosopher, wrote that “form must not be a vehicle for thought, it must be a way of thinking.” In other words, designing software isn’t just about putting ideas we’ve already had into a computational form. Design is also a way we can think new ideas into existence, produce new ways of knowing and doing, and imagine alternative futures.

  • in

    MIT announces five flagship projects in first-ever Climate Grand Challenges competition

    MIT today announced the five flagship projects selected in its first-ever Climate Grand Challenges competition. These multiyear projects will define a dynamic research agenda focused on unraveling some of the toughest unsolved climate problems and bringing high-impact, science-based solutions to the world on an accelerated basis.

    Representing the most promising concepts to emerge from the two-year competition, the five flagship projects will receive additional funding and resources from MIT and others to develop their ideas and swiftly transform them into practical solutions at scale.

    “Climate Grand Challenges represents a whole-of-MIT drive to develop game-changing advances to confront the escalating climate crisis, in time to make a difference,” says MIT President L. Rafael Reif. “We are inspired by the creativity and boldness of the flagship ideas and by their potential to make a significant contribution to the global climate response. But given the planet-wide scale of the challenge, success depends on partnership. We are eager to work with visionary leaders in every sector to accelerate this impact-oriented research, implement serious solutions at scale, and inspire others to join us in confronting this urgent challenge for humankind.”

    Brief descriptions of the five Climate Grand Challenges flagship projects are provided below.

    Bringing Computation to the Climate Challenge

    This project leverages advances in artificial intelligence, machine learning, and data sciences to improve the accuracy of climate models and make them more useful to a variety of stakeholders — from communities to industry. The team is developing a digital twin of the Earth that harnesses more data than ever before to reduce and quantify uncertainties in climate projections.

    Research leads: Raffaele Ferrari, the Cecil and Ida Green Professor of Oceanography in the Department of Earth, Atmospheric and Planetary Sciences, and director of the Program in Atmospheres, Oceans, and Climate; and Noelle Eckley Selin, director of the Technology and Policy Program and professor with a joint appointment in the Institute for Data, Systems, and Society and the Department of Earth, Atmospheric and Planetary Sciences

    Center for Electrification and Decarbonization of Industry

    This project seeks to reinvent and electrify the processes and materials behind hard-to-decarbonize industries like steel, cement, ammonia, and ethylene production. A new innovation hub will perform targeted fundamental research and engineering with urgency, pushing the technological envelope on electricity-driven chemical transformations.

    Research leads: Yet-Ming Chiang, the Kyocera Professor of Materials Science and Engineering, and Bilge Yıldız, the Breene M. Kerr Professor in the Department of Nuclear Science and Engineering and professor in the Department of Materials Science and Engineering

    Preparing for a new world of weather and climate extremes

    This project addresses key gaps in knowledge about intensifying extreme events such as floods, hurricanes, and heat waves, and quantifies their long-term risk in a changing climate. The team is developing a scalable climate-change adaptation toolkit to help vulnerable communities and low-carbon energy providers prepare for these extreme weather events.

    Research leads: Kerry Emanuel, the Cecil and Ida Green Professor of Atmospheric Science in the Department of Earth, Atmospheric and Planetary Sciences and co-director of the MIT Lorenz Center; Miho Mazereeuw, associate professor of architecture and urbanism in the Department of Architecture and director of the Urban Risk Lab; and Paul O’Gorman, professor in the Program in Atmospheres, Oceans, and Climate in the Department of Earth, Atmospheric and Planetary Sciences

    The Climate Resilience Early Warning System

    The CREWSnet project seeks to reinvent climate change adaptation with a novel forecasting system that empowers underserved communities to interpret local climate risk, proactively plan for their futures incorporating resilience strategies, and minimize losses. CREWSnet will initially be demonstrated in southwestern Bangladesh, serving as a model for similarly threatened regions around the world.

    Research leads: John Aldridge, assistant leader of the Humanitarian Assistance and Disaster Relief Systems Group at MIT Lincoln Laboratory, and Elfatih Eltahir, the H.M. King Bhumibol Professor of Hydrology and Climate in the Department of Civil and Environmental Engineering

    Revolutionizing agriculture with low-emissions, resilient crops

    This project works to revolutionize the agricultural sector with climate-resilient crops and fertilizers that have the ability to dramatically reduce greenhouse gas emissions from food production.

    Research lead: Christopher Voigt, the Daniel I.C. Wang Professor in the Department of Biological Engineering

    “As one of the world’s leading institutions of research and innovation, it is incumbent upon MIT to draw on our depth of knowledge, ingenuity, and ambition to tackle the hard climate problems now confronting the world,” says Richard Lester, MIT associate provost for international activities. “Together with collaborators across industry, finance, community, and government, the Climate Grand Challenges teams are looking to develop and implement high-impact, path-breaking climate solutions rapidly and at a grand scale.”

    The initial call for ideas in 2020 yielded nearly 100 letters of interest from almost 400 faculty members and senior researchers, representing 90 percent of MIT departments. After an extensive evaluation, 27 finalist teams received a total of $2.7 million to develop comprehensive research and innovation plans. The projects address four broad research themes.

    To select the winning projects, research plans were reviewed by panels of international experts representing relevant scientific and technical domains as well as experts in processes and policies for innovation and scalability.

    “In response to climate change, the world really needs to do two things quickly: deploy the solutions we already have much more widely, and develop new solutions that are urgently needed to tackle this intensifying threat,” says Maria Zuber, MIT vice president for research. “These five flagship projects exemplify MIT’s strong determination to bring its knowledge and expertise to bear in generating new ideas and solutions that will help solve the climate problem.”

    “The Climate Grand Challenges flagship projects set a new standard for inclusive climate solutions that can be adapted and implemented across the globe,” says MIT Chancellor Melissa Nobles. “This competition propels the entire MIT research community — faculty, students, postdocs, and staff — to act with urgency around a worsening climate crisis, and I look forward to seeing the difference these projects can make.”

    “MIT’s efforts on climate research amid the climate crisis was a primary reason that I chose to attend MIT, and remains a reason that I view the Institute favorably. MIT has a clear opportunity to be a thought leader in the climate space in our own MIT way, which is why CGC fits in so well,” says senior Megan Xu, who served on the Climate Grand Challenges student committee and is studying ways to make the food system more sustainable.

    The Climate Grand Challenges competition is a key initiative of “Fast Forward: MIT’s Climate Action Plan for the Decade,” which the Institute published in May 2021. Fast Forward outlines MIT’s comprehensive plan for helping the world address the climate crisis. It consists of five broad areas of action: sparking innovation, educating future generations, informing and leveraging government action, reducing MIT’s own climate impact, and uniting and coordinating all of MIT’s climate efforts.

  • in

    3 Questions: Fotini Christia on racial equity and data science

    Fotini Christia is the Ford International Professor in the Social Sciences in the Department of Political Science, associate director of the Institute for Data, Systems, and Society (IDSS), and director of the Sociotechnical Systems Research Center (SSRC). Her research interests include issues of conflict and cooperation in the Muslim world, and she has conducted fieldwork in Afghanistan, Bosnia, Iran, the Palestinian Territories, Syria, and Yemen. She has co-organized the IDSS Research Initiative on Combatting Systemic Racism (ICSR), which works to bridge the social sciences, data science, and computation by bringing researchers from these disciplines together to address systemic racism across housing, health care, policing, education, employment, and other sectors of society.

    Q: What is the IDSS/ICSR approach to systemic racism research?

    A: The Research Initiative on Combatting Systemic Racism (ICSR) aims to seed and coordinate cross-disciplinary research to identify and overcome racially discriminatory processes and outcomes across a range of U.S. institutions and policy domains.

    Building off the extensive social science literature on systemic racism, the focus of this research initiative is to use big data to develop and harness computational tools that can help effect structural and normative change toward racial equity.

    The initiative aims to create a visible presence at MIT for cutting-edge computational research with a racial equity lens across societal domains, a presence that will attract and train students and scholars.

    The steering committee for this research initiative is composed of underrepresented minority faculty members from across MIT’s five schools and the MIT Schwarzman College of Computing. Members will serve as close advisors to the initiative as well as share the findings of our work beyond MIT’s campus. MIT Chancellor Melissa Nobles heads this committee.

    Q: What role can data science play in helping to effect change toward racial equity?

    A: Existing work has shown racial discrimination in the job market, in the criminal justice system, as well as in education, health care, and access to housing, among other places. It has also underlined how algorithms could further entrench such bias, be it in training data or in the people who build them. Data science tools can help not only to identify racially inequitable outcomes but also to propose fixes for them. These outcomes result from implicit or explicit biases in governing institutional practices in the public and private sector, and more recently from the use of AI and algorithmic methods in decision-making.

    To that effect, this initiative will produce research that explores and collects the relevant big data across domains, while paying attention to the ways such data are collected, and focus on improving and developing data-driven computational tools to address racial disparities in structures and institutions that have reproduced racially discriminatory outcomes in American society.

    The strong correlation between race, class, educational attainment, and various attitudes and behaviors in the American context can make it extremely difficult to rule out the influence of confounding factors. Thus, a key motivation for our research initiative is to highlight the importance of causal analysis using computational methods, and focus on understanding the opportunities of big data and algorithmic decision-making to address racial inequities and promote racial justice — beyond de-biasing algorithms. The intent is to also codify methodologies on equity-informed research practices and produce tools that are clear on the quantifiable expected social costs and benefits, as well as on the downstream effects on systemic racism more broadly.

    Q: What are some ways that the ICSR might conduct or follow up on research seeking real-world impact or policy change?

    A: This type of research has ethical and societal considerations at its core, especially as they pertain to historically disadvantaged groups in the U.S., and will be coordinated with and communicated to local stakeholders to drive relevant policy decisions. This initiative intends to establish connections to URM [underrepresented minority] researchers and students at underrepresented universities and to directly collaborate with them on these research efforts. To that effect, we are leveraging existing programs such as the MIT Summer Research Program (MSRP).

    To ensure that our research targets the right problems, bringing a racial equity lens with an interest in effecting policy change, we will also connect with community organizations in minority neighborhoods, which often bear the brunt of the direct and indirect effects of systemic racism, as well as with local government offices that work to address inequity in service provision in these communities. Our intent is to directly engage IDSS students with these organizations to help develop and test algorithmic tools for racial equity.

  • in

    3 Questions: What a single car can say about traffic

    Vehicle traffic has long defied description. Once measured roughly through visual inspection and traffic cameras, traffic is now quantified far more precisely by new smartphone crowdsourcing tools. This popular method, however, also presents a problem: Accurate measurements require a lot of data and users.

    Meshkat Botshekan, an MIT PhD student in civil and environmental engineering and research assistant at the MIT Concrete Sustainability Hub, has sought to expand on crowdsourcing methods by looking into the physics of traffic. During his time as a doctoral candidate, he has helped develop Carbin, a smartphone-based roadway crowdsourcing tool created by MIT CSHub and the University of Massachusetts Dartmouth, and used its data to offer more insight into the physics of traffic — from the formation of traffic jams to the inference of traffic phase and driving behavior. Here, he explains how recent findings can allow smartphones to infer traffic properties from the measurements of a single vehicle.  

    Q: Numerous navigation apps already measure traffic. Why do we need alternatives?

    A: Traffic characteristics have always been tough to measure. In the past, visual inspection and cameras were used to produce traffic metrics. So, there’s no denying that today’s navigation apps offer a superior alternative. Yet even these modern tools have gaps.

    Chief among them is their dependence on spatially distributed user counts: Essentially, these apps tally up their users on road segments to estimate the density of traffic. While this approach may seem adequate, it is vulnerable to manipulation, as demonstrated in some viral videos, and it requires immense quantities of data for reliable estimates. Processing these data is so time- and resource-intensive that, despite their availability, they can’t be used to quantify traffic effectively across a whole road network. As a result, this immense quantity of traffic data isn’t actually optimal for traffic management.
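
    The user-count approach described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular navigation app works: the segment names, lengths, and the function `density_by_segment` are all hypothetical.

```python
from collections import Counter


def density_by_segment(user_segments, segment_length_km):
    """Estimate app users per kilometer on each road segment.

    user_segments: one segment ID per reporting user (hypothetical input).
    segment_length_km: mapping of segment ID to its length in kilometers.
    """
    counts = Counter(user_segments)
    return {
        segment: counts.get(segment, 0) / length
        for segment, length in segment_length_km.items()
    }


# Hypothetical road network and user reports.
segment_length_km = {"A": 2.0, "B": 0.5, "C": 1.0}
user_segments = ["A", "A", "B", "A", "B", "A"]

densities = density_by_segment(user_segments, segment_length_km)
print(densities)
```

    Because the estimate is only a tally of reports, every extra report on a segment raises that segment's apparent density, whether or not it corresponds to a real vehicle. The estimate also only stabilizes when many users are reporting.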

    Q: How could new technologies improve how we measure traffic?

    A: New alternatives have the potential to offer two improvements over existing methods: First, they can extrapolate far more about traffic with far fewer data. Second, they can cost a fraction of the price while offering a far simpler method of data collection. Just like Waze and Google Maps, they rely on crowdsourcing data from users. Yet, they are grounded in the incorporation of high-level statistical physics into data analysis.

    For instance, the Carbin app, which we are developing in collaboration with UMass Dartmouth, applies principles of statistical physics to existing traffic models to entirely forgo the need for user counts. Instead, it can infer traffic density and driver behavior using the input of a smartphone mounted in a single vehicle.

    The method at the heart of the app, which was published last fall in Physical Review E, treats vehicles like particles in a many-body system. Just as the behavior of a closed many-body system can be understood through observing the behavior of an individual particle relying on the ergodic theorem of statistical physics, we can characterize traffic through the fluctuations in speed and position of a single vehicle across a road. As a result, we can infer the behavior and density of traffic on a segment of a road.
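
    The ergodic idea can be illustrated with a toy simulation. The sketch below is not the published Carbin method. It assumes a simple stand-in model in which each vehicle's speed fluctuates randomly around a common mean, and the function `simulate_speeds` and all parameter values are hypothetical. It compares the long-run average of one vehicle with a snapshot average over many vehicles.

```python
import math
import random


def simulate_speeds(n_steps, mean_speed=25.0, phi=0.9, noise_sd=1.0, rng=None):
    """Return a speed trace (m/s) from a toy AR(1) model around mean_speed.

    This is an assumed stand-in model, not the traffic model from the paper.
    """
    if rng is None:
        rng = random.Random()
    # Start from the stationary distribution so no burn-in is needed.
    stationary_sd = noise_sd / math.sqrt(1.0 - phi ** 2)
    speed = rng.gauss(mean_speed, stationary_sd)
    trace = []
    for _ in range(n_steps):
        speed = mean_speed + phi * (speed - mean_speed) + rng.gauss(0.0, noise_sd)
        trace.append(speed)
    return trace


rng = random.Random(0)

# Time average: one vehicle observed over a long stretch.
single_vehicle = simulate_speeds(20000, rng=rng)
time_avg = sum(single_vehicle) / len(single_vehicle)

# Ensemble average: many vehicles observed at a single instant.
snapshot = [simulate_speeds(50, rng=rng)[-1] for _ in range(2000)]
ensemble_avg = sum(snapshot) / len(snapshot)

print(f"time average: {time_avg:.2f} m/s, ensemble average: {ensemble_avg:.2f} m/s")
```

    In this toy model the two averages agree closely, which is the property that lets the fluctuations of a single vehicle stand in for the statistics of the surrounding traffic.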

    As far less data is required, this method is faster and simplifies data management. But most importantly, it also has the potential to make traffic data less expensive and more accessible to those that need it.

    Q: Who are some of the parties that would benefit from new technologies?

    A: More accessible and sophisticated traffic data would benefit more than just drivers seeking smoother, faster routes. It would also enable state and city departments of transportation (DOTs) to make local and collective interventions that advance the critical transportation objectives of equity, safety, and sustainability.

    As a safety solution, new data collection technologies could pinpoint dangerous driving conditions on a much finer scale to inform improved traffic calming measures. And since socially vulnerable communities experience traffic violence disproportionately, these interventions would have the added benefit of addressing pressing equity concerns. 

    There would also be an environmental benefit. DOTs could mitigate vehicle emissions by identifying minute deviations in traffic flow. This would present them with more opportunities to mitigate the idling and congestion that generate excess fuel consumption.  

    As we’ve seen, these three challenges have become increasingly acute, especially in urban areas. Yet, the data needed to address them exists already, and is being gathered by smartphones and telematics devices all over the world. So, to ensure a safer, more sustainable road network, it will be crucial to incorporate these data collection methods into our decision-making.