More stories

  •

    Bringing the social and ethical responsibilities of computing to the forefront

    There has been a remarkable surge in the use of algorithms and artificial intelligence to address a wide range of problems and challenges. While their adoption, particularly with the rise of AI, is reshaping nearly every industry sector, discipline, and area of research, such innovations often expose unexpected consequences that involve new norms, new expectations, and new rules and laws.

    To facilitate deeper understanding, the Social and Ethical Responsibilities of Computing (SERC), a cross-cutting initiative in the MIT Schwarzman College of Computing, recently brought together social scientists and humanists with computer scientists, engineers, and other computing faculty for an exploration of the ways in which the broad applicability of algorithms and AI has presented both opportunities and challenges in many aspects of society.

    “The very nature of our reality is changing. AI has the ability to do things that until recently were solely the realm of human intelligence — things that can challenge our understanding of what it means to be human,” remarked Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing, in his opening address at the inaugural SERC Symposium. “This poses philosophical, conceptual, and practical questions on a scale not experienced since the start of the Enlightenment. In the face of such profound change, we need new conceptual maps for navigating the change.”

    The symposium offered a glimpse into the vision and activities of SERC in both research and education. “We believe our responsibility with SERC is to educate and equip our students and enable our faculty to contribute to responsible technology development and deployment,” said Georgia Perakis, the William F. Pounds Professor of Management in the MIT Sloan School of Management, co-associate dean of SERC, and the lead organizer of the symposium. “We’re drawing from the many strengths and diversity of disciplines across MIT and beyond and bringing them together to gain multiple viewpoints.”

    Through a succession of panels and sessions, the symposium delved into a variety of topics related to the societal and ethical dimensions of computing. In addition, 37 undergraduate and graduate students from a range of majors, including urban studies and planning, political science, mathematics, biology, electrical engineering and computer science, and brain and cognitive sciences, participated in a poster session to exhibit their research in this space, covering such topics as quantum ethics, AI collusion in storage markets, computing waste, and empowering users on social platforms for better content credibility.

    Showcasing a diversity of work

    In three sessions devoted to themes of beneficent and fair computing, equitable and personalized health, and algorithms and humans, the SERC Symposium showcased work by 12 faculty members across these domains.

    One such project from a multidisciplinary team of archaeologists, architects, digital artists, and computational social scientists aimed to preserve endangered heritage sites in Afghanistan with digital twins. The project team produced highly detailed interrogable 3D models of the heritage sites, in addition to extended reality and virtual reality experiences, as learning resources for audiences that cannot access these sites.

    In a project for the United Network for Organ Sharing, researchers showed how they used applied analytics to optimize various facets of an organ allocation system in the United States that is currently undergoing a major overhaul in order to make it more efficient, equitable, and inclusive for different racial, age, and gender groups, among others.

    Another talk discussed an area that has not yet received adequate public attention: the broader implications for equity that biased sensor data holds for the next generation of models in computing and health care.

    A talk on bias in algorithms considered both human bias and algorithmic bias, and the potential for improving results by taking into account differences in the nature of the two kinds of bias.

    Other highlighted research included the interaction between online platforms and human psychology; a study on whether decision-makers make systematic prediction mistakes based on the available information; and an illustration of how advanced analytics and computation can be leveraged to inform supply chain management, operations, and regulatory work in the food and pharmaceutical industries.

    Improving the algorithms of tomorrow

    “Algorithms are, without question, impacting every aspect of our lives,” said Asu Ozdaglar, deputy dean of academics for the MIT Schwarzman College of Computing and head of the Department of Electrical Engineering and Computer Science, in kicking off a panel she moderated on the implications of data and algorithms.

    “Whether it’s in the context of social media, online commerce, automated tasks, and now a much wider range of creative interactions with the advent of generative AI tools and large language models, there’s little doubt that much more is to come,” Ozdaglar said. “While the promise is evident to all of us, there’s a lot to be concerned about as well. This is very much time for imaginative thinking and careful deliberation to improve the algorithms of tomorrow.”

    Turning to the panel, Ozdaglar asked experts from computing, social science, and data science for insights on how to understand what is to come and shape it to enrich outcomes for the majority of humanity.

    Sarah Williams, associate professor of technology and urban planning at MIT, emphasized the critical importance of understanding how datasets are assembled, as data are the foundation for all models. She also stressed the need for research on the potential implications of biases, which often find their way into algorithms through their creators and the data used in development. “It’s up to us to think about our own ethical solutions to these problems,” she said. “Just as it’s important to progress with the technology, we need to start looking at these questions: What biases are in the algorithms? What biases are in the data, or in that data’s journey?”

    Shifting focus to generative models and whether the development and use of these technologies should be regulated, the panelists — who also included MIT’s Srini Devadas, professor of electrical engineering and computer science; John Horton, professor of information technology; and Simon Johnson, professor of entrepreneurship — all concurred that regulating open-source algorithms, which are publicly accessible, would be difficult, given that regulators are still catching up and struggling to set guardrails for technology that is now 20 years old.

    Returning to the question of how to effectively regulate the use of these technologies, Johnson proposed a progressive corporate tax system as a potential solution. He recommended basing companies’ tax payments on their profits, especially for large corporations whose massive earnings go largely untaxed due to offshore banking. Johnson said such a tax could serve as a regulatory mechanism, imposing disincentives that discourage companies from trying to “own the entire world.”

    The role of ethics in computing education

    As computing continues to advance with no signs of slowing down, it is critical to educate students to be intentional about the social impact of the technologies they will be developing and deploying into the world. But can one actually be taught such things? If so, how?

    Caspar Hare, professor of philosophy at MIT and co-associate dean of SERC, posed this looming question to faculty on a panel he moderated on the role of ethics in computing education. All experienced in teaching ethics and thinking about the social implications of computing, each panelist shared their perspective and approach.

    A strong advocate for the importance of learning from history, Eden Medina, associate professor of science, technology, and society at MIT, said that “often the way we frame computing is that everything is new. One of the things that I do in my teaching is look at how people have confronted these issues in the past and try to draw from them as a way to think about possible ways forward.” Medina regularly uses case studies in her classes. As an example of how decisions around technology and data can grow out of very specific contexts, she pointed to a paper by Yale University science historian Joanna Radin on the Pima Indian Diabetes Dataset, which raised ethical issues in that dataset’s history that many researchers don’t consider.

    Milo Phillips-Brown, associate professor of philosophy at Oxford University, talked about the Ethical Computing Protocol that he co-created while he was a SERC postdoc at MIT. The protocol, a four-step approach to building technology responsibly, is designed to train computer science students to think more carefully and accurately about the social implications of technology by breaking the process down into manageable steps. “The basic approach that we take very much draws on the fields of value-sensitive design, responsible research and innovation, and participatory design as guiding insights, and then is also fundamentally interdisciplinary,” he said.

    Fields such as biomedicine and law have an ethics ecosystem that distributes the function of ethical reasoning in these areas. Oversight and regulation are provided to guide front-line stakeholders and decision-makers when issues arise, as are training programs and access to interdisciplinary expertise that they can draw from. “In this space, we have none of that,” said John Basl, associate professor of philosophy at Northeastern University. “For current generations of computer scientists and other decision-makers, we’re actually making them do the ethical reasoning on their own.” Basl commented further that teaching core ethical reasoning skills across the curriculum, not just in philosophy classes, is essential, and that the goal shouldn’t be for every computer scientist to be a professional ethicist, but for them to know enough of the landscape to be able to ask the right questions and seek out the relevant expertise and resources that exist.

    After the final session, interdisciplinary groups of faculty, students, and researchers engaged in animated discussions of the issues covered throughout the day at a reception that marked the conclusion of the symposium.

  •

    Exploring new methods for increasing safety and reliability of autonomous vehicles

    When we think of getting on the road in our cars, our first thoughts may not be that fellow drivers are particularly safe or careful — but human drivers are more reliable than one may expect. For each fatal car crash in the United States, motor vehicles log a whopping hundred million miles on the road.

    Human reliability also plays a role in how autonomous vehicles are integrated in the traffic system, especially around safety considerations. Human drivers continue to surpass autonomous vehicles in their ability to make quick decisions and perceive complex environments: Autonomous vehicles are known to struggle with seemingly common tasks, such as taking on- or off-ramps, or turning left in the face of oncoming traffic. Despite these enormous challenges, embracing autonomous vehicles in the future could yield great benefits, like clearing congested highways; enhancing freedom and mobility for non-drivers; and boosting driving efficiency, an important piece in fighting climate change.

    MIT engineer Cathy Wu envisions ways that autonomous vehicles could be deployed with their current shortcomings, without experiencing a dip in safety. “I started thinking more about the bottlenecks. It’s very clear that the main barrier to deployment of autonomous vehicles is safety and reliability,” Wu says.

    One path forward may be to introduce a hybrid system, in which autonomous vehicles handle easier scenarios on their own, like cruising on the highway, while transferring more complicated maneuvers to remote human operators. Wu, the Gilbert W. Winslow Assistant Professor of Civil and Environmental Engineering (CEE) and a member of the Laboratory for Information and Decision Systems (LIDS) and the MIT Institute for Data, Systems, and Society (IDSS), likens this approach to air traffic controllers on the ground directing commercial aircraft.

    In a paper published April 12 in IEEE Transactions on Robotics, Wu and co-authors Cameron Hickert and Sirui Li (both graduate students at LIDS) introduced a framework for how remote human supervision could be scaled to make a hybrid system efficient without compromising passenger safety. They noted that if autonomous vehicles were able to coordinate with each other on the road, they could reduce the number of moments in which humans needed to intervene.

    Humans and cars: finding a balance that’s just right

    For the project, Wu, Hickert, and Li sought to tackle a maneuver that autonomous vehicles often struggle to complete. They decided to focus on merging, specifically when vehicles use an on-ramp to enter a highway. In real life, merging cars must accelerate or slow down in order to avoid crashing into cars already on the road. In this scenario, if an autonomous vehicle were about to merge into traffic, remote human supervisors could momentarily take control of the vehicle to ensure a safe merge. In order to evaluate the efficiency of such a system, particularly while guaranteeing safety, the team specified the maximum amount of time each human supervisor would be expected to spend on a single merge. They were interested in understanding whether a small number of remote human supervisors could successfully manage a larger group of autonomous vehicles, and the extent to which this human-to-car ratio could be improved while still safely covering every merge.

    With more autonomous vehicles in use, one might assume a need for more remote supervisors. But in scenarios where autonomous vehicles coordinated with each other, the team found that cars could significantly reduce the number of times humans needed to step in. For example, a coordinating autonomous vehicle already on a highway could adjust its speed to make room for a merging car, eliminating a risky merging situation altogether.

    The team substantiated the potential to safely scale remote supervision in two theorems. First, using a mathematical framework known as queuing theory, the researchers formulated an expression for the probability that a given number of supervisors fails to handle all merges pooled together from multiple cars. This allowed them to assess how many remote supervisors would be needed to cover every potential merge conflict, depending on the number of autonomous vehicles in use. The researchers derived a second theorem to quantify the reliability gains from cooperative autonomous vehicles that influence surrounding traffic to assist cars attempting to merge.
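
    The intuition behind pooling merges across many vehicles can be illustrated with a standard queuing-theory calculation. The sketch below is not the authors' actual formulation; it uses the classical Erlang-B loss formula, with entirely hypothetical arrival rates and supervision times, to ask how many supervisors keep the probability of an unattended merge below a target.

```python
def erlang_b(supervisors: int, offered_load: float) -> float:
    """Probability that an arriving merge request finds every supervisor
    busy, per the Erlang-B loss formula (numerically stable recurrence)."""
    blocking = 1.0
    for k in range(1, supervisors + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking

def supervisors_needed(offered_load: float, max_miss_prob: float = 1e-6) -> int:
    """Smallest supervisor pool whose miss probability meets the target."""
    count = 1
    while erlang_b(count, offered_load) > max_miss_prob:
        count += 1
    return count

# Hypothetical workload: 470 vehicles, each requesting a supervised merge
# about once every 10 minutes, each merge taking ~6 s of supervisor time.
# Offered load (in Erlangs) = arrival rate * service time.
vehicles = 470
arrival_rate = vehicles / 600.0          # merge requests per second
service_time = 6.0                       # seconds per supervised merge
load = arrival_rate * service_time       # = 4.7 Erlangs

# Once merges are pooled, far fewer supervisors than vehicles are needed.
print(supervisors_needed(load))
```

    Because merge requests from many vehicles share one supervisor pool, the required number of supervisors grows much more slowly than the number of vehicles, which is the intuition behind the favorable human-to-car ratios the paper reports.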

    When the team modeled a scenario in which 30 percent of cars on the road were cooperative autonomous vehicles, they estimated that a ratio of one human supervisor to every 47 autonomous vehicles could cover 99.9999 percent of merging cases. But this level of coverage drops below 99 percent, an unacceptable range, in scenarios where autonomous vehicles do not cooperate with each other.

    “If vehicles were to coordinate and basically prevent the need for supervision, that’s actually the best way to improve reliability,” Wu says.

    Cruising toward the future

    The team decided to focus on merging not only because it’s a challenge for autonomous vehicles, but also because it’s a well-defined task associated with a less-daunting scenario: driving on the highway. About half of the total miles traveled in the United States occur on interstates and other freeways. Since highways allow higher speeds than city roads, Wu says, “If you can fully automate highway driving … you give people back about a third of their driving time.”

    If it became feasible for autonomous vehicles to cruise unsupervised for most highway driving, the challenge of safely navigating complex or unexpected moments would remain. For instance, “you [would] need to be able to handle the start and end of the highway driving,” Wu says. You would also need to be able to manage times when passengers zone out or fall asleep, making them unable to quickly take over controls should it be needed. But if remote human supervisors could guide autonomous vehicles at key moments, passengers may never have to touch the wheel. Besides merging, other challenging situations on the highway include changing lanes and overtaking slower cars on the road.

    Although remote supervision and coordinated autonomous vehicles are hypotheticals for high-speed operations, and not currently in use, Wu hopes that thinking about these topics can encourage growth in the field.

    “This gives us some more confidence that the autonomous driving experience can happen,” Wu says. “I think we need to be more creative about what we mean by ‘autonomous vehicles.’ We want to give people back their time — safely. We want the benefits; we don’t strictly want something that drives autonomously.”

  •

    Architectural heritage like you haven’t seen it before

    The shrine of Khwaja Abu Nasr Parsa is a spectacular mosque in Balkh, Afghanistan. Also known as the “Green Mosque” due to the brilliant color of its tiled and painted dome, the intricately decorated building dates to the 16th century.

    If it were more accessible, the Green Mosque would attract many visitors. But Balkh is located in northern Afghanistan, roughly 50 miles from the border with Uzbekistan, and few outsiders will ever reach it. Still, anyone can now get a vivid sense of the mosque thanks to MIT’s new “Ways of Seeing” project, an innovative form of historic preservation.

    PhD student Nikolaos Vlavianos created extended reality sequences for the “Ways of Seeing” project.

    “Ways of Seeing” uses multiple modes of imagery to produce a rich visual record of four historic building sites in Afghanistan — including colorful 3D still images, virtual reality imagery that takes viewers around and in some cases inside the structures, and exquisite hand-drawn architectural renderings of the buildings. The project’s imagery will be made available for viewing through the MIT Libraries by the end of June, with open access for the public. A subset of curated project materials will also be available through Archnet, an open access resource on the built environment of Muslim societies, which is a collaboration between the Aga Khan Documentation Center of the MIT Libraries and the Aga Khan Trust for Culture.

    “After the U.S. withdrawal from Afghanistan in August 2021, Associate Provost Richard Lester convened a set of MIT faculty in a working group to think of what we as a community of scholars could be doing that would be meaningful to people in Afghanistan at this point in time,” says Fotini Christia, an MIT political science professor who led the project. “‘Ways of Seeing’ is a project that I conceived after discussions with that group of colleagues and which is truly in the MIT tradition: It combines field data, technology, and art to protect heritage and serve the world.”

    Christia, the Ford International Professor of the Social Sciences and director of the Sociotechnical Systems Research Center at the MIT Schwarzman College of Computing, has worked extensively in Afghanistan conducting field research about civil society. She viewed this project as a unique opportunity to construct a detailed, accessible record of remarkable heritage sites — through sophisticated digital elements as well as finely wrought ink drawings.

    “The idea is these drawings would inspire interest and pride in this heritage, a kind of amazement and motivation to preserve this for as long as humanly possible,” says Jelena Pejkovic MArch ’06, a practicing architect who made the large-scale renderings by hand over a period of months.

    Pejkovic adds: “These drawings are extremely time-consuming, and for me this is part of the motivation. They ask you to slow down and pay attention. What can you take in from all this material that we have collected? How do you take time to look, to interpret, to understand what is in front of you?”

    The project’s “digital transformation strategy” was led by Nikolaos Vlavianos, a PhD candidate in the Department of Architecture’s Design and Computation group. The group uses cutting-edge technologies and drones to make three-dimensional digital reconstructions of large-scale architectural sites and create immersive experiences in extended reality (XR). Vlavianos also conducts studies of the psychological and physiological responses of humans experiencing such spaces in XR and in person. 

    “I regard this project as an effort toward a broader architectural metaverse consisting of immersive experiences in XR of physical spaces around the world that are difficult or impossible to access due to political, social, and even cultural constraints,” says Vlavianos. “These spaces in the metaverse are information hubs promoting an embodied experiential approach of living, sensing, seeing, hearing, and touching.”

    Nasser Rabbat, the Aga Khan Professor and director of the Aga Khan Program for Islamic Architecture at MIT, also offered advice and guidance on the early stages of the project.

    The project — formally titled “Ways of Seeing: Documenting Endangered Built Heritage in Afghanistan” — encompasses imaging of four quite varied historical sites in Afghanistan.

    These are the Green Mosque in Balkh; the Parwan Stupa, a Buddhist dome south of Kabul; the tomb of Gawhar Saad in Herat, built in honor of the queen of the Timurid emperor, herself a highly influential figure in the 14th and 15th centuries; and the Minaret of Jam, a remarkable 200-foot-tall tower dating to the 12th century, next to the Hari River in a remote spot in western Afghanistan.

    The sites thus encompass multiple religions and a diversity of building types. Many are in remote locations within Afghanistan that cannot readily be accessed by visitors — including scholars.

    “Ways of Seeing” is supported by a Mellon Faculty Grant from the MIT Center for Art, Science, and Technology (CAST), and by faculty funding from the MIT School of Humanities, Arts, and Social Sciences (SHASS). It is co-presented with the Institute for Data, Systems, and Society (IDSS), the Sociotechnical Systems Research Center (SSRC) at the MIT Schwarzman College of Computing, the MIT Department of Political Science, and SHASS.

    Two students from Wellesley College participating in MIT’s Undergraduate Research Opportunities Program (UROP), juniors Meng Lu and Muzi Fang, also worked on the project under the guidance of Vlavianos to create a video game for children involving the Gawhar Saad heritage site. 

    To generate the imagery, the MIT team worked with an Afghan digital production team that was on the ground in the country; they went to the four sites and took thousands of pictures, having been trained remotely by Vlavianos to perform aerial 3D scanning operations. They were led by Shafic Gawhari, the managing director for Afghanistan at the Moby Group, an international media enterprise; others involved were Mohammad Jan Kamal, Nazifullah Benaam, Warekzai Ghayoor, Rahm Ali Mohebzada, Mohammad Harif Ghobar, and Abdul Musawer Anwari.

    The journalists documented the sites by collecting 15,000 to 30,000 images, from which Vlavianos computationally generated point clouds and mesh geometry with detailed texture mapping. Those models yielded still images, immersive experiences in XR, and data for Pejkovic’s drawings.

    “‘Ways of Seeing’ proposes a hybrid model of remote data collection,” says Vlavianos, who in his time at MIT has also led similar projects at Machu Picchu in Peru and the Simonos Petra monastery at Mount Athos, Greece. To produce similar imagery even more easily, he says, “The next step — which I am working on — is to utilize autonomous drones deployed simultaneously in various locations around the world for rapid production, and advanced neural network algorithms to generate models from a lower number of images.”

    In the future, Vlavianos envisions documenting and reconstructing other sites around the world using crowdsourcing data, historical images, satellite imagery, or even by having local communities learn XR techniques. 

    Pejkovic produced her drawings based on the digital models assembled by Vlavianos, carefully using a traditional rendering technique in which she would first outline the measurements of each structure, at scale, and then gradually ink in the drawings to give the buildings texture. The inking technique she used is based on VERNADOC, a method of documenting vernacular architecture developed by the Finnish architect Markku Mattila.

    “I wanted to rediscover the most traditional possible kind of documentation — measuring directly by hand, and drawing by hand,” says Pejkovic. She has been active in conservation of cultural heritage for over 10 years.

    The first time Pejkovic saw this type of hand-drawn rendering in person, she recalls thinking, “This is not possible, a human being cannot make drawings like this.” However, she wryly adds, “You know the motto at MIT is ‘mens et manus,’ mind and hand.” And so she embarked on hand-drawing these renderings herself, at a large scale — her image of the Minaret of Jam has been printed in a crisp 8-foot version by the MIT team.

    “The ultimate intent of this project has been to make all these outputs, which are co-owned with the Afghans who carried out the data collection on the ground, available to Afghan refugees displaced around the world but also accessible to anyone keen to witness them,” Christia says. “The digital twins [representations] of these sites are also meant to work as repositories of information for any future preservation efforts. This model can be replicated and scaled for other heritage sites at risk from wars, environmental disaster, or cultural appropriation.”

  •

    Joining the battle against health care bias

    Medical researchers are awash in a tsunami of clinical data. But we need major changes in how we gather, share, and apply this data to bring its benefits to all, says Leo Anthony Celi, principal research scientist at the MIT Laboratory for Computational Physiology (LCP). 

    One key change is to make clinical data of all kinds openly available, with the proper privacy safeguards, says Celi, a practicing intensive care unit (ICU) physician at the Beth Israel Deaconess Medical Center (BIDMC) in Boston. Another key is to fully exploit these open data with multidisciplinary collaborations among clinicians, academic investigators, and industry. A third key is to focus on the varying needs of populations across every country, and to empower the experts there to drive advances in treatment, says Celi, who is also an associate professor at Harvard Medical School. 

    In all of this work, researchers must actively seek to overcome the perennial problem of bias in understanding and applying medical knowledge. This deeply damaging problem is only heightened with the massive onslaught of machine learning and other artificial intelligence technologies. “Computers will pick up all our unconscious, implicit biases when we make decisions,” Celi warns.

    Sharing medical data 

    Founded by the LCP, the MIT Critical Data consortium builds communities across disciplines to leverage the data that are routinely collected in the process of ICU care to understand health and disease better. “We connect people and align incentives,” Celi says. “In order to advance, hospitals need to work with universities, who need to work with industry partners, who need access to clinicians and data.” 

    The consortium’s flagship project is the MIMIC (Medical Information Mart for Intensive Care) ICU database built at BIDMC. With about 35,000 users around the world, the MIMIC cohort is the most widely analyzed in critical care medicine.

    International collaborations such as MIMIC highlight one of the biggest obstacles in health care: most clinical research is performed in rich countries, typically with most clinical trial participants being white males. “The findings of these trials are translated into treatment recommendations for every patient around the world,” says Celi. “We think that this is a major contributor to the sub-optimal outcomes that we see in the treatment of all sorts of diseases in Africa, in Asia, in Latin America.” 

    To fix this problem, “groups who are disproportionately burdened by disease should be setting the research agenda,” Celi says. 

    That’s the rule in the “datathons” (health hackathons) that MIT Critical Data has organized in more than two dozen countries, which apply the latest data science techniques to real-world health data. At the datathons, MIT students and faculty both learn from local experts and share their own skill sets. Many of these several-day events are sponsored by the MIT Industrial Liaison Program, the MIT International Science and Technology Initiatives program, or the MIT Sloan Latin America Office. 

    Datathons are typically held in that country’s national language or dialect, rather than English, with representation from academia, industry, government, and other stakeholders. Doctors, nurses, pharmacists, and social workers join up with computer science, engineering, and humanities students to brainstorm and analyze potential solutions. “They need each other’s expertise to fully leverage and discover and validate the knowledge that is encrypted in the data, and that will be translated into the way they deliver care,” says Celi. 

    “Everywhere we go, there is incredible talent that is completely capable of designing solutions to their health-care problems,” he emphasizes. The datathons aim to further empower the professionals and students in the host countries to drive medical research, innovation, and entrepreneurship.

    Fighting built-in bias 

    Applying machine learning and other advanced data science techniques to medical data reveals that “bias exists in the data in unimaginable ways” in every type of health product, Celi says. Often this bias is rooted in the clinical trials required to approve medical devices and therapies. 

    One dramatic example comes from pulse oximeters, which provide readouts on oxygen levels in a patient’s blood. It turns out that these devices overestimate oxygen levels for people of color. “We have been under-treating individuals of color because the nurses and the doctors have been falsely assured that their patients have adequate oxygenation,” he says. “We think that we have harmed, if not killed, a lot of individuals in the past, especially during Covid, as a result of a technology that was not designed with inclusive test subjects.” 

    Such dangers only increase as the universe of medical data expands. “The data that we have available now for research is maybe two or three orders of magnitude more than what we had even 10 years ago,” Celi says. MIMIC, for example, now includes terabytes of X-ray, echocardiogram, and electrocardiogram data, all linked with related health records. Such enormous sets of data allow investigators to detect health patterns that were previously invisible.

    “But there is a caveat,” Celi says. “It is trivial for computers to learn sensitive attributes that are not very obvious to human experts.” In a study released last year, for instance, he and his colleagues showed that algorithms can tell if a chest X-ray image belongs to a white patient or person of color, even without looking at any other clinical data. 

    “More concerningly, groups including ours have demonstrated that computers can learn easily if you’re rich or poor, just from your imaging alone,” Celi says. “We were able to train a computer to predict if you are on Medicaid, or if you have private insurance, if you feed them with chest X-rays without any abnormality. So again, computers are catching features that are not visible to the human eye.” And these features may lead algorithms to advise against therapies for people who are Black or poor, he says. 

    Opening up industry opportunities 

    Every stakeholder stands to benefit when pharmaceutical firms and other health-care corporations better understand societal needs and can target their treatments appropriately, Celi says. 

    “We need to bring to the table the vendors of electronic health records and the medical device manufacturers, as well as the pharmaceutical companies,” he explains. “They need to be more aware of the disparities in the way that they perform their research. They need to have more investigators representing underrepresented groups of people, to provide that lens to come up with better designs of health products.” 

    Corporations could benefit by sharing results from their clinical trials, and could immediately see these potential benefits by participating in datathons, Celi says. “They could really witness the magic that happens when that data is curated and analyzed by students and clinicians with different backgrounds from different countries. So we’re calling out our partners in the pharmaceutical industry to organize these events with us!”

  • Study: AI models fail to reproduce human judgements about rule violations

    In an effort to improve fairness or reduce backlogs, machine-learning models are sometimes designed to mimic human decision making, such as deciding whether social media posts violate toxic content policies.

    But researchers from MIT and elsewhere have found that these models often do not replicate human decisions about rule violations. If models are not trained with the right data, they are likely to make different, often harsher judgements than humans would.

    In this case, the “right” data are those that have been labeled by humans who were explicitly asked whether items defy a certain rule. Training involves showing a machine-learning model millions of examples of this “normative data” so it can learn a task.

    But data used to train machine-learning models are typically labeled descriptively — meaning humans are asked to identify factual features, such as, say, the presence of fried food in a photo. If “descriptive data” are used to train models that judge rule violations, such as whether a meal violates a school policy that prohibits fried food, the models tend to over-predict rule violations.
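    The repurposing described above can be made concrete with a small sketch. The fried-food policy is the article’s example, but the mapping rule (any prohibited feature present implies a violation) is an assumption for illustration:

```python
# Turning descriptive feature labels into rule-violation judgements.
# The policy mirrors the article's example: no fried food allowed.

PROHIBITED_FEATURES = {"fried_food"}

def judge_from_descriptive(feature_labels):
    """Flag a violation if any prohibited feature was labeled present."""
    return any(feature_labels.get(f, False) for f in PROHIBITED_FEATURES)

# Descriptive labels for two meals: labelers only noted factual features,
# without ever seeing the policy itself.
meals = [
    {"fried_food": True, "dessert": False},
    {"fried_food": False, "dessert": True},
]
violations = [judge_from_descriptive(m) for m in meals]
```

    The study’s finding is that judgements derived this way disagree with what the same humans say when shown the policy directly.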

    This drop in accuracy could have serious implications in the real world. For instance, if a descriptive model is used to make decisions about whether an individual is likely to reoffend, the researchers’ findings suggest it may cast stricter judgements than a human would, which could lead to higher bail amounts or longer criminal sentences.

    “I think most artificial intelligence/machine-learning researchers assume that the human judgements in data and labels are biased, but this result is saying something worse. These models are not even reproducing already-biased human judgments because the data they’re being trained on has a flaw: Humans would label the features of images and text differently if they knew those features would be used for a judgment. This has huge ramifications for machine learning systems in human processes,” says Marzyeh Ghassemi, an assistant professor and head of the Healthy ML Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL).

    Ghassemi is senior author of a new paper detailing these findings, which was published today in Science Advances. Joining her on the paper are lead author Aparna Balagopalan, an electrical engineering and computer science graduate student; David Madras, a graduate student at the University of Toronto; David H. Yang, a former graduate student who is now co-founder of ML Estimation; Dylan Hadfield-Menell, an MIT assistant professor; and Gillian K. Hadfield, Schwartz Reisman Chair in Technology and Society and professor of law at the University of Toronto.

    Labeling discrepancy

    This study grew out of a different project that explored how a machine-learning model can justify its predictions. As they gathered data for that study, the researchers noticed that humans sometimes give different answers if they are asked to provide descriptive or normative labels about the same data.

    To gather descriptive labels, researchers ask labelers to identify factual features — does this text contain obscene language? To gather normative labels, researchers give labelers a rule and ask if the data violates that rule — does this text violate the platform’s explicit language policy?

    Surprised by this finding, the researchers launched a user study to dig deeper. They gathered four datasets to mimic different policies, such as a dataset of dog images that could be in violation of an apartment’s rule against aggressive breeds. Then they asked groups of participants to provide descriptive or normative labels.

    In each case, the descriptive labelers were asked to indicate whether three factual features were present in the image or text, such as whether the dog appears aggressive. Their responses were then used to craft judgements. (If a user said a photo contained an aggressive dog, then the policy was violated.) The labelers did not know the pet policy. On the other hand, normative labelers were given the policy prohibiting aggressive dogs, and then asked whether it had been violated by each image, and why.

    The researchers found that humans were significantly more likely to label an object as a violation in the descriptive setting. The disparity, which they computed as the absolute difference between the average labels, ranged from 8 percent on a dataset of images used to judge dress-code violations to 20 percent for the dog images.
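    The disparity metric as described (the absolute difference between the average label under each scheme, where a violation counts as 1) can be sketched with fabricated labels:

```python
# Disparity between labeling schemes: absolute difference in the average
# label, where violation = 1 and no violation = 0. Labels are fabricated.

def label_disparity(descriptive_labels, normative_labels):
    average = lambda labels: sum(labels) / len(labels)
    return abs(average(descriptive_labels) - average(normative_labels))

# Ten items: descriptive labelers flag 6 violations, normative labelers 4.
descriptive = [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]
normative   = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

disparity = label_disparity(descriptive, normative)  # ~0.2, a 20-point gap
```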

    “While we didn’t explicitly test why this happens, one hypothesis is that maybe how people think about rule violations is different from how they think about descriptive data. Generally, normative decisions are more lenient,” Balagopalan says.

    Yet data are usually gathered with descriptive labels to train a model for a particular machine-learning task. These data are often repurposed later to train different models that perform normative judgements, like rule violations.

    Training troubles

    To study the potential impacts of repurposing descriptive data, the researchers trained two models to judge rule violations using one of their four data settings. They trained one model using descriptive data and the other using normative data, and then compared their performance.

    They found that if descriptive data are used to train a model, it will underperform a model trained to perform the same judgements using normative data. Specifically, the descriptive model is more likely to misclassify inputs by falsely predicting a rule violation. And the descriptive model’s accuracy was even lower when classifying objects that human labelers disagreed about.

    “This shows that the data do really matter. It is important to match the training context to the deployment context if you are training models to detect if a rule has been violated,” Balagopalan says.

    It can be very difficult for users to determine how data have been gathered; this information can be buried in the appendix of a research paper or not revealed by a private company, Ghassemi says.

    Improving dataset transparency is one way this problem could be mitigated. If researchers know how data were gathered, then they know how those data should be used. Another possible strategy is to fine-tune a descriptively trained model on a small amount of normative data. This idea, known as transfer learning, is something the researchers want to explore in future work.
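    The fine-tuning idea can be illustrated with a toy, one-feature logistic model: pre-train on plentiful “descriptive” labels, then continue training on a small “normative” set and watch the decision boundary shift. The data, thresholds, and learning rate here are synthetic assumptions, not the researchers’ implementation:

```python
# Toy transfer-learning sketch: pre-train on descriptive labels, then
# fine-tune on a small normative set. Everything here is synthetic.
import math
import random

def train(w, b, data, epochs, lr=0.1):
    """One-feature logistic regression trained by stochastic gradient descent."""
    for _ in range(epochs):
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

random.seed(0)
# Descriptive labelers flag anything above -0.3; normative labelers only
# flag items above 0.3 (i.e. the descriptive labels are systematically harsher).
descriptive = [(x, int(x > -0.3)) for x in (random.uniform(-2, 2) for _ in range(500))]
normative = [(x, int(x > 0.3)) for x in (random.uniform(-2, 2) for _ in range(30))]

w, b = train(0.0, 0.0, descriptive, epochs=5)   # pre-train descriptively
desc_boundary = -b / w                          # near the harsher threshold
w, b = train(w, b, normative, epochs=50)        # fine-tune on normative data
norm_boundary = -b / w                          # shifts toward the lenient one
```

    After fine-tuning, the boundary moves toward the normative threshold, so the model flags fewer items as violations than the purely descriptive one would.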

    They also want to conduct a similar study with expert labelers, like doctors or lawyers, to see if it leads to the same label disparity.

    “The way to fix this is to transparently acknowledge that if we want to reproduce human judgment, we must only use data that were collected in that setting. Otherwise, we are going to end up with systems that are going to have extremely harsh moderations, much harsher than what humans would do. Humans would see nuance or make another distinction, whereas these models don’t,” Ghassemi says.

    This research was funded, in part, by the Schwartz Reisman Institute for Technology and Society, Microsoft Research, the Vector Institute, and a Canada Research Chair.

  • Minimizing electric vehicles’ impact on the grid

    National and global plans to combat climate change include increasing the electrification of vehicles and the percentage of electricity generated from renewable sources. But some projections show that these trends might require costly new power plants to meet peak loads in the evening when cars are plugged in after the workday. What’s more, overproduction of power from solar farms during the daytime can waste valuable electricity-generation capacity.

    In a new study, MIT researchers have found that it’s possible to mitigate or eliminate both these problems without the need for advanced technological systems of connected devices and real-time communications, which could add to costs and energy consumption. Instead, encouraging the strategic placement of charging stations for electric vehicles (EVs), rather than letting them spring up anywhere, and setting up systems that delay the start of charging could potentially make all the difference.

    The study, published today in the journal Cell Reports Physical Science, is by Zachary Needell PhD ’22, postdoc Wei Wei, and Professor Jessika Trancik of MIT’s Institute for Data, Systems, and Society.

    In their analysis, the researchers used data collected in two sample cities: New York and Dallas. The data were gathered from, among other sources, anonymized records collected via onboard devices in vehicles, and surveys that carefully sampled populations to cover variable travel behaviors. They showed the times of day cars are used and for how long, and how much time the vehicles spend at different kinds of locations — residential, workplace, shopping, entertainment, and so on.

    The findings, Trancik says, “round out the picture on the question of where to strategically locate chargers to support EV adoption and also support the power grid.”

    Better availability of charging stations at workplaces, for example, could help to soak up peak power being produced at midday from solar power installations, which might otherwise go to waste because it is not economical to build enough battery or other storage capacity to save all of it for later in the day. Thus, workplace chargers can provide a double benefit, helping to reduce the evening peak load from EV charging and also making use of the solar electricity output.

    These effects on the electric power system are considerable, especially if the system must meet charging demands for a fully electrified personal vehicle fleet alongside the peaks in other demand for electricity, for example on the hottest days of the year. If unmitigated, the evening peaks in EV charging demand could require installing upwards of 20 percent more power-generation capacity, the researchers say.

    “Slow workplace charging can be more preferable than faster charging technologies for enabling a higher utilization of midday solar resources,” Wei says.

    Meanwhile, with delayed home charging, each EV charger could be accompanied by a simple app to estimate the time to begin its charging cycle so that it charges just before it is needed the next day. Unlike other proposals that require a centralized control of the charging cycle, such a system needs no interdevice communication of information and can be preprogrammed — and can accomplish a major shift in the demand on the grid caused by increasing EV penetration. The reason it works so well, Trancik says, is because of the natural variability in driving behaviors across individuals in a population.
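    A minimal sketch of such a preprogrammed delay, with an illustrative charger rating and energy need (the function, variable names, and numbers are assumptions, not from the study):

```python
# Delayed home charging: start as late as possible while still finishing
# before the next departure. Times are hours on a continuous axis, so 7:00
# a.m. the next day is 31.0. All parameters are illustrative.

def delayed_start(plug_in, departure, needed_kwh, charger_kw):
    """Latest start time that completes the charge before departure."""
    hours_needed = needed_kwh / charger_kw
    return max(plug_in, departure - hours_needed)

# Plugged in at 18:00, leaving at 7:00 the next day, needing 30 kWh at 7.4 kW.
start = delayed_start(plug_in=18.0, departure=31.0, needed_kwh=30.0, charger_kw=7.4)
# Charging begins in the early morning instead of 6 p.m., off the evening peak.
```

    Because drivers’ departure times and energy needs vary naturally across a population, these individually computed start times spread demand across the night without any interdevice communication.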

    By “home charging,” the researchers aren’t only referring to charging equipment in individual garages or parking areas. They say it’s essential to make charging stations available in on-street parking locations and in apartment building parking areas as well.

    Trancik says the findings highlight the value of combining the two measures — workplace charging and delayed home charging — to reduce peak electricity demand, store solar energy, and conveniently meet drivers’ charging needs on all days. As the team showed in earlier research, home charging can be a particularly effective component of a strategic package of charging locations; workplace charging, they have found, is not a good substitute for home charging for meeting drivers’ needs on all days.

    “Given that there’s a lot of public money going into expanding charging infrastructure,” Trancik says, “how do you incentivize the location such that this is going to be efficiently and effectively integrated into the power grid without requiring a lot of additional capacity expansion?” This research offers some guidance to policymakers on where to focus rules and incentives.

    “I think one of the fascinating things about these findings is that by being strategic you can avoid a lot of physical infrastructure that you would otherwise need,” she adds. “Your electric vehicles can displace some of the need for stationary energy storage, and you can also avoid the need to expand the capacity of power plants, by thinking about the location of chargers as a tool for managing demands — where they occur and when they occur.”

    Delayed home charging could make a surprising amount of difference, the team found. “It’s basically incentivizing people to begin charging later. This can be something that is preprogrammed into your chargers. You incentivize people to delay the onset of charging by a bit, so that not everyone is charging at the same time, and that smooths out the peak.”

    Such a program would require some advance commitment on the part of participants. “You would need to have enough people committing to this program in advance to avoid the investment in physical infrastructure,” Trancik says. “So, if you have enough people signing up, then you essentially don’t have to build those extra power plants.”

    It’s not a given that all of this would line up just right, and putting in place the right mix of incentives would be crucial. “If you want electric vehicles to act as an effective storage technology for solar energy, then the [EV] market needs to grow fast enough in order to be able to do that,” Trancik says.

    To best use public funds to help make that happen, she says, “you can incentivize charging installations, which would go through ideally a competitive process — in the private sector, you would have companies bidding for different projects, but you can incentivize installing charging at workplaces, for example, to tap into both of these benefits.” Chargers people can access when they are parked near their residences are also important, Trancik adds, but for other reasons. Home charging is one of the ways to meet charging needs while avoiding inconvenient disruptions to people’s travel activities.

    The study was supported by the European Regional Development Fund Operational Program for Competitiveness and Internationalization, the Lisbon Portugal Regional Operation Program, and the Portuguese Foundation for Science and Technology.

  • Report: CHIPS Act just the first step in addressing threats to US leadership in advanced computing

    When Liu He, a Chinese economist, politician, and “chip czar,” was tapped to lead the charge in a chipmaking arms race with the United States, his message lingered in the air, leaving behind a dewy glaze of tension: “For our country, technology is not just for growth… it is a matter of survival.”

    Once upon a time, the United States’ early technological prowess positioned the nation to outpace foreign rivals and cultivate a competitive advantage for domestic businesses. Yet, 30 years later, America’s lead in advanced computing is continuing to wane. What happened?

    A new report from an MIT researcher and two colleagues sheds light on the decline in U.S. leadership. The researchers examined the decline through four high-level measures: overall capabilities, supercomputers, applied algorithms, and semiconductor manufacturing. Through their analysis, they found that not only has China closed the computing gap with the U.S., but nearly 80 percent of American leaders in the field believe that their Chinese competitors are improving capabilities faster — which, the team says, suggests a “broad threat to U.S. competitiveness.”

    To delve deeply into the fray, the scientists conducted the Advanced Computing Users Survey, sampling 120 top-tier organizations, including universities, national labs, federal agencies, and industry. The team estimates that this group comprises between one-third and one-half of all the most significant computing users in the United States.

    “Advanced computing is crucial to scientific improvement, economic growth and the competitiveness of U.S. companies,” says Neil Thompson, director of the FutureTech Research Project at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), who helped lead the study.

    Thompson, who is also a principal investigator at MIT’s Initiative on the Digital Economy, wrote the paper with Chad Evans, executive vice president and secretary and treasurer to the board at the Council on Competitiveness, and Daniel Armbrust, who is the co-founder, initial CEO, and member of the board of directors at Silicon Catalyst and former president of SEMATECH, the semiconductor consortium that developed industry roadmaps.

    The semiconductor, supercomputer, and algorithm bonanza

    Supercomputers — the room-sized, “giant calculators” of the hardware world — are an industry no longer dominated by the United States. Through 2015, about half of the most powerful computers were sitting firmly in the U.S., while China was growing slowly from a very low base. But in the past six years, China has swiftly caught up, reaching near parity with America.

    This disappearing lead matters. Eighty-four percent of U.S. survey respondents said they’re computationally constrained in running essential programs. “This result was telling, given who our respondents are: the vanguard of American research enterprises and academic institutions with privileged access to advanced national supercomputing resources,” says Thompson. 

    With regard to advanced algorithms, the U.S. has historically fronted the charge, with two-thirds of all significant improvements coming from U.S.-born inventors. But in recent decades, U.S. dominance in algorithms has relied on bringing in foreign talent to work in the U.S., which the researchers say is now in jeopardy. China has outpaced the U.S. and many other countries in churning out PhDs in STEM fields since 2007, with one report projecting that by 2025 China will be home to nearly twice as many STEM PhDs as the U.S. China’s rise in algorithms can also be seen in the Gordon Bell Prize, an award for outstanding work in harnessing the power of supercomputers in varied applications. U.S. winners historically dominated the prize, but China has equaled or surpassed Americans’ performance in the past five years.

    While the researchers note the CHIPS and Science Act of 2022 is a critical step in re-establishing the foundation of success for advanced computing, they propose recommendations to the U.S. Office of Science and Technology Policy. 

    First, they suggest democratizing access to U.S. supercomputing by building more mid-tier systems that push boundaries for many users, as well as building tools so that users scaling up their computations need less up-front resource investment. They also recommend increasing the pool of innovators by funding the training of many more electrical engineers and computer scientists, paired with longer-term U.S. residency incentives and scholarships. Finally, in addition to this new framework, the scientists urge taking advantage of what already exists, by providing the private sector access to experimentation with high-performance computing through supercomputing sites in academia and national labs.

    All that and a bag of chips

    Computing improvements depend on continuous advances in transistor density and performance, but creating robust, new chips necessitates a harmonious blend of design and manufacturing.

    Historically, China was not known for designing noteworthy chips; in fact, over the past five decades, the U.S. designed most of them. But this changed in the past six years, when China created the HiSilicon Kirin 9000, propelling itself to the international frontier. This success was mainly obtained through partnerships with leading global chip designers that began in the 2000s. China now has 14 companies among the world’s top 50 fabless designers. A decade ago, there was only one.

    Competitive semiconductor manufacturing has been more mixed: U.S.-led policies and internal execution issues have slowed China’s rise, but as of July 2022, the Semiconductor Manufacturing International Corporation (SMIC) has shown evidence of 7-nanometer logic, which was not expected until much later. However, with extreme ultraviolet export restrictions in place, progress below 7 nm would require expensive domestic technology development. Currently, China is at parity or better in only two of 12 segments of the semiconductor supply chain. Still, with government policy and investments, the team expects a whopping increase to seven segments within 10 years. So, for the moment, the U.S. retains leadership in hardware manufacturing, but with fewer dimensions of advantage.

    The authors recommend that the White House Office of Science and Technology Policy work with key national agencies, such as the U.S. Department of Defense, U.S. Department of Energy, and the National Science Foundation, to define initiatives to build the hardware and software systems needed for important computing paradigms and workloads critical for economic and security goals. “It is crucial that American enterprises can get the benefit of faster computers,” says Thompson. “With Moore’s Law slowing down, the best way to do this is to create a portfolio of specialized chips (or “accelerators”) that are customized to our needs.”

    The scientists further believe that to lead the next generation of computing, four areas must be addressed. First, by issuing grand challenges to the CHIPS Act National Semiconductor Technology Center, researchers and startups would be motivated to invest in research and development and to seek startup capital for new technologies in areas such as spintronics, neuromorphics, optical and quantum computing, and optical interconnect fabrics. Second, by supporting allies in passing similar acts, overall investment in these technologies would increase, and supply chains would become more aligned and secure. Third, establishing test beds for researchers to test algorithms on new computing architectures and hardware would provide an essential platform for innovation and discovery. Finally, planning for post-exascale systems that achieve higher levels of performance through next-generation advances would ensure that current commercial technologies don’t limit future computing systems.

    “The advanced computing landscape is in rapid flux — technologically, economically, and politically, with both new opportunities for innovation and rising global rivalries,” says Daniel Reed, Presidential Professor and professor of computer science and electrical and computer engineering at the University of Utah. “The transformational insights from both deep learning and computational modeling depend on both continued semiconductor advances and their instantiation in leading edge, large-scale computing systems — hyperscale clouds and high-performance computing systems. Although the U.S. has historically led the world in both advanced semiconductors and high-performance computing, other nations have recognized that these capabilities are integral to 21st century economic competitiveness and national security, and they are investing heavily.”

    The research was funded, in part, through Thompson’s grant from Good Ventures, which supports his FutureTech Research Group. The paper is being published by the Georgetown Public Policy Review.

  • 3 Questions: Leo Anthony Celi on ChatGPT and medicine

    Launched in November 2022, ChatGPT is a chatbot that can not only engage in human-like conversation, but also provide accurate answers to questions in a wide range of knowledge domains. The chatbot, created by the firm OpenAI, is based on a family of “large language models” — algorithms that can recognize, predict, and generate text based on patterns they identify in datasets containing hundreds of millions of words.

    In a study appearing in PLOS Digital Health this week, researchers report that ChatGPT performed at or near the passing threshold of the U.S. Medical Licensing Exam (USMLE) — a comprehensive, three-part exam that doctors must pass before practicing medicine in the United States. In an editorial accompanying the paper, Leo Anthony Celi, a principal research scientist at MIT’s Institute for Medical Engineering and Science, a practicing physician at Beth Israel Deaconess Medical Center, and an associate professor at Harvard Medical School, and his co-authors argue that ChatGPT’s success on this exam should be a wake-up call for the medical community.

    Q: What do you think the success of ChatGPT on the USMLE reveals about the nature of medical education and the evaluation of students?

    A: The framing of medical knowledge as something that can be encapsulated into multiple choice questions creates a cognitive framing of false certainty. Medical knowledge is often taught as fixed model representations of health and disease. Treatment effects are presented as stable over time despite constantly changing practice patterns. Mechanistic models are passed on from teachers to students with little emphasis on how robustly those models were derived, the uncertainties that persist around them, and how they must be recalibrated to reflect advances worthy of incorporation into practice. 

    ChatGPT passed an examination that rewards memorizing the components of a system rather than analyzing how it works, how it fails, how it was created, and how it is maintained. Its success demonstrates some of the shortcomings in how we train and evaluate medical students. Critical thinking requires an appreciation that ground truths in medicine continually shift, and, more importantly, an understanding of how and why they shift.

    Q: What steps do you think the medical community should take to modify how students are taught and evaluated?  

    A: Learning is about leveraging the current body of knowledge, understanding its gaps, and seeking to fill those gaps. It requires being comfortable with and being able to probe the uncertainties. We fail as teachers by not teaching students how to understand the gaps in the current body of knowledge. We fail them when we preach certainty over curiosity, and hubris over humility.  

    Medical education also requires being aware of the biases in the way medical knowledge is created and validated. These biases are best addressed by optimizing the cognitive diversity within the community. More than ever, there is a need to inspire cross-disciplinary collaborative learning and problem-solving. Medical students need data science skills that will allow every clinician to contribute to, continually assess, and recalibrate medical knowledge.

    Q: Do you see any upside to ChatGPT’s success in this exam? Are there beneficial ways that ChatGPT and other forms of AI can contribute to the practice of medicine? 

    A: There is no question that large language models (LLMs) such as ChatGPT are very powerful tools in sifting through content beyond the capabilities of experts, or even groups of experts, and extracting knowledge. However, we will need to address the problem of data bias before we can leverage LLMs and other artificial intelligence technologies. The body of knowledge that LLMs train on, both medical and beyond, is dominated by content and research from well-funded institutions in high-income countries. It is not representative of most of the world.

    We have also learned that even mechanistic models of health and disease may be biased. These inputs are fed to encoders and transformers that are oblivious to these biases. Ground truths in medicine are continuously shifting, and currently, there is no way to determine when ground truths have drifted. LLMs do not evaluate the quality and the bias of the content they are being trained on. Neither do they provide the level of uncertainty around their output. But the perfect should not be the enemy of the good. There is tremendous opportunity to improve the way health care providers currently make clinical decisions, which we know are tainted with unconscious bias. I have no doubt AI will deliver its promise once we have optimized the data input.