More stories

  • Theoretical breakthrough could boost data storage

    A trio of researchers that includes William Kuszmaul — a computer science PhD student at MIT — has made a discovery that could lead to more efficient data storage and retrieval in computers.

    The team’s findings relate to so-called “linear-probing hash tables,” which were introduced in 1954 and are among the oldest, simplest, and fastest data structures available today. Data structures provide ways of organizing and storing data in computers, with hash tables being one of the most commonly utilized approaches. In a linear-probing hash table, the positions in which information can be stored lie along a linear array.

    Suppose, for instance, that a database is designed to store the Social Security numbers of 10,000 people, Kuszmaul suggests. “We take your Social Security number, x, and we’ll then compute the hash function of x, h(x), which gives you a random number between one and 10,000.” The next step is to take that random number, h(x), go to that position in the array, and put x, the Social Security number, into that spot.

    If there’s already something occupying that spot, Kuszmaul says, “you just move forward to the next free position and put it there. This is where the term ‘linear probing’ comes from, as you keep moving forward linearly until you find an open spot.” In order to later retrieve that Social Security number, x, you just go to the designated spot, h(x), and if it’s not there, you move forward until you either find x or come to a free position and conclude that x is not in your database.

    There’s a somewhat different protocol for deleting an item, such as a Social Security number. If you just left an empty spot in the hash table after deleting the information, that could cause confusion when you later tried to find something else, as the vacant spot might erroneously suggest that the item you’re looking for is nowhere to be found in the database. To avoid that problem, Kuszmaul explains, “you can go to the spot where the element was removed and put a little marker there called a ‘tombstone,’ which indicates there used to be an element here, but it’s gone now.”
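    The insert, search, and delete protocol described above can be sketched in a few lines. This is a minimal illustration with our own naming and sizing choices, not the researchers' code, and it assumes the table never becomes completely full:

```python
FREE = None
TOMBSTONE = object()  # marker: "an element used to be here, but it's gone now"

class LinearProbingTable:
    """A minimal linear-probing hash table: probe forward from h(x)."""

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self.slots = [FREE] * capacity

    def _h(self, x):
        # Stand-in for the hash function h(x) described in the article.
        return hash(x) % self.capacity

    def insert(self, x):
        i = self._h(x)
        # Keep moving forward linearly until a free spot (or tombstone) appears.
        while self.slots[i] not in (FREE, TOMBSTONE):
            if self.slots[i] == x:
                return  # already present
            i = (i + 1) % self.capacity
        self.slots[i] = x

    def contains(self, x):
        i = self._h(x)
        # A tombstone does NOT stop the scan; only a truly free slot does.
        while self.slots[i] is not FREE:
            if self.slots[i] == x:
                return True
            i = (i + 1) % self.capacity
        return False  # hit a free slot: x is not in the table

    def delete(self, x):
        i = self._h(x)
        while self.slots[i] is not FREE:
            if self.slots[i] == x:
                self.slots[i] = TOMBSTONE  # leave a marker, not a hole
                return
            i = (i + 1) % self.capacity
```

    The tombstone is what keeps lookups honest: if `delete` simply cleared the slot back to `FREE`, a later search for an element stored further along the same probe chain would stop at the hole and wrongly conclude the element is absent.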

    This general procedure has been followed for more than half a century. But in all that time, almost everyone using linear-probing hash tables has assumed that if you allow them to get too full, long stretches of occupied spots will run together to form “clusters.” As a result, the time it takes to find a free spot goes up dramatically — quadratically, in fact — taking so long as to be impractical. Consequently, people have been trained to operate hash tables at low capacity — a practice that can exact an economic toll by affecting the amount of hardware a company has to purchase and maintain.

    But this time-honored principle, which has long militated against high load factors, has been totally upended by the work of Kuszmaul and his colleagues, Michael Bender of Stony Brook University and Bradley Kuszmaul of Google. They found that for applications where the number of insertions and deletions stays about the same — and the amount of data added is roughly equal to that removed — linear-probing hash tables can operate at high storage capacities without sacrificing speed.

    In addition, the team has devised a new strategy, called “graveyard hashing,” which involves artificially increasing the number of tombstones placed in an array until they occupy about half the free spots. These tombstones then reserve spaces that can be used for future insertions.
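    The paper's actual algorithm is more involved, but the rebuilding idea can be illustrated loosely: periodically rewrite the table so that tombstones cover roughly half of the free slots, reserving evenly spaced landing spots for future insertions. The sketch below is our simplified reading, not the authors' procedure:

```python
FREE = None
TOMBSTONE = "RIP"

def graveyard_rebuild(slots):
    """Return a copy of `slots` in which about half the free slots
    have been turned into tombstones, spread evenly across the array."""
    rebuilt = list(slots)
    free_positions = [i for i, s in enumerate(rebuilt) if s is FREE]
    # Convert every second free position into a tombstone (~half of them).
    for i in free_positions[::2]:
        rebuilt[i] = TOMBSTONE
    return rebuilt
```

    Because insertions are allowed to land on tombstones, each artificially placed marker acts as a pre-reserved slot, which is how the scheme keeps probe sequences short even when the table runs near capacity.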

    This approach, which runs contrary to what people have customarily been instructed to do, Kuszmaul says, “can lead to optimal performance in linear-probing hash tables.” Or, as he and his coauthors maintain in their paper, the “well-designed use of tombstones can completely change the … landscape of how linear probing behaves.”

    The three researchers describe these findings in a paper posted earlier this year that will be presented in February at the Foundations of Computer Science (FOCS) Symposium in Boulder, Colorado.

    Kuszmaul’s PhD thesis advisor, MIT computer science professor Charles E. Leiserson (who did not participate in this research), agrees with that assessment. “These new and surprising results overturn one of the oldest conventional wisdoms about hash table behavior,” Leiserson says. “The lessons will reverberate for years among theoreticians and practitioners alike.”

    As for translating their results into practice, Kuszmaul notes, “there are many considerations that go into building a hash table. Although we’ve advanced the story considerably from a theoretical standpoint, we’re just starting to explore the experimental side of things.”

  • Studying learner engagement during the Covid-19 pandemic

    While massive open online classes (MOOCs) have been a significant trend in higher education for many years now, they have gained a new level of attention during the Covid-19 pandemic. Open online courses became a critical resource for a wide audience of new learners during the first stages of the pandemic — including students whose academic programs had shifted online, teachers seeking online resources, and individuals suddenly facing lockdown or unemployment and looking to build new skills.

    Mary Ellen Wiltrout, director of online and blended learning initiatives and lecturer in digital learning in the Department of Biology, and Virginia “Katie” Blackwell, currently an MIT PhD student in biology, published a paper this summer in the European MOOC Stakeholder Summit (EMOOCs 2021) conference proceedings evaluating data for the online course 7.00x (Introduction to Biology). Their research objective was to better understand whether the shift to online learning during the pandemic led to increased learner engagement in the course. Blackwell participated in this research as part of the Bernard S. and Sophie G. Gould MIT Summer Research Program (MSRP) in Biology, as a member of the uniquely remote MSRPx-Biology 2020 student cohort. She worked on the project remotely from Texas while completing her bachelor’s degree in biochemistry and molecular biology at the University of Texas at Dallas, and has since applied and been accepted into MIT’s PhD program in biology.

    “MSRP Biology was a transformative experience for me. I learned a lot about the nature of research and the MIT community in a very short period of time and loved every second of the program. Without MSRP, I would never have even considered applying to MIT for my PhD. After MSRP and working with Mary Ellen, MIT biology became my first-choice program and I felt like I had a shot at getting in,” says Blackwell.

    Many MOOC platforms experienced increased website traffic in 2020, with 30 new MOOC-based degrees and more than 60 million new learners.

    “We find that the tremendous, lifelong learning opportunities that MOOCs provide are even more important and sought-after when traditional education is disrupted. During the pandemic, people had to be at home more often, and some faced unemployment requiring a career transition,” says Wiltrout.

    Wiltrout and Blackwell wanted to build a deeper understanding of learner profiles rather than looking exclusively at enrollments. They looked at all available data, including: enrollment demographics (i.e., country and “.edu” participants); proportion of learners engaged with videos, problems, and forums; number of individual engagement events with videos, problems, and forums; verification and performance; and the course “track” level — including auditing (for free) and verified (paying and receiving access to additional course content, including access to a comprehensive competency exam). They analyzed data in these areas from five runs of 7.00x in this study, including three pre-pandemic runs of April, July, and November 2019 and two pandemic runs of March and July 2020. 

    The March 2020 run had the same count of verified-track participants as all three pre-pandemic runs combined. The July 2020 run enrolled nearly as many verified-track participants as the March 2020 run. Wiltrout says that introductory biology content may have attracted great attention during the early days and months of the Covid-19 pandemic, as people may have had a new (or renewed) interest in learning about (or reviewing) viruses, RNA, the inner workings of cells, and more.

    Wiltrout and Blackwell found that the enrollment count for the March 2020 run of the course increased at almost triple the rate of the three pre-pandemic runs. During the early days of March 2020, the enrollment metrics appeared similar to enrollment metrics for the April 2019 run — both in rate and count — but the enrollment rate increased sharply around March 15, 2020. The July 2020 run began with more than twice as many learners already enrolled by the first day of the course, but continued with half the enrollment rate of the March 2020 course. In terms of learner demographics, during the pandemic, there was a higher proportion of learners with .edu addresses, indicating that MOOCs were often used by students enrolled in other schools. 

    Viewings of course videos increased at the beginning of the pandemic. During the March 2020 run, both verified-track and certified participants viewed far more unique videos than in the pre-pandemic runs of the course; even auditor-track learners — not aiming for certification — still viewed all videos offered. During the July 2020 run, however, both verified-track and certified participants viewed far fewer unique videos than during all prior runs. The proportion of participants who viewed at least one video decreased in the July 2020 run to 53 percent, from a mean of 64 percent in prior runs. Blackwell and Wiltrout say that this decrease — as well as the overall dip in participation in July 2020 — might be attributed to shifting circumstances for learners that allowed for less time to watch videos and participate in the course, as well as some fatigue from the extra screen time.

    The study found that 4.4 percent of March 2020 participants and 4.5 percent of July 2020 participants engaged through forum posting — which was 1.4 to 3.3 times higher than pre-pandemic proportions of forum posting. The increase in forum engagement may point to a desire for community engagement during a time when many were isolated and sheltering in place.

    “Through the day-to-day work of my research team and also through the engagement of the learners in 7.00x, we can see that there is great potential for meaningful connections in remote experiences,” says Wiltrout. “An increase in participation for an online course may not always remain at the same high level, in the long term, but overall, we’re continuing to see an increase in the number of MOOCs and other online programs offered by all universities and institutions, as well as an increase in online learners.”

  • MIT collaborates with Biogen on three-year, $7 million initiative to address climate, health, and equity

    MIT and Biogen have announced that they will collaborate with the goal of accelerating science and action on climate change to improve human health. This collaboration is supported by a three-year, $7 million commitment from the company and the Biogen Foundation. The biotechnology company, headquartered in Cambridge, Massachusetts’ Kendall Square, discovers and develops therapies for people living with serious neurological diseases.

    “We have long believed it is imperative for Biogen to make the fight against climate change central to our long-term corporate responsibility commitments. Through this collaboration with MIT, we aim to identify and share innovative climate solutions that will deliver co-benefits for both health and equity,” says Michel Vounatsos, CEO of Biogen. “We are also proud to support the MIT Museum, which promises to make world-class science and education accessible to all, and honor Biogen co-founder Phillip A. Sharp with a dedication inside the museum that recognizes his contributions to its development.”

    Biogen and the Biogen Foundation are supporting research and programs across a range of areas at MIT.

    Advancing climate, health, and equity

    The first such effort involves new work within the MIT Joint Program on the Science and Policy of Global Change to establish a state-of-the-art integrated model of climate and health aimed at identifying targets that deliver climate and health co-benefits.

    “Evidence suggests that not all climate-related actions deliver equal health benefits, yet policymakers, planners, and stakeholders traditionally lack the tools to consider how decisions in one arena impact the other,” says C. Adam Schlosser, deputy director of the MIT Joint Program. “Biogen’s collaboration with the MIT Joint Program — and its support of a new distinguished Biogen Fellow who will develop the new climate/health model — will accelerate our efforts to provide decision-makers with these tools.”

    Biogen is also supporting the MIT Technology and Policy Program’s Research to Policy Engagement Initiative to infuse human health as a key new consideration in decision-making on the best pathways forward to address the global climate crisis, and bridge the knowledge-to-action gap by connecting policymakers, researchers, and diverse stakeholders. As part of this work, Biogen is underwriting a distinguished Biogen Fellow to advance new research on climate, health, and equity.

    “Our work with Biogen has allowed us to make progress on key questions that matter to human health and well-being under climate change,” says Noelle Eckley Selin, who directs the MIT Technology and Policy Program and is a professor in the MIT Institute for Data, Systems, and Society and the Department of Earth, Atmospheric and Planetary Sciences. “Further, their support of the Research to Policy Engagement Initiative helps all of our research become more effective in making change.”

    In addition, Biogen has joined 13 other companies in the MIT Climate and Sustainability Consortium (MCSC), which is supporting faculty and student research and developing impact pathways that present a range of actionable steps that companies can take — within and across industries — to advance progress toward climate targets.

    “Biogen joining the MIT Climate and Sustainability Consortium represents our commitment to working with member companies across a diverse range of industries, an approach that aims to drive changes swift and broad enough to match the scale of the climate challenge,” says Jeremy Gregory, executive director of the MCSC. “We are excited to welcome a member from the biotechnology space and look forward to harnessing Biogen’s perspectives as we continue to collaborate and work together with the MIT community in exciting and meaningful ways.”

    Making world-class science and education available to MIT Museum visitors

    Support from Biogen will honor Nobel laureate, MIT Institute professor, and Biogen co-founder Phillip A. Sharp with a named space inside the new Kendall Square location of the MIT Museum, set to open in spring 2022. Biogen also is supporting one of the museum’s opening exhibitions, “Essential MIT,” with a section focused on solving real-world problems such as climate change. It is also providing programmatic support for the museum’s Life Sciences Maker Engagement Program.

    “Phil has provided fantastic support to the MIT Museum for more than a decade as an advisory board member and now as board chair, and he has been deeply involved in plans for the new museum at Kendall Square,” says John Durant, the Mark R. Epstein (Class of 1963) Director of the museum. “Seeing his name on the wall will be a constant reminder of his key role in this development, as well as a mark of our gratitude.”

    Inspiring and empowering the next generation of scientists

    Biogen funding is also being directed to engage the next generation of scientists through support for the Biogen-MIT Biotech in Action: Virtual Lab, a program designed to foster a love of science among diverse and under-served student populations.

    Biogen’s support is part of its Healthy Climate, Healthy Lives initiative, a $250 million, 20-year commitment to eliminate fossil fuels across its operations and collaborate with renowned institutions to advance the science of climate and health and support under-served communities. Additional support is provided by the Biogen Foundation to further its long-standing focus on providing students with equitable access to outstanding science education.

  • Avoiding shortcut solutions in artificial intelligence

    If your Uber driver takes a shortcut, you might get to your destination faster. But if a machine learning model takes a shortcut, it might fail in unexpected ways.

    In machine learning, a shortcut solution occurs when the model relies on a simple characteristic of a dataset to make a decision, rather than learning the true essence of the data, which can lead to inaccurate predictions. For example, a model might learn to identify images of cows by focusing on the green grass that appears in the photos, rather than the more complex shapes and patterns of the cows.  

    A new study by researchers at MIT explores the problem of shortcuts in a popular machine-learning method and proposes a solution that can prevent shortcuts by forcing the model to use more data in its decision-making.

    By removing the simpler characteristics the model is focusing on, the researchers force it to focus on more complex features of the data that it hadn’t been considering. Then, by asking the model to solve the same task two ways — once using those simpler features, and then also using the complex features it has now learned to identify — they reduce the tendency for shortcut solutions and boost the performance of the model.

    One potential application of this work is to enhance the effectiveness of machine learning models that are used to identify disease in medical images. Shortcut solutions in this context could lead to false diagnoses and have dangerous implications for patients.

    “It is still difficult to tell why deep networks make the decisions that they do, and in particular, which parts of the data these networks choose to focus upon when making a decision. If we can understand how shortcuts work in further detail, we can go even farther to answer some of the fundamental but very practical questions that are really important to people who are trying to deploy these networks,” says Joshua Robinson, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author of the paper.

    Robinson wrote the paper with his advisors, senior author Suvrit Sra, the Esther and Harold E. Edgerton Career Development Associate Professor in the Department of Electrical Engineering and Computer Science (EECS) and a core member of the Institute for Data, Systems, and Society (IDSS) and the Laboratory for Information and Decision Systems; and Stefanie Jegelka, the X-Consortium Career Development Associate Professor in EECS and a member of CSAIL and IDSS; as well as University of Pittsburgh assistant professor Kayhan Batmanghelich and PhD students Li Sun and Ke Yu. The research will be presented at the Conference on Neural Information Processing Systems in December. 

    The long road to understanding shortcuts

    The researchers focused their study on contrastive learning, which is a powerful form of self-supervised machine learning. In self-supervised machine learning, a model is trained using raw data that do not have label descriptions from humans. It can therefore be used successfully for a larger variety of data.

    A self-supervised learning model learns useful representations of data, which are used as inputs for different tasks, like image classification. But if the model takes shortcuts and fails to capture important information, these tasks won’t be able to use that information either.

    For example, if a self-supervised learning model is trained to classify pneumonia in X-rays from a number of hospitals, but it learns to make predictions based on a tag that identifies the hospital the scan came from (because some hospitals have more pneumonia cases than others), the model won’t perform well when it is given data from a new hospital.     
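    The hospital-tag failure mode can be made concrete with a toy example. This construction is our own, not from the study: a learner that keys on whichever single feature best predicts the training labels will latch onto the hospital tag when the tag happens to correlate with pneumonia rates, and then fall to chance-level accuracy on data from a new hospital.

```python
def best_single_feature(X, y):
    """Pick the feature index whose value agrees with the label most often."""
    n_features = len(X[0])
    return max(range(n_features),
               key=lambda j: sum(x[j] == label for x, label in zip(X, y)))

def accuracy(X, y, j):
    """Fraction of examples where feature j's value matches the label."""
    return sum(x[j] == label for x, label in zip(X, y)) / len(y)

# Features: [lung_opacity, hospital_tag]. In training, the tag predicts the
# label perfectly (hospital 1 happens to see only pneumonia cases), while
# the genuinely informative opacity feature is merely good (4 of 6).
train_X = [[1, 1], [1, 1], [0, 1], [0, 0], [0, 0], [1, 0]]
train_y = [1, 1, 1, 0, 0, 0]

# Test data from a new hospital: every scan carries tag 0.
test_X = [[1, 0], [1, 0], [0, 0], [0, 0]]
test_y = [1, 1, 0, 0]

shortcut = best_single_feature(train_X, train_y)
print(shortcut)                            # -> 1: the hospital tag is chosen
print(accuracy(test_X, test_y, shortcut))  # -> 0.5, chance level
print(accuracy(test_X, test_y, 0))         # -> 1.0, opacity generalizes
```

    The shortcut is rational given the training data — the tag really is the best predictor there — which is why such failures only surface once the spurious correlation breaks.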

    For contrastive learning models, an encoder algorithm is trained to discriminate between pairs of similar inputs and pairs of dissimilar inputs. This process encodes rich and complex data, like images, in a way that the contrastive learning model can interpret.

    The researchers tested contrastive learning encoders with a series of images and found that, during this training procedure, they also fall prey to shortcut solutions. The encoders tend to focus on the simplest features of an image to decide which pairs of inputs are similar and which are dissimilar. Ideally, the encoder should focus on all the useful characteristics of the data when making a decision, Jegelka says.
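    The discrimination objective driving the encoder can be sketched with an InfoNCE-style loss, a standard formulation in contrastive learning. This is a generic illustration of the idea, not the paper's exact setup: the loss is small when the anchor embedding is close to its similar ("positive") partner and far from dissimilar ("negative") inputs.

```python
import math

def dot(u, v):
    """Dot-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss:
    -log( exp(sim(a,p)/t) / (exp(sim(a,p)/t) + sum_n exp(sim(a,n)/t)) )."""
    pos = math.exp(dot(anchor, positive) / temperature)
    neg = sum(math.exp(dot(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# A well-aligned positive yields a low loss; a misaligned one does not.
easy = info_nce([1, 0], [1, 0], [[0, 1], [-1, 0]])
hard = info_nce([1, 0], [0, 1], [[0, 1], [-1, 0]])
```

    Lowering the temperature or using harder negatives makes the pairs more difficult to separate, which — in the spirit of the experiments described next — pressures the encoder to rely on more informative features.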

    So, the team made it harder to tell the difference between the similar and dissimilar pairs, and found that this changes which features the encoder will look at to make a decision.

    “If you make the task of discriminating between similar and dissimilar items harder and harder, then your system is forced to learn more meaningful information in the data, because without learning that it cannot solve the task,” she says.

    But increasing this difficulty resulted in a tradeoff — the encoder got better at focusing on some features of the data but became worse at focusing on others. It almost seemed to forget the simpler features, Robinson says.

    To avoid this tradeoff, the researchers asked the encoder to discriminate between the pairs the same way it had originally, using the simpler features, and also after the researchers removed the information it had already learned. Solving the task both ways simultaneously caused the encoder to improve across all features.

    Their method, called implicit feature modification, adaptively modifies samples to remove the simpler features the encoder is using to discriminate between the pairs. The technique does not rely on human input, which is important because real-world data sets can have hundreds of different features that could combine in complex ways, Sra explains.

    From cars to COPD

    The researchers ran one test of this method using images of vehicles. They used implicit feature modification to adjust the color, orientation, and vehicle type to make it harder for the encoder to discriminate between similar and dissimilar pairs of images. The encoder improved its accuracy across all three features — texture, shape, and color — simultaneously.

    To see if the method would stand up to more complex data, the researchers also tested it with samples from a medical image database of chronic obstructive pulmonary disease (COPD). Again, the method led to simultaneous improvements across all features they evaluated.

    While this work takes some important steps forward in understanding the causes of shortcut solutions and working to solve them, the researchers say that continuing to refine these methods and applying them to other types of self-supervised learning will be key to future advancements.

    “This ties into some of the biggest questions about deep learning systems, like ‘Why do they fail?’ and ‘Can we know in advance the situations where your model will fail?’ There is still a lot farther to go if you want to understand shortcut learning in its full generality,” Robinson says.

    This research is supported by the National Science Foundation, National Institutes of Health, and the Pennsylvania Department of Health’s SAP SE Commonwealth Universal Research Enhancement (CURE) program.

  • Differences in T cells’ functional states determine resistance to cancer therapy

    Non-small cell lung cancer (NSCLC) is the most common type of lung cancer in humans. Some patients with NSCLC receive a therapy called immune checkpoint blockade (ICB) that helps kill cancer cells by reinvigorating a subset of immune cells called T cells, which are “exhausted” and have stopped working. However, only about 35 percent of NSCLC patients respond to ICB therapy. Stefani Spranger’s lab at the MIT Department of Biology explores the mechanisms behind this resistance, with the goal of inspiring new therapies to better treat NSCLC patients. In a new study published on Oct. 29 in Science Immunology, a team led by Spranger lab postdoc Brendan Horton revealed what causes T cells to be non-responsive to ICB — and suggests a possible solution.

    Scientists have long thought that the conditions within a tumor were responsible for determining when T cells stop working and become exhausted after being overstimulated or working for too long to fight a tumor. That’s why physicians prescribe ICB to treat cancer — ICB can invigorate the exhausted T cells within a tumor. However, Horton’s new experiments show that some ICB-resistant T cells stop working before they even enter the tumor. These T cells are not actually exhausted, but rather they become dysfunctional due to changes in gene expression that arise early during the activation of a T cell, which occurs in lymph nodes. Once activated, T cells differentiate into certain functional states, which are distinguishable by their unique gene expression patterns.

    The notion that the dysfunctional state that leads to ICB resistance arises before T cells enter the tumor is quite novel, says Spranger, the Howard S. and Linda B. Stern Career Development Professor, a member of the Koch Institute for Integrative Cancer Research, and the study’s senior author.

    “We show that this state is actually a preset condition, and that the T cells are already non-responsive to therapy before they enter the tumor,” she says. As a result, she explains, ICB therapies that work by reinvigorating exhausted T cells within the tumor are less likely to be effective. This suggests that combining ICB with other forms of immunotherapy that target T cells differently might be a more effective approach to help the immune system combat this subset of lung cancer.

    In order to determine why some tumors are resistant to ICB, Horton and the research team studied T cells in murine models of NSCLC. The researchers sequenced messenger RNA from the responsive and non-responsive T cells in order to identify any differences between the T cells. Supported in part by the Koch Institute Frontier Research Program, they used a technique called Seq-Well, developed in the lab of fellow Koch Institute member J. Christopher Love, the Raymond A. (1921) and Helen E. St. Laurent Professor of Chemical Engineering and a co-author of the study. The technique allows for the rapid gene expression profiling of single cells, which permitted Spranger and Horton to get a very granular look at the gene expression patterns of the T cells they were studying.

    Seq-Well revealed distinct patterns of gene expression between the responsive and non-responsive T cells. These differences, which are determined when the T cells assume their specialized functional states, may be the underlying cause of ICB resistance.

    Now that Horton and his colleagues had a possible explanation for why some T cells did not respond to ICB, they decided to see if they could help the ICB-resistant T cells kill the tumor cells. When analyzing the gene expression patterns of the non-responsive T cells, the researchers had noticed that these T cells had a lower expression of receptors for certain cytokines, small proteins that control immune system activity. To counteract this, the researchers treated lung tumors in murine models with extra cytokines. As a result, the previously non-responsive T cells were then able to fight the tumors — meaning that the cytokine therapy prevented, and potentially even reversed, the dysfunctionality.

    Administering cytokine therapy to human patients is not currently safe, because cytokines can cause serious side effects as well as a reaction called a “cytokine storm,” which can produce severe fevers, inflammation, fatigue, and nausea. However, there are ongoing efforts to figure out how to safely administer cytokines to specific tumors. In the future, Spranger and Horton suspect that cytokine therapy could be used in combination with ICB.

    “This is potentially something that could be translated into a therapeutic that could increase the therapy response rate in non-small cell lung cancer,” Horton says.

    Spranger agrees that this work will help researchers develop more innovative cancer therapies, especially because researchers have historically focused on T cell exhaustion rather than the earlier role that T cell functional states might play in cancer.

    “If T cells are rendered dysfunctional early on, ICB is not going to be effective, and we need to think outside the box,” she says. “There’s more evidence, and other labs are now showing this as well, that the functional state of the T cell actually matters quite substantially in cancer therapies.” To Spranger, this means that cytokine therapy “might be a therapeutic avenue” for NSCLC patients beyond ICB.

    Jeffrey Bluestone, the A.W. and Mary Margaret Clausen Distinguished Professor of Metabolism and Endocrinology at the University of California-San Francisco, who was not involved with the paper, agrees. “The study provides a potential opportunity to ‘rescue’ immunity in the NSCLC non-responder patients with appropriate combination therapies,” he says.

    This research was funded by the Pew-Stewart Scholars for Cancer Research, the Ludwig Center for Molecular Oncology, the Koch Institute Frontier Research Program through the Kathy and Curt Mable Cancer Research Fund, and the National Cancer Institute.

  • Taming the data deluge

    An oncoming tsunami of data threatens to overwhelm data-rich research projects in areas ranging from the tiny neutrino to exploding supernovas to the mysteries deep within the brain.

    When LIGO picks up a gravitational-wave signal from a distant collision of black holes and neutron stars, a clock starts ticking for capturing the earliest possible light that may accompany them: time is of the essence in this race. Data collected from electrical sensors monitoring brain activity are outpacing computing capacity. Information from the Large Hadron Collider (LHC)’s smashed particle beams will soon exceed 1 petabit per second. 

    To tackle this approaching data bottleneck in real time, a team of researchers from nine institutions led by the University of Washington, including MIT, has received $15 million in funding to establish the Accelerated AI Algorithms for Data-Driven Discovery (A3D3) Institute. From MIT, the research team includes Philip Harris, assistant professor of physics, who will serve as the deputy director of the A3D3 Institute; Song Han, assistant professor of electrical engineering and computer science, who will serve as the A3D3’s co-PI; and Erik Katsavounidis, senior research scientist with the MIT Kavli Institute for Astrophysics and Space Research.

    Infused with this five-year Harnessing the Data Revolution Big Idea grant, and jointly funded by the Office of Advanced Cyberinfrastructure, A3D3 will focus on three data-rich fields: multi-messenger astrophysics, high-energy particle physics, and brain imaging neuroscience. By enriching AI algorithms with new processors, A3D3 seeks to speed up AI algorithms for solving fundamental problems in collider physics, neutrino physics, astronomy, gravitational-wave physics, computer science, and neuroscience. 

    “I am very excited about the new Institute’s opportunities for research in nuclear and particle physics,” says Laboratory for Nuclear Science Director Boleslaw Wyslouch. “Modern particle detectors produce an enormous amount of data, and we are looking for extraordinarily rare signatures. The application of extremely fast processors to sift through these mountains of data will make a huge difference in what we will measure and discover.”

    The seeds of A3D3 were planted in 2017, when Harris and his colleagues at Fermilab and CERN decided to integrate real-time AI algorithms to process the incredible rates of data at the LHC. Through email correspondence with Han, Harris’ team built a compiler, HLS4ML, that could run an AI algorithm in nanoseconds.

    “Before the development of HLS4ML, the fastest processing that we knew of was roughly a millisecond per AI inference, maybe a little faster,” says Harris. “We realized all the AI algorithms were designed to solve much slower problems, such as image and voice recognition. To get to nanosecond inference timescales, we recognized we could make smaller algorithms and rely on custom implementations with Field Programmable Gate Array (FPGA) processors in an approach that was largely different from what others were doing.”
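    The approach Harris describes, shrinking the network and trading floating-point math for cheap fixed-point arithmetic, can be sketched numerically. The network shape, weights, and bit widths below are hypothetical, and real HLS4ML work emits FPGA firmware rather than Python:

```python
import numpy as np

def quantize(x, bits=16, frac=10):
    """Round to fixed-point with `bits` total bits, `frac` of them fractional."""
    scale = 1 << frac
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return np.clip(np.round(x * scale), lo, hi).astype(np.int64)

def relu(x):
    return np.maximum(x, 0)

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.5, (4, 8))   # stand-in for trained weights
W2 = rng.normal(0.0, 0.5, (8, 2))
x = rng.normal(0.0, 1.0, 4)         # one input event

y_float = relu(x @ W1) @ W2         # floating-point reference

# Fixed-point version: integer multiply-accumulates with one rescale per
# layer, the kind of arithmetic an FPGA pipeline can finish in a few clocks.
S = 1 << 10
h_q = relu(quantize(x) @ quantize(W1)) >> 10   # approx relu(x @ W1) * S
y_fixed = (h_q @ quantize(W2)) / (S * S)       # back to float for comparison
```

    The integer pipeline tracks the float result to within the quantization error, which is the trade-off the team exploited: a smaller, lower-precision network that fits in programmable logic.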

    A few months later, Harris presented their research at a physics faculty meeting, where Katsavounidis became intrigued. Over coffee in Building 7, they discussed combining Harris’ FPGA with Katsavounidis’s use of machine learning for finding gravitational waves. FPGAs and other new processor types, such as graphics processing units (GPUs), accelerate AI algorithms to more quickly analyze huge amounts of data.

    “I had worked with the first FPGAs that were out in the market in the early ’90s and have witnessed first-hand how they revolutionized front-end electronics and data acquisition in big high-energy physics experiments I was working on back then,” recalls Katsavounidis. “The ability to have them crunch gravitational-wave data has been in the back of my mind since joining LIGO over 20 years ago.”

    Two years ago they received their first grant, and the University of Washington’s Shih-Chieh Hsu joined in. The team initiated the Fast Machine Lab, published about 40 papers on the subject, built the group to about 50 researchers, and “launched a whole industry of how to explore a region of AI that has not been explored in the past,” says Harris. “We basically started this without any funding. We’ve been getting small grants for various projects over the years. A3D3 represents our first large grant to support this effort.”  

    “What makes A3D3 so special and suited to MIT is its exploration of a technical frontier, where AI is implemented not in high-level software, but rather in lower-level firmware, reconfiguring individual gates to address the scientific question at hand,” says Rob Simcoe, director of MIT Kavli Institute for Astrophysics and Space Research and the Francis Friedman Professor of Physics. “We are in an era where experiments generate torrents of data. The acceleration gained from tailoring reprogrammable, bespoke computers at the processor level can advance real-time analysis of these data to new levels of speed and sophistication.”

    The Huge Data from the Large Hadron Collider 

    With data rates already exceeding 500 terabits per second, the LHC processes more data than any other scientific instrument on Earth; its future aggregate data rate will soon exceed 1 petabit per second, the biggest in the world.

    “Through the use of AI, A3D3 aims to perform advanced analyses, such as anomaly detection, and particle reconstruction on all collisions happening 40 million times per second,” says Harris.

    The goal is to find within all of this data a way to identify the few collisions out of the 3.2 billion collisions per second that could reveal new forces, explain how dark matter is formed, and complete the picture of how fundamental forces interact with matter. Processing all of this information requires a customized computing system capable of interpreting the collider information within ultra-low latencies.  
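    The filtering problem can be pictured with a toy sketch: score each event against a cheap statistical model of ordinary collisions and keep only the outliers. The feature counts, event counts, and threshold below are invented for illustration and bear no relation to the real trigger system:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical event stream: 10,000 "ordinary" background events plus a
# handful of injected anomalies whose features are shifted.
background = rng.normal(0.0, 1.0, (10_000, 5))
anomalies = rng.normal(4.0, 1.0, (10, 5))
events = np.vstack([background, anomalies])

# Calibrate a per-feature Gaussian model on the background sample.
mu = background.mean(axis=0)
sigma = background.std(axis=0)

def anomaly_score(batch):
    """Sum of squared z-scores: a statistic cheap enough for a fast trigger."""
    z = (batch - mu) / sigma
    return (z ** 2).sum(axis=1)

# Keep only events scoring above the background's 99.99th percentile.
threshold = np.quantile(anomaly_score(background), 0.9999)
kept = np.flatnonzero(anomaly_score(events) > threshold)
```

    In this sketch all ten injected anomalies land above the threshold while almost all background is discarded; the hard part, which A3D3 targets, is performing an analogous selection with far richer models at terabit-per-second rates and within ultra-low latency budgets.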

    “The challenge of running this on hundreds of terabits per second in real time is daunting and requires a complete overhaul of how we design and implement AI algorithms,” says Harris. “With large increases in detector resolution leading to data rates that are even larger, the challenge of finding the one collision, among many, will become even more daunting.”

    The Brain and the Universe

    Thanks to advances in techniques such as medical imaging and electrical recordings from implanted electrodes, neuroscience is also gathering ever-larger amounts of data on how the brain’s neural networks process stimuli and coordinate motor function. A3D3 plans to develop and implement high-throughput, low-latency AI algorithms to process, organize, and analyze massive neural datasets in real time, probing brain function in order to enable new experiments and therapies.

    With Multi-Messenger Astrophysics (MMA), A3D3 aims to quickly identify astronomical events by efficiently processing data from gravitational waves, gamma-ray bursts, and neutrinos picked up by telescopes and detectors. 

    The A3D3 team also includes a multidisciplinary group of 15 other researchers from the lead institution, the University of Washington, along with Caltech, Duke University, Purdue University, UC San Diego, the University of Illinois Urbana-Champaign, the University of Minnesota, and the University of Wisconsin-Madison. The institute’s work will include neutrino research at IceCube and DUNE and visible astronomy at the Zwicky Transient Facility, and it will organize deep-learning workshops and boot camps to train students and researchers to contribute to the framework and widen the use of fast AI strategies.

    “We have reached a point where detector network growth will be transformative, both in terms of event rates and in terms of astrophysical reach and ultimately, discoveries,” says Katsavounidis. “‘Fast’ and ‘efficient’ is the only way to fight the ‘faint’ and ‘fuzzy’ that is out there in the universe, and the path for getting the most out of our detectors. A3D3 on one hand is going to bring production-scale AI to gravitational-wave physics and multi-messenger astronomy; but on the other hand, we aspire to go beyond our immediate domains and become the go-to place across the country for applications of accelerated AI to data-driven disciplines.”

  • in

    Exploring the human stories behind the data

    Shaking in the back of a police cruiser, handcuffs digging into his wrists, Brian Williams was overwhelmed with fear. He had been pulled over, but before he was asked for his name, license, or registration, a police officer ordered him out of his car and into the back of the police cruiser, saying into his radio, “Black male detained.” The officer’s explanation for these actions was: “for your safety and mine.”

    Williams walked away from the experience with two tickets, a pair of bruised wrists, and a desire to do everything in his power to prevent others from experiencing the utter powerlessness he had felt.

    Now an MIT senior majoring in biological engineering and minoring in Black studies, Williams has continued working to empower his community. Through experiences in and out of the classroom, he has leveraged his background in bioengineering to explore interests in public health and social justice, specifically looking at how the medical sector can uplift and support communities of color.

    Williams grew up in a close-knit family and community in Broward County, Florida, where he found comfort in the routine of Sunday church services, playing outside with friends, and cookouts on the weekends. Broward County was home to him — a home he felt deeply invested in and indebted to.

    “It takes a village. The Black community has invested a lot in me, and I have a lot to invest back in it,” he says.

    Williams initially focused on STEM subjects at MIT, but in his sophomore year, his interests in exploring data science and humanities research led him to an Undergraduate Research Opportunities Program (UROP) project in the Department of Political Science. Working with Professor Ariel White, he analyzed information on incarceration and voting rights, studied the behavior patterns of police officers, and screened 911 calls to identify correlations between how people described events and how the police responded to them.

    In the summer before his junior year, Williams also joined MIT’s Civic Data Design Lab, where he worked as a researcher for the Missing Data Project, which uses both journalism and data science to visualize statistics and humanize the people behind the numbers. As the project’s name suggests, there is often much to be learned from seeking out data that aren’t easily available. Using datasets and interviews describing how the pandemic affected Black communities, Williams and a team of researchers created a series called the Color of Covid, which told the stories behind the grim statistics on race and the pandemic.

    The following year, Williams undertook a research-and-development internship with the biopharmaceutical company Amgen in San Francisco, working on protein engineering to combat autoimmune diseases. Because this work was primarily in the lab, focusing on science-based applications, he saw it as an opportunity to ask himself: “Do I want to dedicate my life to this area of bioengineering?” He found the issue of social justice to be more compelling.

    At the same time, Williams was drawn toward tackling problems the local Black community was experiencing related to the pandemic. He found himself thinking deeply about how to educate the public, address disparities in case rates, and, above all, help people.

    Working through Amgen’s Black Employee Resource Group and its Diversity, Inclusion, and Belonging Team, Williams crafted a proposal, which the company adopted, for addressing Covid-19 vaccination misinformation in Black and Brown communities in San Mateo and San Francisco County. He paid special attention to how to frame vaccine hesitancy among members of these communities, understanding that a longstanding history of racism in scientific discovery and medicine led many Black and Brown people to distrust the entire medical industry.

    “Trying to meet people where they are is important,” Williams says.

    This experience reinforced the idea for Williams that he wanted to do everything in his power to uplift the Black community.

    “I think it’s only right that I go out and I shine bright because it’s not easy being Black. You know, you have to work twice as hard to get half as much,” he says.

    As the current political action co-chair of the MIT Black Students’ Union (BSU), Williams also works to inspire change on campus, promoting and participating in events that uplift the BSU. During his Amgen internship, he also organized the MIT Black History Month Takeover Series, which involved organizing eight events from February through the beginning of spring semester. These included promotions through social media and virtual meetings for students and faculty. For his leadership, he received the “We Are Family” award from the BSU executive board.

    Williams prioritizes community in everything he does, whether in the classroom, at a campus event, or spending time outside in local communities of color around Boston.

    “The things that really keep me going are the stories of other people,” says Williams, who is currently applying to a variety of postgraduate programs. After receiving MIT endorsement, he applied to the Rhodes and Marshall Fellowships; he also plans to apply to law school with a joint master’s degree in public health and policy.

    Ultimately, Williams hopes to bring his fight for racial justice to the policy level, looking at how a long, ongoing history of medical racism has led marginalized communities to mistrust current scientific endeavors. He wants to help bring about new legislation to fix old systems which disproportionately harm communities of color. He says he aims to be “an engineer of social solutions, one who reaches deep into their toolbox of social justice, pulling the levers of activism, advocacy, democracy, and legislation to radically change our world — to improve our social institutions at the root and liberate our communities.” While he understands this is a big feat, he sees his ambition as an asset.

    “I’m just another person with huge aspirations, and an understanding that you have to go get it if you want it,” he says. “You feel me? At the end of the day, this is just the beginning of my story. And I’m grateful to everyone in my life that’s helping me write it. Tap in.”

  • in

    Making machine learning more useful to high-stakes decision makers

    The U.S. Centers for Disease Control and Prevention estimates that one in seven children in the United States experienced abuse or neglect in the past year. Child protective services agencies around the nation receive a high number of reports each year (about 4.4 million in 2019) of alleged neglect or abuse. With so many cases, some agencies are implementing machine learning models to help child welfare specialists screen cases and determine which to recommend for further investigation.

    But these models don’t do any good if the humans they are intended to help don’t understand or trust their outputs.

    Researchers at MIT and elsewhere launched a research project to identify and tackle machine learning usability challenges in child welfare screening. In collaboration with a child welfare department in Colorado, the researchers studied how call screeners assess cases, with and without the help of machine learning predictions. Based on feedback from the call screeners, they designed a visual analytics tool that uses bar graphs to show how specific factors of a case contribute to the predicted risk that a child will be removed from their home within two years.

    The researchers found that screeners are more interested in seeing how each factor, like the child’s age, influences a prediction, rather than understanding the computational basis of how the model works. Their results also show that even a simple model can cause confusion if its features are not described with straightforward language.

    These findings could be applied to other high-risk fields where humans use machine learning models to help them make decisions but lack data science experience, says Kalyan Veeramachaneni, principal research scientist in the Laboratory for Information and Decision Systems (LIDS) and senior author of the paper.

    “Researchers who study explainable AI, they often try to dig deeper into the model itself to explain what the model did. But a big takeaway from this project is that these domain experts don’t necessarily want to learn what machine learning actually does. They are more interested in understanding why the model is making a different prediction than what their intuition is saying, or what factors it is using to make this prediction. They want information that helps them reconcile their agreements or disagreements with the model, or confirms their intuition,” he says.

    Co-authors include electrical engineering and computer science PhD student Alexandra Zytek, who is the lead author; postdoc Dongyu Liu; and Rhema Vaithianathan, professor of economics and director of the Center for Social Data Analytics at the Auckland University of Technology and professor of social data analytics at the University of Queensland. The research will be presented later this month at the IEEE Visualization Conference.

    Real-world research

    The researchers began the study more than two years ago by identifying seven factors that make a machine learning model less usable, including lack of trust in where predictions come from and disagreements between user opinions and the model’s output.

    With these factors in mind, Zytek and Liu flew to Colorado in the winter of 2019 to learn firsthand from call screeners in a child welfare department. This department is implementing a machine learning system developed by Vaithianathan that generates a risk score for each report, predicting the likelihood the child will be removed from their home. That risk score is based on more than 100 demographic and historic factors, such as the parents’ ages and past court involvements.

    “As you can imagine, just getting a number between one and 20 and being told to integrate this into your workflow can be a bit challenging,” Zytek says.

    They observed how teams of screeners process cases in about 10 minutes and spend most of that time discussing the risk factors associated with the case. That inspired the researchers to develop a case-specific details interface, which shows how each factor influenced the overall risk score using color-coded, horizontal bar graphs that indicate the magnitude of the contribution in a positive or negative direction.
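    The shape of that interface, signed per-factor contributions that sum with a baseline to the overall score, can be sketched with a toy additive model. The factor names, weights, and case values below are invented for illustration:

```python
# Hypothetical additive risk model: each factor contributes
# weight * (case value - population mean), and the contributions plus a
# baseline give the overall score.
factors = {
    # name: (weight, population mean, this case's value)
    "child_age": (-0.8, 7.0, 3.0),
    "prior_referrals": (0.5, 2.0, 4.0),
    "household_size": (0.1, 4.0, 4.0),
}
baseline = 10.0

contributions = {
    name: w * (value - mean) for name, (w, mean, value) in factors.items()
}
score = baseline + sum(contributions.values())

# Render signed horizontal bars, largest magnitude first, echoing the
# case-specific details view the screeners found useful.
for name, c in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    sign = "+" if c >= 0 else "-"
    print(f"{name:16s} {sign}{abs(c):4.1f} {'#' * int(round(abs(c) * 5))}")
print(f"{'risk score':16s} {score:5.1f}")
```

    In this invented case, the child's young age pushes the score up the most, which is the kind of factor-by-factor reading the screeners asked for, rather than an account of how the model computes.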

    Based on observations and detailed interviews, the researchers built four additional interfaces that provide explanations of the model, including one that compares a current case to past cases with similar risk scores. Then they ran a series of user studies.

    The studies revealed that more than 90 percent of the screeners found the case-specific details interface to be useful, and it generally increased their trust in the model’s predictions. On the other hand, the screeners did not like the case comparison interface. While the researchers thought this interface would increase trust in the model, screeners were concerned it could lead to decisions based on past cases rather than the current report.   

    “The most interesting result to me was that the features we showed them — the information that the model uses — had to be really interpretable to start. The model uses more than 100 different features in order to make its prediction, and a lot of those were a bit confusing,” Zytek says.

    Keeping the screeners in the loop throughout the iterative process helped the researchers make decisions about what elements to include in the machine learning explanation tool, called Sibyl.

    As they refined the Sibyl interfaces, the researchers were careful to consider how providing explanations could contribute to some cognitive biases, and even undermine screeners’ trust in the model.

    For instance, since explanations are based on averages in a database of child abuse and neglect cases, having three past abuse referrals may actually decrease the risk score of a child, since averages in this database may be far higher. A screener may see that explanation and decide not to trust the model, even though it is working correctly, Zytek explains. And because humans tend to put more emphasis on recent information, the order in which the factors are listed could also influence decisions.
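    That counterintuition is easy to reproduce with invented numbers: when a contribution is measured relative to the dataset average, a value that sounds high in isolation can still push the score down.

```python
# Hypothetical mean-centered contribution for the past-referrals factor;
# all numbers are invented for illustration.
weight = 0.4          # positive weight: more referrals means more predicted risk
dataset_mean = 6.0    # average number of past referrals across the database
case_value = 3.0      # this case has three past referrals

# Three referrals sit below the database average, so the factor's
# contribution comes out negative even though the raw count sounds high.
contribution = weight * (case_value - dataset_mean)
print(contribution)
```

    Seeing "three past referrals" rendered as a risk-lowering factor is exactly the kind of output that can erode a screener's trust even when the model is behaving correctly.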

    Improving interpretability

    Based on feedback from call screeners, the researchers are working to tweak the explanation model so the features that it uses are easier to explain.

    Moving forward, they plan to enhance the interfaces they’ve created based on additional feedback and then run a quantitative user study to track the effects on decision making with real cases. Once those evaluations are complete, they can prepare to deploy Sibyl, Zytek says.

    “It was especially valuable to be able to work so actively with these screeners. We got to really understand the problems they faced. While we saw some reservations on their part, what we saw more of was excitement about how useful these explanations were in certain cases. That was really rewarding,” she says.

    This work is supported, in part, by the National Science Foundation.