More stories

  • in

    New techniques efficiently accelerate sparse tensors for massive AI models

    Researchers from MIT and NVIDIA have developed two techniques that accelerate the processing of sparse tensors, a type of data structure that’s used for high-performance computing tasks. The complementary techniques could result in significant improvements to the performance and energy-efficiency of systems like the massive machine-learning models that drive generative artificial intelligence.

    Tensors are data structures used by machine-learning models. Both of the new methods seek to efficiently exploit what’s known as sparsity (zero values) in the tensors. When processing these tensors, hardware can skip over the zeros and save on both computation and memory. For instance, anything multiplied by zero is zero, so that multiplication can be skipped. And the tensor can be compressed (zeros don’t need to be stored) so that a larger portion fits in on-chip memory.
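
    The two savings described here, skipped multiplications and compressed storage, can be seen in a few lines of Python. This is only a minimal software sketch: the function names and the (index, value) storage layout are assumptions made for illustration, not the format used by the accelerators in the papers.

```python
def compress(dense):
    """Keep only the nonzero entries as (index, value) pairs."""
    return [(i, v) for i, v in enumerate(dense) if v != 0]


def sparse_dot(compressed, other):
    """Dot product that only touches positions holding a nonzero value."""
    return sum(v * other[i] for i, v in compressed)


weights = [0, 3, 0, 0, 2, 0, 0, 0]   # made-up sparse vector
inputs = [5, 1, 7, 2, 4, 9, 6, 8]    # made-up dense vector

packed = compress(weights)           # two stored pairs instead of eight values
result = sparse_dot(packed, inputs)  # two multiplications instead of eight
```

    The compressed form gives the same answer as the full dot product while storing and multiplying only the two nonzero weights.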

    However, there are several challenges to exploiting sparsity. Finding the nonzero values in a large tensor is no easy task. Existing approaches often limit the locations of nonzero values by enforcing a sparsity pattern to simplify the search, but this limits the variety of sparse tensors that can be processed efficiently.

    Another challenge is that the number of nonzero values can vary in different regions of the tensor. This makes it difficult to determine how much space is required to store different regions in memory. To make sure the region fits, more space is often allocated than is needed, causing the storage buffer to be underutilized. This increases off-chip memory traffic, which increases energy consumption.

    The MIT and NVIDIA researchers crafted two solutions to address these problems. For one, they developed a technique that allows the hardware to efficiently find the nonzero values for a wider variety of sparsity patterns.

    For the other solution, they created a method that can handle the case where the data do not fit in memory, which increases the utilization of the storage buffer and reduces off-chip memory traffic.

    Both methods boost the performance and reduce the energy demands of hardware accelerators specifically designed to speed up the processing of sparse tensors.

    “Typically, when you use more specialized or domain-specific hardware accelerators, you lose the flexibility that you would get from a more general-purpose processor, like a CPU. What stands out with these two works is that we show that you can still maintain flexibility and adaptability while being specialized and efficient,” says Vivienne Sze, associate professor in the MIT Department of Electrical Engineering and Computer Science (EECS), a member of the Research Laboratory of Electronics (RLE), and co-senior author of papers on both advances.

    Her co-authors include lead authors Yannan Nellie Wu PhD ’23 and Zi Yu Xue, an electrical engineering and computer science graduate student; and co-senior author Joel Emer, an MIT professor of the practice in computer science and electrical engineering and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL), as well as others at NVIDIA. Both papers will be presented at the IEEE/ACM International Symposium on Microarchitecture.

    HighLight: Efficiently finding zero values

    Sparsity can arise in the tensor for a variety of reasons. For example, researchers sometimes “prune” unnecessary pieces of the machine-learning models by replacing some values in the tensor with zeros, creating sparsity. The degree of sparsity (percentage of zeros) and the locations of the zeros can vary for different models.

    To make it easier to find the remaining nonzero values in a model with billions of individual values, researchers often restrict the location of the nonzero values so they fall into a certain pattern. However, each hardware accelerator is typically designed to support one specific sparsity pattern, limiting its flexibility.  

    By contrast, the hardware accelerator the MIT researchers designed, called HighLight, can handle a wide variety of sparsity patterns and still perform well when running models that don’t have any zero values.

    They use a technique they call “hierarchical structured sparsity” to efficiently represent a wide variety of sparsity patterns that are composed of several simple sparsity patterns. This approach divides the values in a tensor into smaller blocks, where each block has its own simple sparsity pattern (perhaps two zeros and two nonzeros in a block with four values).

    Then, they combine the blocks into a hierarchy, where each collection of blocks also has its own simple sparsity pattern (perhaps one zero block and three nonzero blocks in a level with four blocks). They continue combining blocks into larger levels, but the patterns remain simple at each step.
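
    A rough way to picture the two-level example above is a checker that tests each level separately. The sketch below is an assumption-laden illustration, not HighLight's hardware logic: the function name is hypothetical, and it treats each pattern as an upper bound ("at most two nonzeros per block," "at most three nonzero blocks per group").

```python
from fractions import Fraction


def follows_pattern(values, block=4, max_nonzero=2, group=4, max_blocks=3):
    """Check a two-level pattern on a flat list of values.

    Level 0: every block of `block` values holds at most `max_nonzero` nonzeros.
    Level 1: every group of `group` blocks holds at most `max_blocks` nonzero blocks.
    """
    blocks = [values[i:i + block] for i in range(0, len(values), block)]
    counts = [sum(1 for v in b if v != 0) for b in blocks]
    if any(c > max_nonzero for c in counts):
        return False
    for g in range(0, len(counts), group):
        nonzero_blocks = sum(1 for c in counts[g:g + group] if c > 0)
        if nonzero_blocks > max_blocks:
            return False
    return True


# Multiplying the per-level fractions gives the overall density ceiling.
max_density = Fraction(2, 4) * Fraction(3, 4)

tensor = [1, 0, 2, 0,   0, 0, 0, 0,   0, 3, 0, 4,   5, 6, 0, 0]
dense = [1] * 16
```

    Here `tensor` satisfies both levels and `dense` fails the first one. Each level stays simple to check, yet the combined pattern allows at most 3/8 of the values to be nonzero, which hints at how stacking simple patterns can express many overall degrees of sparsity.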

    This simplicity enables HighLight to more efficiently find and skip zeros, so it can take full advantage of the opportunity to cut excess computation. On average, their accelerator design had about six times better energy-delay product (a metric related to energy efficiency) than other approaches.

    “In the end, the HighLight accelerator is able to efficiently accelerate dense models because it does not introduce a lot of overhead, and at the same time it is able to exploit workloads with different amounts of zero values based on hierarchical structured sparsity,” Wu explains.

    In the future, she and her collaborators want to apply hierarchical structured sparsity to more types of machine-learning models and different types of tensors in the models.

    Tailors and Swiftiles: Effectively “overbooking” to accelerate workloads

    Researchers can also leverage sparsity to more efficiently move and process data on a computer chip.

    Since the tensors are often larger than what can be stored in the memory buffer on chip, the chip only grabs and processes a chunk of the tensor at a time. The chunks are called tiles.

    To maximize the utilization of that buffer and limit the number of times the chip must access off-chip memory, which often dominates energy consumption and limits processing speed, researchers seek to use the largest tile that will fit into the buffer.

    But in a sparse tensor, many of the data values are zero, so an even larger tile can fit into the buffer than one might expect based on its capacity. Zero values don’t need to be stored.

    But the number of zero values can vary across different regions of the tensor, so they can also vary for each tile. This makes it difficult to determine a tile size that will fit in the buffer. As a result, existing approaches often conservatively assume there are no zeros and end up selecting a smaller tile, which results in wasted blank spaces in the buffer.

    To address this uncertainty, the researchers propose the use of “overbooking” to allow them to increase the tile size, as well as a way to tolerate it if the tile doesn’t fit the buffer.

    Airlines take the same approach when they overbook tickets for a flight. If all the passengers show up, the airline must compensate the ones who are bumped from the plane. But usually, not all the passengers show up.

    In a sparse tensor, a tile size can be chosen such that usually the tiles will have enough zeros that most still fit into the buffer. But occasionally, a tile will have more nonzero values than will fit. In this case, those data are bumped out of the buffer.

    The researchers enable the hardware to only re-fetch the bumped data without grabbing and processing the entire tile again. They modify the “tail end” of the buffer to handle this, hence the name of this technique, Tailors.

    They also created an approach for finding a tile size that takes advantage of overbooking. This method, called Swiftiles, swiftly estimates the ideal tile size so that a specific percentage of tiles, set by the user, are overbooked. (The names “Tailors” and “Swiftiles” pay homage to Taylor Swift, whose recent Eras tour was fraught with overbooked presale codes for tickets.)
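
    The goal of picking a tile size for a target overbooking rate can be illustrated with a brute-force search. This is not the Swiftiles estimator, which the researchers say avoids repeated passes over the tensor; the sketch below simply tries every size, and its function names, row counts, and buffer capacity are all made up for illustration.

```python
def overbooked_fraction(row_nnz, rows_per_tile, capacity):
    """Fraction of tiles whose nonzero count exceeds the buffer capacity."""
    tiles = [sum(row_nnz[i:i + rows_per_tile])
             for i in range(0, len(row_nnz), rows_per_tile)]
    return sum(1 for t in tiles if t > capacity) / len(tiles)


def pick_tile_size(row_nnz, capacity, target):
    """Largest rows-per-tile whose overbooked fraction stays within `target`."""
    best = 1
    for rows in range(1, len(row_nnz) + 1):
        if overbooked_fraction(row_nnz, rows, capacity) <= target:
            best = rows
    return best


row_nnz = [1, 2, 1, 1, 6, 1, 2, 1, 1, 1, 2, 1]   # nonzeros per row (made up)
capacity = 8                                      # buffer slots (made up)

no_overflow = pick_tile_size(row_nnz, capacity, target=0.0)   # 3 rows per tile
overbooked = pick_tile_size(row_nnz, capacity, target=0.4)    # 5 rows per tile
```

    Tolerating an overflow in some tiles lets the tile grow from three rows to five in this toy example, at the cost of re-fetching the bumped data for the one tile that does not fit.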

    Swiftiles reduces the number of times the hardware needs to check the tensor to identify an ideal tile size, saving on computation. The combination of Tailors and Swiftiles more than doubles the speed while requiring only half the energy of existing hardware accelerators, which cannot handle overbooking.

    “Swiftiles allows us to estimate how large these tiles need to be without requiring multiple iterations to refine the estimate. This only works because overbooking is supported. Even if you are off by a decent amount, you can still extract a fair bit of speedup because of the way the non-zeros are distributed,” Xue says.

    In the future, the researchers want to apply the idea of overbooking to other aspects in computer architecture and also work to improve the process for estimating the optimal level of overbooking.

    This research is funded, in part, by the MIT AI Hardware Program.

  • in

    Making genetic prediction models more inclusive

    While any two human genomes are about 99.9 percent identical, genetic variation in the remaining 0.1 percent plays an important role in shaping human diversity, including a person’s risk for developing certain diseases.

    Measuring the cumulative effect of these small genetic differences can provide an estimate of an individual’s genetic risk for a particular disease or their likelihood of having a particular trait. However, the majority of models used to generate these “polygenic scores” are based on studies done in people of European descent, and do not accurately gauge the risk for people of non-European ancestry or people whose genomes contain a mixture of chromosome regions inherited from previously isolated populations, also known as admixed ancestry.
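
    In its generic additive form, a polygenic score is a weighted sum: each variant contributes the number of risk alleles a person carries (0, 1, or 2) multiplied by that variant's estimated effect size. The sketch below shows only this generic form, not the MIT team's model, and the effect sizes and genotypes are invented for illustration.

```python
def polygenic_score(allele_counts, effect_sizes):
    """Weighted sum of risk-allele counts (0, 1, or 2 per variant)."""
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("one effect size is needed per variant")
    return sum(c * w for c, w in zip(allele_counts, effect_sizes))


effect_sizes = [0.5, -0.25, 0.125, 0.25]   # hypothetical per-variant weights
person_a = [2, 0, 1, 1]                    # hypothetical genotypes
person_b = [0, 2, 0, 1]

score_a = polygenic_score(person_a, effect_sizes)   # 1.375
score_b = polygenic_score(person_b, effect_sizes)   # -0.25
```

    Because the weights are estimated from study participants, a score built from one ancestry group can misjudge people whose patterns of variation differ, which is the gap the article describes.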

    In an effort to make these genetic scores more inclusive, MIT researchers have created a new model that takes into account genetic information from people from a wider diversity of genetic ancestries across the world. Using this model, they showed that they could increase the accuracy of genetics-based predictions for a variety of traits, especially for people from populations that have been traditionally underrepresented in genetic studies.

    “For people of African ancestry, our model proved to be about 60 percent more accurate on average,” says Manolis Kellis, a professor of computer science in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and a member of the Broad Institute of MIT and Harvard. “For people of admixed genetic backgrounds more broadly, who have been excluded from most previous models, the accuracy of our model increased by an average of about 18 percent.”

    The researchers hope their more inclusive modeling approach could help improve health outcomes for a wider range of people and promote health equity by spreading the benefits of genomic sequencing more widely across the globe.

    “What we have done is created a method that allows you to be much more accurate for admixed and ancestry-diverse individuals, and ensure the results and the benefits of human genetics research are equally shared by everyone,” says MIT postdoc Yosuke Tanigawa, the lead and co-corresponding author of the paper, which appears today in open-access form in the American Journal of Human Genetics. The researchers have made all of their data publicly available for the broader scientific community to use.

    More inclusive models

    The work builds on the Human Genome Project, which mapped all of the genes found in the human genome, and on subsequent large-scale, cohort-based studies of how genetic variants in the human genome are linked to disease risk and other differences between individuals.

    These studies showed that the effect of any individual genetic variant on its own is typically very small. Together, these small effects add up and influence the risk of developing heart disease or diabetes, having a stroke, or being diagnosed with psychiatric disorders such as schizophrenia.

    “We have hundreds of thousands of genetic variants that are associated with complex traits, each of which is individually playing a weak effect, but together they are beginning to be predictive for disease predispositions,” Kellis says.

    However, most of these genome-wide association studies included few people of non-European descent, so polygenic risk models based on them translate poorly to non-European populations. People from different geographic areas can have different patterns of genetic variation, shaped by stochastic drift, population history, and environmental factors — for example, in people of African descent, genetic variants that protect against malaria are more common than in other populations. Those variants also affect other traits involving the immune system, such as counts of neutrophils, a type of immune cell. That variation would not be well-captured in a model based on genetic analysis of people of European ancestry alone.

    “If you are an individual of African descent, of Latin American descent, of Asian descent, then you are currently being left out by the system,” Kellis says. “This inequity in the utilization of genetic information for predicting risk of patients can cause unnecessary burden, unnecessary deaths, and unnecessary lack of prevention, and that’s where our work comes in.”

    Some researchers have begun trying to address these disparities by creating distinct models for people of European descent, of African descent, or of Asian descent. These emerging approaches assign individuals to distinct genetic ancestry groups, aggregate the data to create an association summary, and make genetic prediction models. However, these approaches still don’t represent people of admixed genetic backgrounds well.

    “Our approach builds on the previous work without requiring researchers to assign individuals or local genomic segments of individuals to predefined distinct genetic ancestry groups,” Tanigawa says. “Instead, we develop a single model for everybody by directly working on individuals across the continuum of their genetic ancestries.”

    In creating their new model, the MIT team used computational and statistical techniques that enabled them to study each individual’s unique genetic profile instead of grouping individuals by population. This methodological advancement allowed the researchers to include people of admixed ancestry, who made up nearly 10 percent of the UK Biobank dataset used for this study and currently account for about one in seven newborns in the United States.

    “Because we work at the individual level, there is no need for computing summary-level data for different populations,” Kellis says. “Thus, we did not need to exclude individuals of admixed ancestry, increasing our power by including more individuals and representing contributions from all populations in our combined model.”

    Better predictions

    To create their new model, the researchers used genetic data from more than 280,000 people, which was collected by UK Biobank, a large-scale biomedical database and research resource containing de-identified genetic, lifestyle, and health information from half a million U.K. participants. Using another set of about 81,000 held-out individuals from the UK Biobank, the researchers evaluated their model across 60 traits, which included traits related to body size and shape, such as height and body mass index, as well as blood traits such as white blood cell count and red blood cell count, which also have a genetic basis.

    The researchers found that, compared to models trained only on European-ancestry individuals, their model’s predictions are more accurate for all genetic ancestry groups. The most notable gain was for people of African ancestry, who showed 61 percent average improvements, even though they only made up about 1.5 percent of samples in UK Biobank. The researchers also saw improvements of 11 percent for people of South Asian descent and 5 percent for white British people. Predictions for people of admixed ancestry improved by about 18 percent.

    “When you bring all the individuals together in the training set, everybody contributes to the training of the polygenic score modeling on equal footing,” Tanigawa says. “Combined with increasingly more inclusive data collection efforts, our method can help leverage these efforts to improve predictive accuracy for all.”

    The MIT team hopes its approach can eventually be incorporated into tests of an individual’s risk of a variety of diseases. Such tests could be combined with conventional risk factors and used to help doctors diagnose disease or to help people manage their risk for certain diseases before they develop.

    “Our work highlights the power of diversity, equity, and inclusion efforts in the context of genomics research,” Tanigawa says.

    The researchers now hope to add even more data to their model, including data from the United States, and to apply it to additional traits that they didn’t analyze in this study.

    “This is just the start,” Kellis says. “We can’t wait to see more people join our effort to propel inclusive human genetics research.”

    The research was funded by the National Institutes of Health.

  • in

    Learning how to learn

    Suppose you need to be on today’s only ferry to Martha’s Vineyard, which leaves at 2 p.m. It takes about 30 minutes (on average) to drive from where you are to the terminal. What time should you leave?

    This is one of many common real-life examples used by Richard “Dick” Larson, a post-tenure professor in the MIT Institute for Data, Systems, and Society (IDSS), to explore exemplary problem-solving in his new book “Model Thinking for Everyday Life: How to Make Smarter Decisions.”

    Larson’s book synthesizes a lifelong career as an MIT professor and researcher, highlighting crucial skills underpinning all empirical, rational, and critical thinking. “Critical thinkers are energetic detectives … always seeking the facts,” he says. “Additional facts may surface that can result in modified conclusions … A critical thinker is aware of the pitfalls of human intuition.”

    For Larson, “model” thinking means not only thinking aided by conceptual and/or mathematical models, but a broader mode of critical thought that is informed by STEM concepts and worthy of emulation.

    In the ferry example, a key concept at play is uncertainty. Accounting for uncertainty is a core challenge faced by systems engineers, operations researchers, and modelers of complex networks — all hats Larson has worn in over half a century at MIT. 

    Uncertainty complicates all prediction and decision-making, and while statistics offers tactics for managing uncertainty, “Model Thinking” is not a math textbook. There are equations for the math-curious, but it doesn’t take a degree from MIT to understand that

    an average of 30 minutes would cover a range of times, some shorter, some longer;
    outliers can exist in the data, like the time construction traffic added an additional 30 minutes;
    “about 30 minutes” is a prediction based on past experience, not current information (road closures, accidents, etc.); and
    the consequence for missing the ferry is not a delay of hours, but a full day, which might completely disrupt the trip or its purpose.

    And so, without doing much explicit math, you calculate variables, weigh the likelihood of different outcomes against the consequences of failure, and choose a departure time. Larson’s conclusion is one championed by dads everywhere: Leave on the earlier side, just in case. 
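
    For the math-curious, the ferry reasoning can be turned into a small calculation. The sketch below is not taken from the book: it assumes the drive time is roughly normally distributed, and the 10-minute standard deviation and the function name are invented for illustration.

```python
import math
from datetime import datetime, timedelta
from statistics import NormalDist


def latest_departure(deadline, mean_min, sd_min, on_time_prob):
    """Latest departure that still arrives by `deadline` with the given probability."""
    travel = NormalDist(mu=mean_min, sigma=sd_min)
    budget = math.ceil(travel.inv_cdf(on_time_prob))   # minutes to allow
    return deadline - timedelta(minutes=budget)


ferry = datetime(2023, 11, 1, 14, 0)   # 2 p.m.; the date is arbitrary

coin_flip = latest_departure(ferry, 30, 10, 0.50)   # 1:30 p.m.
cautious = latest_departure(ferry, 30, 10, 0.99)    # 1:06 p.m.
```

    Leaving exactly 30 minutes ahead makes the ferry only about half the time under these assumptions. Asking for a 99 percent chance moves the departure to 1:06 p.m., which is the "leave on the earlier side" conclusion in numbers.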

    “The world’s most important, invisible profession”

    Throughout Larson’s career at MIT, he has focused on the science of solving problems and making better decisions. “Faced with a new problem, people often lack the ability to frame and formulate it using basic principles,” argues Larson. “Our emphasis is on problem framing and formulation, with mathematics and physics playing supporting roles.”

    This is operations research, which Larson calls “the world’s most important invisible profession.” Formalized as a field during World War II, operations research uses data and models to try to derive the “physics” of complex systems. The goal is typically optimizing things like scheduling, routing, simulation, prediction, planning, logistics, and queueing, the last of which Larson is especially well known for. A frequent media expert on the subject, he earned the moniker “Dr. Q,” and his research has led to new approaches for easing congestion in urban traffic, fast-food lines, and banks.

    Larson’s experience with complex systems provides a wealth of examples to draw on, but he is keen to demonstrate that his purview includes everyday decisions, and that “Model Thinking” is a book for everyone. 

    “Everybody uses models, whether they realize it or not,” he says. “If you have a bunch of errands to do, and you try to plan out the order to do them so you don’t have to drive as much, that’s more or less the ‘traveling salesman’ problem, a classic from operations research. Or when someone is shopping for groceries and thinking about how much of each product they need — they’re basically using an inventory management model of their pantry.”
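
    The errand-planning version of the “traveling salesman” problem is small enough to solve by trying every order. The sketch below does exactly that; the places and mileages are invented, and brute force is used only because three stops give just six orders to compare.

```python
from itertools import permutations


def shortest_errand_loop(home, stops, distance):
    """Try every visiting order and return the shortest round trip from home."""
    best_order, best_length = None, float("inf")
    for order in permutations(stops):
        route = (home, *order, home)
        length = sum(distance[frozenset(leg)] for leg in zip(route, route[1:]))
        if length < best_length:
            best_order, best_length = order, length
    return best_order, best_length


# Hypothetical driving distances in miles between each pair of places.
distance = {
    frozenset({"home", "bank"}): 2,
    frozenset({"home", "grocery"}): 5,
    frozenset({"home", "pharmacy"}): 4,
    frozenset({"bank", "grocery"}): 6,
    frozenset({"bank", "pharmacy"}): 3,
    frozenset({"grocery", "pharmacy"}): 2,
}

order, miles = shortest_errand_loop("home", ["bank", "grocery", "pharmacy"], distance)
```

    With these numbers the best loop is 12 miles, against 18 for the worst order. The number of orders grows factorially with the number of stops, which is why the general problem is a classic hard one.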

    Larson’s takeaway is that because we all use conceptual models for thinking, planning, and decision-making, understanding how our minds use models, and learning to use them more intentionally, can lead to clearer thinking, better planning, and smarter decision-making, especially when those models are grounded in principles drawn from math and physics.

    Passion for the process

    Teaching STEM principles has long been a mission of Larson’s, who co-founded MIT BLOSSOMS (Blended Learning Open Source Science or Math Studies) with his late wife, Mary Elizabeth Murray. BLOSSOMS provides free, interactive STEM lessons and videos for primary school students around the world. Some of the exercises in “Model Thinking” refer to these videos as well.

    “A child’s educational opportunities shouldn’t be limited by where they were born or the wealth of their parents,” says Larson of the enterprise. 

    It was also Murray who encouraged Larson to write “Model Thinking.” “She saw how excited I was about it,” he says. “I had the choice of writing a textbook on queuing, say, or something else. It didn’t excite me at all.”

    Larson’s passion is for the process, not the answer. Throughout the book, he marks off opportunities for active learning with an icon showing the two tools necessary to complete each task: a sharpened pencil and a blank sheet of paper. 

    “Many of us in the age of instant Google searches have lost the ability — or perhaps the patience — to undertake multistep problems,” he argues.

    Model thinkers, on the other hand, understand and remember solutions better for having thought through the steps, and can better apply what they’ve learned to future problems. Larson’s “homework” is to do critical thinking, not just read about it. By working through thought experiments and scenarios, readers can achieve a deeper understanding of concepts like selection bias, random incidence, and orders of magnitude, all of which can present counterintuitive examples to the uninitiated.

    For Larson, who jokes that he is “an evangelist for models,” there is no better way to learn than by doing — except perhaps to teach. “Teaching a difficult topic is our best way to learn it ourselves, is an unselfish act, and bonds the teacher and learner,” he writes.

    In his long career as an educator and education advocate, Larson says he has always remained a learner himself. His love for learning illuminates every page of “Model Thinking,” which he hopes will provide others with the enjoyment and satisfaction that comes from learning new things and solving complex problems.

    “You will learn how to learn,” Larson says. “And you will enjoy it!”

  • in

    A new way to integrate data with physical objects

    To get a sense of what StructCode is all about, says Mustafa Doğa Doğan, think of Superman. Not the “faster than a speeding bullet” and “more powerful than a locomotive” version, but a Superman, or Superwoman, who sees the world differently from ordinary mortals — someone who can look around a room and glean all kinds of information about ordinary objects that is not apparent to people with less penetrating faculties.

    That, in a nutshell, is “the high-level idea behind StructCode,” explains Doğan, a PhD student in electrical engineering and computer science at MIT and an affiliate of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). “The goal is to change the way we interact with objects” — to make those interactions more meaningful and more meaning-laden — “by embedding information into objects in ways that can be readily accessed.”

    StructCode grew out of an effort called InfraredTags, which Doğan and other colleagues introduced in 2022. That work, as well as the current project, was carried out in the laboratory of MIT Associate Professor Stefanie Mueller, Doğan’s advisor, who has taken part in both projects. In last year’s approach, “invisible” tags, which can only be seen with cameras capable of detecting infrared light, were used to reveal information about physical objects. The drawback there was that many cameras cannot perceive infrared light. Moreover, the method for fabricating these objects and affixing the tags to their surfaces relied on 3D printers, which tend to be very slow and often can only make objects that are small.

    StructCode, at least in its original version, relies on objects produced with laser-cutting techniques that can be manufactured within minutes, rather than the hours it might take on a 3D printer. Information can be extracted from these objects, moreover, with the RGB cameras that are commonly found in smartphones; the ability to operate in the infrared range of the spectrum is not required.

    In their initial demonstrations of the idea, the MIT-led team decided to construct their objects out of wood, making pieces such as furniture, picture frames, flowerpots, or toys that are well suited to laser-cut fabrication. A key question that had to be resolved was this: How can information be stored in a way that is unobtrusive and durable, as compared to externally-attached bar codes and QR codes, and also will not undermine an object’s structural integrity?

    The solution that the team has come up with, for now, is to rely on joints, which are ubiquitous in wooden objects made out of more than one component. Perhaps the most familiar is the finger joint, which has a kind of zigzag pattern whereby two wooden pieces adjoin at right angles such that every protruding “finger” along the joint of the first piece fits into a corresponding “gap” in the joint of the second piece and, similarly, every gap in the joint of the first piece is filled with a finger from the second.

    “Joints have these repeating features, which are like repeating bits,” Doğan says. To create a code, the researchers slightly vary the length of the gaps or fingers. A standard size length is accorded a 1. A slightly shorter length is assigned a 0, and a slightly longer length is assigned a 2. The encoding scheme is based on the sequence of these numbers, or bits, that can be observed along a joint. For every string of four bits, there are 81 (3^4) possible variations.
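
    Since each finger or gap takes one of three values, a string of four works like a four-digit base-3 number, which is where the count of 81 comes from. The sketch below shows that arithmetic only: the millimeter lengths and function names are invented, and the real StructCode encoding may differ.

```python
LENGTH_MM = {0: 9.0, 1: 10.0, 2: 11.0}   # hypothetical finger lengths per digit


def to_digits(value, width=4):
    """Write `value` as `width` base-3 digits, most significant first."""
    if not 0 <= value < 3 ** width:
        raise ValueError("value does not fit in the given number of digits")
    digits = []
    for _ in range(width):
        value, digit = divmod(value, 3)
        digits.append(digit)
    return digits[::-1]


def from_digits(digits):
    """Recover the integer from its base-3 digits."""
    value = 0
    for d in digits:
        value = value * 3 + d
    return value


digits = to_digits(50)                       # 50 = 1*27 + 2*9 + 1*3 + 2
lengths = [LENGTH_MM[d] for d in digits]     # what the laser cutter would vary
```

    Here the value 50 becomes the digits 1, 2, 1, 2, so the four fingers would be cut standard, longer, standard, longer. Reading the lengths back and calling `from_digits` recovers 50.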

    The team also demonstrated ways of encoding messages in “living hinges” — a kind of joint that is made by taking a flat, rigid piece of material and making it bendable by cutting a series of parallel, vertical lines. As with the finger joints, the distance between these lines can be varied: 1 being the standard length, 0 being a slightly shorter length, and 2 being slightly longer. And in this way, a code can be assembled from an object that contains a living hinge.

    The idea is described in a paper, “StructCode: Leveraging Fabrication Artifacts to Store Data in Laser-Cut Objects,” that was presented this month at the 2023 ACM Symposium on Computational Fabrication in New York City. Doğan, the paper’s first author, is joined by Mueller and four coauthors — recent MIT alumna Grace Tang ’23, MNG ’23; MIT undergraduate Richard Qi; University of California at Berkeley graduate student Vivian Hsinyueh Chan; and Cornell University Assistant Professor Thijs Roumen.

    “In the realm of materials and design, there is often an inclination to associate novelty and innovation with entirely new materials or manufacturing techniques,” notes Elvin Karana, a professor of materials innovation and design at the Delft University of Technology. One of the things that impresses Karana most about StructCode is that it provides a novel means of storing data by “applying a commonly used technique like laser cutting and a material as ubiquitous as wood.”

    The idea for StructCode, adds University of Colorado computer scientist Ellen Yi-Luen Do, is “simple, elegant, and totally makes sense. It’s like having the Rosetta Stone to help decipher Egyptian hieroglyphs.”

    Patrick Baudisch, a computer scientist at the Hasso Plattner Institute in Germany, views StructCode as “a great step forward for personal fabrication. It takes a key piece of functionality that’s only offered today for mass-produced goods and brings it to custom objects.”

    Here, in brief, is how it works: First, a laser cutter, guided by a model created via StructCode, fabricates an object into which encoded information has been embedded. After downloading a StructCode app, a user can decode the hidden message by pointing a cellphone camera at the object; the camera, aided by StructCode software, detects subtle variations in length in the object’s outward-facing joints or living hinges.

    The process is even easier if the user is equipped with augmented reality glasses, Doğan says. “In that case, you don’t need to point a camera. The information comes up automatically.” And that can give people more of the “superpowers” that the designers of StructCode hope to confer.

    “The object doesn’t need to contain a lot of information,” Doğan adds. “Just enough — in the form of, say, URLs — to direct people to places they can find out what they need to know.”

    Users might be sent to a website where they can obtain information about the object — how to care for it, and perhaps eventually how to disassemble it and recycle (or safely dispose of) its contents. A flowerpot that was made with living hinges might inform a user, based on records that are maintained online, as to when the plant inside the pot was last watered and when it needs to be watered again. Children examining a toy crocodile could, through StructCode, learn scientific details about various parts of the animal’s anatomy. A picture frame made with finger joints modified by StructCode could help people find out about the painting inside the frame and about the person (or persons) who created the artwork — perhaps linking to a video of an artist talking about this work directly.

    “This technique could pave the way for new applications, such as interactive museum exhibits,” says Raf Ramakers, a computer scientist at Hasselt University in Belgium. “It holds the potential for broadening the scope of how we perceive and interact with everyday objects” — which is precisely the goal that motivates the work of Doğan and his colleagues.

    But StructCode is not the end of the line, as far as Doğan and his collaborators are concerned. The same general approach could be adapted to other manufacturing techniques besides laser cutting, and information storage doesn’t have to be confined to the joints of wooden objects. Data could be represented, for instance, in the texture of leather, within the pattern of woven or knitted pieces, or concealed by other means within an image. Doğan is excited by the breadth of available options and by the fact that their “explorations into this new realm of possibilities, designed to make objects and our world more interactive, are just beginning.”

  • in

    AI copilot enhances human precision for safer aviation

    Imagine you’re in an airplane with two pilots, one human and one computer. Both have their “hands” on the controllers, but they’re always looking out for different things. If they’re both paying attention to the same thing, the human gets to steer. But if the human gets distracted or misses something, the computer quickly takes over.

    Meet the Air-Guardian, a system developed by researchers at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). As modern pilots grapple with an onslaught of information from multiple monitors, especially during critical moments, Air-Guardian acts as a proactive copilot: a partnership between human and machine, rooted in understanding attention.

    But how does it determine attention, exactly? For humans, it uses eye-tracking, and for the neural system, it relies on something called “saliency maps,” which pinpoint where attention is directed. The maps serve as visual guides highlighting key regions within an image, aiding in grasping and deciphering the behavior of intricate algorithms. Air-Guardian identifies early signs of potential risks through these attention markers, instead of only intervening during safety breaches like traditional autopilot systems. 
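
    The handover idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the optimization-based cooperative layer from the paper: the two attention maps are assumed to be small grids of non-negative weights (one from eye tracking, one from a saliency map), the agreement measure is a plain histogram intersection, and the function names and the 0.5 threshold are hypothetical.

```python
def normalize(attention_map):
    """Scale a 2D grid of non-negative weights so that it sums to 1."""
    total = sum(sum(row) for row in attention_map)
    if total <= 0:
        raise ValueError("attention map must contain positive weight")
    return [[value / total for value in row] for row in attention_map]


def attention_overlap(human_map, machine_map):
    """Histogram intersection of two attention maps, in the range 0..1."""
    human = normalize(human_map)
    machine = normalize(machine_map)
    return sum(
        min(h, m)
        for human_row, machine_row in zip(human, machine)
        for h, m in zip(human_row, machine_row)
    )


def choose_controller(human_map, machine_map, threshold=0.5):
    """Return "human" when attention agrees, otherwise "guardian".

    The threshold is an illustrative assumption, not a value from the paper.
    """
    overlap = attention_overlap(human_map, machine_map)
    return "human" if overlap >= threshold else "guardian"
```

    When both maps concentrate on the same cell, the overlap is 1 and the human keeps control; when they concentrate on different cells, the overlap drops to 0 and the sketch hands control to the guardian.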

    The broader implications of this system reach beyond aviation. Similar cooperative control mechanisms could one day be used in cars, drones, and a wider spectrum of robotics.

    “An exciting feature of our method is its differentiability,” says MIT CSAIL postdoc Lianhao Yin, a lead author on a new paper about Air-Guardian. “Our cooperative layer and the entire end-to-end process can be trained. We specifically chose the causal continuous-depth neural network model because of its dynamic features in mapping attention. Another unique aspect is adaptability. The Air-Guardian system isn’t rigid; it can be adjusted based on the situation’s demands, ensuring a balanced partnership between human and machine.”

    In field tests, both the pilot and the system made decisions based on the same raw images when navigating to the target waypoint. Air-Guardian’s success was gauged by the cumulative rewards earned during flight and by how short a path it took to the waypoint. The guardian reduced the risk level of flights and increased the success rate of navigating to target points.

    “This system represents the innovative approach of human-centric AI-enabled aviation,” adds Ramin Hasani, MIT CSAIL research affiliate and inventor of liquid neural networks. “Our use of liquid neural networks provides a dynamic, adaptive approach, ensuring that the AI doesn’t merely replace human judgment but complements it, leading to enhanced safety and collaboration in the skies.”

    The true strength of Air-Guardian is its foundational technology. It combines an optimization-based cooperative layer, which draws on visual attention from both human and machine, with liquid closed-form continuous-time (CfC) neural networks, known for their prowess in deciphering cause-and-effect relationships, to analyze incoming images for vital information. Complementing this is the VisualBackProp algorithm, which identifies the system’s focal points within an image, ensuring clear understanding of its attention maps.

    For future mass adoption, there’s a need to refine the human-machine interface. Feedback suggests an indicator, like a bar, might be more intuitive to signify when the guardian system takes control.

    Air-Guardian heralds a new age of safer skies, offering a reliable safety net for those moments when human attention wavers.

    “The Air-Guardian system highlights the synergy between human expertise and machine learning, furthering the objective of using machine learning to augment pilots in challenging scenarios and reduce operational errors,” says Daniela Rus, the Andrew (1956) and Erna Viterbi Professor of Electrical Engineering and Computer Science at MIT, director of CSAIL, and senior author on the paper.

    “One of the most interesting outcomes of using a visual attention metric in this work is the potential for allowing earlier interventions and greater interpretability by human pilots,” says Stephanie Gil, assistant professor of computer science at Harvard University, who was not involved in the work. “This showcases a great example of how AI can be used to work with a human, lowering the barrier for achieving trust by using natural communication mechanisms between the human and the AI system.”

    This research was partially funded by the U.S. Air Force (USAF) Research Laboratory, the USAF Artificial Intelligence Accelerator, the Boeing Co., and the Office of Naval Research. The findings don’t necessarily reflect the views of the U.S. government or the USAF.

  • in

    Improving accessibility of online graphics for blind users

    The beauty of a nice infographic published alongside a news or magazine story is that it makes numeric data more accessible to the average reader. But for blind and visually impaired users, such graphics often have the opposite effect.

    For visually impaired users — who frequently rely on screen-reading software that speaks words or numbers aloud as the user moves a cursor across the screen — a graphic may be nothing more than a few words of alt text, such as a chart’s title. For instance, a map of the United States displaying population rates by county might have alt text in the HTML that says simply, “A map of the United States with population rates by county.” The data has been buried in an image, making it entirely inaccessible.

    “Charts have these various visual features that, as a [sighted] reader, you can shift your attention around, look at high-level patterns, look at individual data points, and you can do this on the fly,” says Jonathan Zong, a 2022 MIT Morningside Academy for Design (MAD) Fellow and PhD student in computer science, who points out that even when a graphic includes alt text that interprets the data, the visually impaired user must accept the findings as presented.

    “If you’re [blind and] using a screen reader, the text description imposes a linear predefined reading order. So, you’re beholden to the decisions that the person who wrote the text made about what information was important to include.”

    While some graphics do include data tables that a screen reader can read, navigating them requires the user to remember all the data from each row and column as they move on to the next one. According to the National Federation of the Blind, Zong says, there are 7 million people living in the United States with visual disabilities, and nearly 97 percent of top-level pages on the internet are not accessible to screen readers. The problem, he points out, is an especially difficult one for blind researchers to get around. Some researchers with visual impairments rely on a sighted collaborator to read and help interpret graphics in peer-reviewed research.

    Working with the Visualization Group at the Computer Science and Artificial Intelligence Lab (CSAIL) on a project led by Associate Professor Arvind Satyanarayan that includes Daniel Hajas, a blind researcher and innovation manager at the Global Disability Innovation Hub in England, Zong and others have written an open-source JavaScript program named Olli that addresses this problem when it’s included on a website. Olli can move from a big-picture analysis of a chart down to the finest grain of detail, letting the user select the degree of granularity that interests them.

    “We want to design richer screen-reader experiences for visualization with a hierarchical structure, multiple ways to navigate, and descriptions at varying levels of granularity to provide self-guided, open-ended exploration for the user.”
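
    One way to picture the hierarchical structure Zong describes is a tree of text descriptions that a screen reader could walk: a chart-level summary at the root, a summary per group of points below it, and the individual data points as leaves. The sketch below is an assumption-laden illustration in Python, not Olli's actual data model or API; the function name, the fixed-size grouping, and the wording of the descriptions are all hypothetical.

```python
def build_description_tree(title, x_label, y_label, points, group_size):
    """Build a three-level tree: chart summary, group summaries, data points.

    `points` is a list of (x, y) pairs. Each node is a dict with a "text"
    description and a list of "children" nodes.
    """
    ordered = sorted(points)
    root = {"text": f"{title}. {len(ordered)} data points.", "children": []}
    for start in range(0, len(ordered), group_size):
        group = ordered[start:start + group_size]
        xs = [x for x, _ in group]
        ys = [y for _, y in group]
        node = {
            "text": f"{x_label} {xs[0]} to {xs[-1]}: "
                    f"{y_label} from {min(ys)} to {max(ys)}.",
            "children": [
                {"text": f"{x_label} {x}, {y_label} {y}.", "children": []}
                for x, y in group
            ],
        }
        root["children"].append(node)
    return root
```

    A reader could stop at the root for the big picture or descend into any group for individual values, which is the kind of self-guided exploration the quote describes.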

    Next steps with Olli are incorporating multi-sensory software to integrate text and visuals with sound, such as having a musical note that moves up or down the harmonic scale to indicate the direction of data on a linear graph, and possibly even developing tactile interpretations of data. Like most of the MAD Fellows, Zong integrates his science and engineering skills with design and art to create solutions to real-world problems affecting individuals. He’s been recognized for his work in both the visual arts and computer science. He holds undergraduate degrees in computer science and visual arts with a focus on graphic design from Princeton University, where his research was on the ethics of data collection.

    “The throughline is the idea that design can help us make progress on really tough social and ethical questions,” Zong says, calling software for accessible data visualization an “intellectually rich area for design.” “We’re thinking about ways to translate charts and graphs into text descriptions that can get read aloud as speech, or thinking about other kinds of audio mappings to sonify data, and we’re even exploring some tactile methods to understand data,” he says.
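
    The audio mapping Zong mentions can be illustrated with a small sketch that turns a data series into musical pitches, so that rising data is heard as rising notes. It assumes MIDI note numbers as the pitch representation and a one-octave range starting at middle C; both choices, and the function names, are hypothetical and not taken from Olli. The frequency formula is the standard equal-temperament conversion with A4 (note 69) at 440 Hz.

```python
def value_to_midi(value, low, high, lowest_note=60, highest_note=72):
    """Map a value in [low, high] linearly onto a range of MIDI notes."""
    if high == low:
        return lowest_note  # flat data: every point gets the same note
    fraction = (value - low) / (high - low)
    return lowest_note + round(fraction * (highest_note - lowest_note))


def midi_to_frequency(note):
    """Convert a MIDI note number to a frequency in hertz (A4 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def sonify(values):
    """Return one MIDI note per data point, scaled to the data's own range."""
    low, high = min(values), max(values)
    return [value_to_midi(value, low, high) for value in values]
```

    The resulting note list would still need to be handed to an audio library for playback, which is left out here.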

    “I get really excited about design when it’s a way to both create things that are useful to people in everyday life and also make progress on larger conversations about technology and society. I think working in accessibility is a great way to do that.”

    Another problem at the intersection of technology and society is the ethics of taking user data from social media for large-scale studies without the users’ awareness. While working as a summer graduate research fellow at Cornell’s Citizens and Technology Lab, Zong helped create an open-source tool called Bartleby that can be used in large anonymous data research studies. After researchers collect data, but before analysis, Bartleby would automatically send an email message to every user whose data was included, alert them to that fact, and offer them the choice to review the resulting data table and opt out of the study. Bartleby was honored in the student category of Fast Company’s Innovation by Design Awards for 2022. In November the same year, Forbes magazine named Jonathan Zong in its Forbes 30 Under 30 in Science 2023 list for his work in data visualization accessibility.

    The underlying theme to all Zong’s work is the exploration of autonomy and agency, even in his artwork, which is heavily inclusive of text and semiotic play. In “Public Display,” he created a handmade digital display font by erasing parts of celebrity faces that were taken from a facial recognition dataset. The piece was exhibited in 2020 in MIT’s Wiesner Gallery, and received the third-place prize in the MIT Schnitzer Prize in the Visual Arts that year. The work deals not only with the neurological aspects of distinguishing faces from typefaces, but also with the implications for erasing individuals’ identities through the practice of using facial recognition programs that often target individuals in communities of color in unfair ways. Another of his works, “Biometric Sans,” a typography system that stretches letters based on a person’s typing speed, will be included in a show at the Harvard Science Center sometime next fall.

    “MAD, particularly the large events MAD jointly hosted, played a really important function in showing the rest of MIT that this is the kind of work we value. This is what design can look like and is capable of doing. I think it all contributes to that culture shift where this kind of interdisciplinary work can be valued, recognized, and serve the public.

    “There are shared ideas around embodiment and representation that tie these different pursuits together for me,” Zong says. “In the ethics work, and the art on surveillance, I’m thinking about whether data collectors are representing people the way they want to be seen through data. And similarly, the accessibility work is about whether we can make systems that are flexible to the way people want to use them.”

  • in

    A more effective experimental design for engineering a cell into a new state

    A strategy for cellular reprogramming involves using targeted genetic interventions to engineer a cell into a new state. The technique holds great promise in immunotherapy, for instance, where researchers could reprogram a patient’s T-cells so they are more potent cancer killers. Someday, the approach could also help identify life-saving cancer treatments or regenerative therapies that repair disease-ravaged organs.

    But the human body has about 20,000 genes, and a genetic perturbation could be on a combination of genes or on any of the over 1,000 transcription factors that regulate the genes. Because the search space is vast and genetic experiments are costly, scientists often struggle to find the ideal perturbation for their particular application.   

    Researchers from MIT and Harvard University developed a new computational approach that can efficiently identify optimal genetic perturbations based on a much smaller number of experiments than traditional methods.

    Their algorithmic technique leverages the cause-and-effect relationship between factors in a complex system, such as genome regulation, to prioritize the best intervention in each round of sequential experiments.

    The researchers conducted a rigorous theoretical analysis to determine that their technique did, indeed, identify optimal interventions. With that theoretical framework in place, they applied the algorithms to real biological data designed to mimic a cellular reprogramming experiment. Their algorithms proved more efficient and effective than the baseline methods they were compared against.

    “Too often, large-scale experiments are designed empirically. A careful causal framework for sequential experimentation may allow identifying optimal interventions with fewer trials, thereby reducing experimental costs,” says co-senior author Caroline Uhler, a professor in the Department of Electrical Engineering and Computer Science (EECS) who is also co-director of the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard, and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS) and Institute for Data, Systems and Society (IDSS).

    Joining Uhler on the paper, which appears today in Nature Machine Intelligence, are lead author Jiaqi Zhang, a graduate student and Eric and Wendy Schmidt Center Fellow; co-senior author Themistoklis P. Sapsis, professor of mechanical and ocean engineering at MIT and a member of IDSS; and others at Harvard and MIT.

    Active learning

    When scientists try to design an effective intervention for a complex system, like in cellular reprogramming, they often perform experiments sequentially. Such settings are ideally suited for the use of a machine-learning approach called active learning. Data samples are collected and used to learn a model of the system that incorporates the knowledge gathered so far. From this model, an acquisition function is designed — an equation that evaluates all potential interventions and picks the best one to test in the next trial.

    This process is repeated until an optimal intervention is identified (or resources to fund subsequent experiments run out).
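
    The loop described above can be sketched as follows. This is a generic toy version of active learning, not the researchers' algorithm: interventions are assumed to be numbers, the "model" is simply the outcome of the nearest tested intervention, and the acquisition function (closeness of the predicted outcome to the target, plus a bonus for untested regions) is a hypothetical choice. The function name and the `exploration_weight` parameter are likewise illustrative.

```python
def run_active_learning(candidates, run_experiment, target, budget,
                        exploration_weight=0.5):
    """Sequentially choose interventions until the budget is spent.

    Returns the best tested intervention and all observed outcomes.
    """
    # Seed the model with one experiment on the first candidate.
    observed = {candidates[0]: run_experiment(candidates[0])}

    def acquisition(candidate):
        # Toy model: predict the outcome of the nearest tested intervention.
        nearest = min(observed, key=lambda tested: abs(tested - candidate))
        predicted = observed[nearest]
        distance = abs(nearest - candidate)
        # Favor predictions near the target, with a bonus for unexplored areas.
        return -abs(predicted - target) + exploration_weight * distance

    for _ in range(budget - 1):
        untested = [c for c in candidates if c not in observed]
        if not untested:
            break  # nothing left to try
        chosen = max(untested, key=acquisition)
        observed[chosen] = run_experiment(chosen)  # updates the model

    best = min(observed, key=lambda tested: abs(observed[tested] - target))
    return best, observed
```

    Each pass through the loop is one round of the sequential experiment: score every untested intervention, run the highest-scoring one, and fold the result back into the model before the next round.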

    “While there are several generic acquisition functions to sequentially design experiments, these are not effective for problems of such complexity, leading to very slow convergence,” Sapsis explains.

    Acquisition functions typically consider correlation between factors, such as which genes are co-expressed. But focusing only on correlation ignores the regulatory relationships or causal structure of the system. For instance, a genetic intervention can only affect the expression of downstream genes, but a correlation-based approach would not be able to distinguish between genes that are upstream or downstream.

    “You can learn some of this causal knowledge from the data and use that to design an intervention more efficiently,” Zhang explains.

    The MIT and Harvard researchers leveraged this underlying causal structure for their technique. First, they carefully constructed an algorithm so it can only learn models of the system that account for causal relationships.

    Then the researchers designed the acquisition function so it automatically evaluates interventions using information on these causal relationships. They crafted this function so it prioritizes the most informative interventions, meaning those most likely to lead to the optimal intervention in subsequent experiments.

    “By considering causal models instead of correlation-based models, we can already rule out certain interventions. Then, whenever you get new data, you can learn a more accurate causal model and thereby further shrink the space of interventions,” Uhler explains.
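
    A small example shows how causal structure can rule out interventions before any experiment is run. It assumes the regulatory network is already known and given as a directed acyclic graph mapping each gene to the genes it regulates; the gene names and function names are hypothetical, and the pruning rule (keep only target genes and their upstream regulators) is a simplification for illustration rather than the method in the paper.

```python
def descendants(graph, gene):
    """Return every gene downstream of `gene` in the regulatory graph."""
    seen = set()
    stack = [gene]
    while stack:
        current = stack.pop()
        for child in graph.get(current, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def candidate_interventions(graph, target_genes):
    """Keep only genes that are targets or lie upstream of a target.

    An intervention can only change expression downstream of it, so any
    gene with no causal path to a target cannot help and is ruled out.
    """
    targets = set(target_genes)
    genes = set(graph) | {child for kids in graph.values() for child in kids}
    return sorted(
        gene for gene in genes
        if gene in targets or descendants(graph, gene) & targets
    )
```

    With the graph `{"A": ["B"], "B": ["C"], "D": ["E"]}` and target gene `"C"`, only `A`, `B`, and `C` remain as candidates. A purely correlation-based view would have no principled way to discard `D` and `E`.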

    This smaller search space, coupled with the acquisition function’s special focus on the most informative interventions, is what makes their approach so efficient.

    The researchers further improved their acquisition function using a technique known as output weighting, inspired by the study of extreme events in complex systems. This method carefully emphasizes interventions that are likely to be closer to the optimal intervention.

    “Essentially, we view an optimal intervention as an ‘extreme event’ within the space of all possible, suboptimal interventions and use some of the ideas we have developed for these problems,” Sapsis says.    

    Enhanced efficiency

    They tested their algorithms using real biological data in a simulated cellular reprogramming experiment. For this test, they sought a genetic perturbation that would result in a desired shift in average gene expression. Their acquisition functions consistently identified better interventions than baseline methods through every step in the multi-stage experiment.

    “If you cut the experiment off at any stage, ours would still be more efficient than the baselines. This means you could run fewer experiments and get the same or better results,” Zhang says.

    The researchers are currently working with experimentalists to apply their technique toward cellular reprogramming in the lab.

    Their approach could also be applied to problems outside genomics, such as identifying optimal prices for consumer products or enabling optimal feedback control in fluid mechanics applications.

    In the future, they plan to enhance their technique for optimizations beyond those that seek to match a desired mean. In addition, their method assumes that scientists already understand the causal relationships in their system, but future work could explore how to use AI to learn that information, as well.

    This work was funded, in part, by the Office of Naval Research, the MIT-IBM Watson AI Lab, the MIT J-Clinic for Machine Learning and Health, the Eric and Wendy Schmidt Center at the Broad Institute, a Simons Investigator Award, the Air Force Office of Scientific Research, and a National Science Foundation Graduate Fellowship.

  • in

    MIT welcomes nine MLK Visiting Professors and Scholars for 2023-24

    Established in 1990, the MLK Visiting Professors and Scholars Program at MIT welcomes outstanding scholars to the Institute for visiting appointments. MIT aspires to attract candidates who are, in the words of Martin Luther King Jr., “trailblazers in human, academic, scientific and religious freedom.” The program honors King’s life and legacy by expanding and extending the reach of our community. 

    The MLK Scholars Program has welcomed more than 140 professors, practitioners, and professionals at the forefront of their respective fields to MIT. They contribute to the growth and enrichment of the community through their interactions with students, staff, and faculty. They pay tribute to Martin Luther King Jr.’s life and legacy of service and social justice, and they embody MIT’s values: excellence and curiosity, openness and respect, and belonging and community.  

    Each new cohort of scholars actively participates in community engagement and supports MIT’s mission of “advancing knowledge and educating students in science, technology, and other areas of scholarship that will best serve the nation and the world in the 21st century.” 

    The 2023-2024 MLK Scholars:

    Tawanna Dillahunt is an associate professor at the University of Michigan’s School of Information with a joint appointment in their electrical engineering and computer science department. She is joining MIT at the end of a one-year visiting appointment as a Harvard Radcliffe Fellow. Her faculty hosts at the Institute are Catherine D’Ignazio in the Department of Urban Studies and Planning and Fotini Christia in the Institute for Data, Systems, and Society (IDSS). Dillahunt’s research focuses on equitable and inclusive computing. During her appointment, she will host a podcast to explore ethical and socially responsible ways to engage with communities, with a special emphasis on technology. 

    Kwabena Donkor is an assistant professor of marketing at Stanford Graduate School of Business; he is hosted by Dean Eckles, an associate professor of marketing at MIT Sloan School of Management. Donkor’s work bridges economics, psychology, and marketing. His scholarship combines insights from behavioral economics with data and field experiments to study social norms, identity, and how these constructs interact with policy in the marketplace.

    Denise Frazier joins MIT from Tulane University, where she is an assistant director in the New Orleans Center for the Gulf South. She is a researcher and performer and brings a unique interdisciplinary approach to her work at the intersection of cultural studies, environmental justice, and music. Frazier is hosted by Christine Ortiz, the Morris Cohen Professor in the Department of Materials Science and Engineering. 

    Wasalu Jaco, an accomplished performer and artist, is renewing his appointment at MIT for a second year; he is hosted jointly by Nick Montfort, a professor of digital media in the Comparative Media Studies Program/Writing, and Mary Fuller, a professor in the Literature Section and the current chair of the MIT faculty. In his second year, Jaco will work on Cyber/Cypher Rapper, a research project to develop a computational system that participates in responsive and improvisational rap.

    Morgane Konig first joined the Center for Theoretical Physics at MIT in December 2021 as a postdoc. Now a member of the 2023–24 MLK Visiting Scholars Program cohort, she will deepen her ties with scholars and research groups working in cosmology, primarily on early-universe inflation and late-universe signatures that could enable the scientific community to learn more about the mysterious nature of dark matter and dark energy. Her faculty hosts are David Kaiser, the Germeshausen Professor of the History of Science and professor of physics, and Alan Guth, the Victor F. Weisskopf Professor of Physics, both from the Department of Physics.

    The former minister of culture for Colombia and a transformational leader dedicated to environmental protection, Angelica Mayolo-Obregon joins MIT from Buenaventura, Colombia. During her time at MIT, she will serve as an advisor and guest speaker, and help MIT facilitate gatherings of environmental leaders committed to addressing climate action and conserving biodiversity across the Americas, with a special emphasis on Afro-descendant communities. Mayolo-Obregon is hosted by John Fernandez, a professor of building technology in the Department of Architecture and director of MIT’s Environmental Solutions Initiative, and by J. Phillip Thompson, an associate professor in the Department of Urban Studies and Planning (and a former MLK Scholar).

    Jean-Luc Pierite is a member of the Tunica-Biloxi Tribe of Louisiana and the president of the board of directors of North American Indian Center of Boston. While at MIT, Pierite will build connections between MIT and the local Indigenous communities. His research focuses on enhancing climate resilience planning by infusing Indigenous knowledge and ecological practices into scientific and other disciplines. His faculty host is Janelle Knox-Hayes, the Lister Brothers Professor of Economic Geography and Planning in the Department of Urban Studies and Planning.

    Christine Taylor-Butler ’81 is a children’s book author who has written over 90 books; she is hosted by Graham Jones, an associate professor of anthropology. An advocate for literacy and STEAM education in underserved urban and rural schools, Taylor-Butler will partner with community organizations in the Boston area. She is also completing the fourth installment of her middle-grade series, “The Lost Tribe.” These books follow a team of five kids as they use science and technology to crack codes and solve mysteries.

    Angelino Viceisza, a professor of economics at Spelman College, joins MIT Sloan as an MLK Visiting Professor and the Phyllis Wallace Visiting Professor; he is hosted by Robert Gibbons, Sloan Distinguished Professor of Management, and Ray Reagans, Alfred P. Sloan Professor of Management, professor of organization studies, and associate dean for diversity, equity, and inclusion at MIT Sloan. Viceisza has strong, ongoing connections with MIT. His research focuses on remittances, retirement, and household finance in low-income countries and is relevant to public finance and financial economics, as well as the development and organizational economics communities at MIT. 

    Javit Drake, Moriba Jah, and Louis Massiah, members of last year’s cohort of MLK Scholars, will remain at MIT through the end of 2023.

    There are multiple opportunities throughout the year to meet our MLK Visiting Scholars and learn more about their research projects and their social impact. 

    For more information about the MLK Visiting Professors and Scholars Program and upcoming events, visit the website.