More stories

  • in

    New software enables blind and low-vision users to create interactive, accessible charts

    A growing number of tools enable users to make online data representations, like charts, that are accessible for people who are blind or have low vision. However, most tools require an existing visual chart that can then be converted into an accessible format.

    This creates barriers that prevent blind and low-vision users from building their own custom data representations, and it can limit their ability to explore and analyze important information.

    A team of researchers from MIT and University College London (UCL) wants to change the way people think about accessible data representations.

    They created a software system called Umwelt (which means “environment” in German) that can enable blind and low-vision users to build customized, multimodal data representations without needing an initial visual chart.

    Umwelt, an authoring environment designed for screen-reader users, incorporates an editor that allows someone to upload a dataset and create a customized representation, such as a scatterplot, that can include three modalities: visualization, textual description, and sonification. Sonification involves converting data into nonspeech audio.

    The system, which can represent a variety of data types, includes a viewer that enables a blind or low-vision user to interactively explore a data representation, seamlessly switching between each modality to interact with data in a different way.

    The researchers conducted a study with five expert screen-reader users who found Umwelt to be useful and easy to learn. In addition to offering an interface that empowered them to create data representations — something they said was sorely lacking — the users said Umwelt could facilitate communication between people who rely on different senses.

    “We have to remember that blind and low-vision people aren’t isolated. They exist in these contexts where they want to talk to other people about data,” says Jonathan Zong, an electrical engineering and computer science (EECS) graduate student and lead author of a paper introducing Umwelt. “I am hopeful that Umwelt helps shift the way that researchers think about accessible data analysis. Enabling the full participation of blind and low-vision people in data analysis involves seeing visualization as just one piece of this bigger, multisensory puzzle.”

    Joining Zong on the paper are fellow EECS graduate students Isabella Pedraza Pineros and Mengzhu “Katie” Chen; Daniel Hajas, a UCL researcher who works with the Global Disability Innovation Hub; and senior author Arvind Satyanarayan, associate professor of computer science at MIT who leads the Visualization Group in the Computer Science and Artificial Intelligence Laboratory. The paper will be presented at the ACM Conference on Human Factors in Computing Systems.

    De-centering visualization

    The researchers previously developed interactive interfaces that provide a richer experience for screen-reader users as they explore accessible data representations. Through that work, they realized most tools for creating such representations involve converting existing visual charts.

    Aiming to decenter visual representations in data analysis, Zong and Hajas, who lost his sight at age 16, began co-designing Umwelt more than a year ago.

    At the outset, they realized they would need to rethink how to represent the same data using visual, auditory, and textual forms.

    “We had to put a common denominator behind the three modalities. By creating this new language for representations, and making the output and input accessible, the whole is greater than the sum of its parts,” says Hajas.

    To build Umwelt, they first considered what is unique about the way people use each sense.

    For instance, a sighted user can see the overall pattern of a scatterplot and, at the same time, move their eyes to focus on different data points. But for someone listening to a sonification, the experience is linear since data are converted into tones that must be played back one at a time.

    “If you are only thinking about directly translating visual features into nonvisual features, then you miss out on the unique strengths and weaknesses of each modality,” Zong adds.

    They designed Umwelt to offer flexibility, enabling a user to switch between modalities easily when one would better suit their task at a given time.

    To use the editor, one uploads a dataset to Umwelt, which employs heuristics to automatically create default representations in each modality.

    If the dataset contains stock prices for companies, Umwelt might generate a multiseries line chart, a textual structure that groups data by ticker symbol and date, and a sonification that uses tone length to represent the price for each date, arranged by ticker symbol.
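
    As a rough illustration of the sonification described above, the sketch below maps each price to a tone length and orders the tones by ticker symbol, then date. It is not Umwelt's code: the function name, the `Tone` record, the duration range, and the sample rows are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Tone:
    ticker: str
    date: str
    duration_s: float  # tone length in seconds encodes the price


def sonify_by_duration(rows, min_s=0.1, max_s=1.0):
    """Map each price to a tone length, arranged by ticker symbol, then date.

    Hypothetical helper: prices are scaled linearly into [min_s, max_s].
    """
    prices = [r["price"] for r in rows]
    lo, hi = min(prices), max(prices)
    span = hi - lo
    tones = []
    for r in sorted(rows, key=lambda r: (r["ticker"], r["date"])):
        # If every price is identical, fall back to the middle of the range.
        frac = 0.5 if span == 0 else (r["price"] - lo) / span
        tones.append(Tone(r["ticker"], r["date"], min_s + frac * (max_s - min_s)))
    return tones


# Made-up rows, standing in for an uploaded stock dataset.
rows = [
    {"ticker": "BBB", "date": "2024-01-02", "price": 30.0},
    {"ticker": "AAA", "date": "2024-01-02", "price": 20.0},
    {"ticker": "AAA", "date": "2024-01-01", "price": 10.0},
]
tones = sonify_by_duration(rows)
```

    Because the tones come out as an ordered list, playback is linear, one tone at a time, which matches the listening experience the researchers describe.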

    The default heuristics are intended to help the user get started.

    “In any kind of creative tool, you have a blank-slate effect where it is hard to know how to begin. That is compounded in a multimodal tool because you have to specify things in three different representations,” Zong says.

    The editor links interactions across modalities, so if a user changes the textual description, that information is adjusted in the corresponding sonification. Someone could utilize the editor to build a multimodal representation, switch to the viewer for an initial exploration, then return to the editor to make adjustments.

    Helping users communicate about data

    To test Umwelt, they created a diverse set of multimodal representations, from scatterplots to multiview charts, to ensure the system could effectively represent different data types. Then they put the tool in the hands of five expert screen-reader users.

    Study participants mostly found Umwelt to be useful for creating, exploring, and discussing data representations. One user said Umwelt was like an “enabler” that decreased the time it took them to analyze data. The users agreed that Umwelt could help them communicate about data more easily with sighted colleagues.

    “What stands out about Umwelt is its core philosophy of de-emphasizing the visual in favor of a balanced, multisensory data experience. Often, nonvisual data representations are relegated to the status of secondary considerations, mere add-ons to their visual counterparts. However, visualization is merely one aspect of data representation. I appreciate their efforts in shifting this perception and embracing a more inclusive approach to data science,” says JooYoung Seo, an assistant professor in the School of Information Sciences at the University of Illinois at Urbana-Champaign, who was not involved with this work.

    Moving forward, the researchers plan to create an open-source version of Umwelt that others can build upon. They also want to integrate tactile sensing into the software system as an additional modality, enabling the use of tools like refreshable tactile graphics displays.

    “In addition to its impact on end users, I am hoping that Umwelt can be a platform for asking scientific questions around how people use and perceive multimodal representations, and how we can improve the design beyond this initial step,” says Zong.

    This work was supported, in part, by the National Science Foundation and the MIT Morningside Academy for Design Fellowship.

  • in

    Q&A: How refusal can be an act of design

    This month in the ACM Journal on Responsible Computing, MIT graduate student Jonathan Zong SM ’20 and co-author J. Nathan Matias SM ’13, PhD ’17 of the Cornell Citizens and Technology Lab examine how the notion of refusal can open new avenues in the field of data ethics. In their open-access report, “Data Refusal From Below: A Framework for Understanding, Evaluating, and Envisioning Refusal as Design,” the pair proposes a framework in four dimensions to map how individuals can say “no” to technology misuses. At the same time, the researchers argue that just like design, refusal is generative, and has the potential to create alternate futures.

    Zong, a PhD candidate in electrical engineering and computer science, 2022-23 MIT Morningside Academy for Design Design Fellow, and member of the MIT Visualization Group, describes his latest work in this Q&A.

    Q: How do you define the concept of “refusal,” and where does it come from?

    A: Refusal was developed in feminist and Indigenous studies. It’s this idea of saying “no,” without being given permission to say “no.” Scholars like Ruha Benjamin write about refusal in the context of surveillance, race, and bioethics, and talk about it as a necessary counterpart to consent. Others, like the authors of the “Feminist Data Manifest-No,” think of refusal as something that can help us commit to building better futures.

    Benjamin illustrates cases where the choice to refuse is not equally possible for everyone, citing examples involving genetic data and refugee screenings in the U.K. The imbalance of power in these situations underscores the broader concept of refusal, extending beyond rejecting specific options to challenging the entire set of choices presented.

    Q: What inspired you to work on the notion of refusal as an act of design?

    A: In my work on data ethics, I’ve been thinking about how to incorporate processes into research data collection, particularly around consent and opt-out, with a focus on individual autonomy and the idea of giving people choices about the way that their data is used. But when it comes to data privacy, simply making choices available is not enough. Choices can be unequally available, or create no-win situations where all options are bad. This led me to the concept of refusal: questioning the authority of data collectors and challenging their legitimacy.

    The key idea of my work is that refusal is an act of design. I think of refusal as deliberate actions to redesign our socio-technical landscape by exerting some sort of influence. Like design, refusal is generative. Like design, it’s oriented towards creating alternate possibilities and alternate futures. Design is a process of exploring or traversing a space of possibility. Applying a design framework to cases of refusal drawn from scholarly and journalistic sources allowed me to establish a common language for talking about refusal and to imagine refusals that haven’t been explored yet.

    Q: What are the stakes around data privacy and data collection?

    A: The use of data for facial recognition surveillance in the U.S. is a big example we use in the paper. When people do everyday things like post on social media or walk past cameras in public spaces, they might be contributing their data to training facial recognition systems. For instance, a tech company may take photos from a social media site and build facial recognition that they then sell to the government. In the U.S., these systems are disproportionately used by police to surveil communities of color. It is difficult to apply concepts like consent and opt out of these processes, because they happen over time and involve multiple kinds of institutions. It’s also not clear that individual opt-out would do anything to change the overall situation. Refusal then becomes a crucial avenue, at both individual and community levels, to think more broadly of how affected people still exert some kind of voice or agency, without necessarily having an official channel to do so.

    Q: Why do you think these issues are more particularly affecting disempowered communities?

    A: People who are affected by technologies are not always included in the design process for those technologies. Refusal then becomes a meaningful expression of values and priorities for those who were not part of the early design conversations. Actions taken against technologies like face surveillance — be it legal battles against companies, advocacy for stricter regulations, or even direct action like disabling security cameras — may not fit the conventional notion of participating in a design process. And yet, these are the actions available to refusers who may be excluded from other forms of participation.

    I’m particularly inspired by the movement around Indigenous data sovereignty. Organizations like the First Nations Information Governance Centre work towards prioritizing Indigenous communities’ perspectives in data collection, and refuse inadequate representation in official health data from the Canadian government. I think this is a movement that exemplifies the potential of refusal, not only as a way to reject what’s being offered, but also as a means to propose a constructive alternative, very much like design. Refusal is not merely a negation, but a pathway to different futures.

    Q: Can you elaborate on the design framework you propose?

    A: Refusals vary widely across contexts and scales. Developing a framework for refusal is about helping people see actions that are seemingly very different as instances of the same broader idea. Our framework consists of four facets: autonomy, time, power, and cost.

    Consider the case of IBM creating a facial recognition dataset using people’s photos without consent. We saw multiple forms of refusal emerge in response. IBM allowed individuals to opt out by withdrawing their photos. People collectively refused by creating a class-action lawsuit against IBM. Around the same time, many U.S. cities started passing local legislation banning the government use of facial recognition. Evaluating these cases through the framework highlights commonalities and differences. The framework highlights varied approaches to autonomy, like individual opt-out and collective action. Regarding time, opt-outs and lawsuits react to past harm, while legislation might proactively prevent future harm. Power dynamics differ; withdrawing individual photos minimally influences IBM, while legislation could potentially cause longer-term change. And as for cost, individual opt-out seems less demanding, while other approaches require more time and effort, balanced against potential benefits.

    The framework facilitates case description and comparison across these dimensions. I think its generative nature encourages exploration of novel forms of refusal as well. By identifying the characteristics we want to see in future refusal strategies — collective, proactive, powerful, low-cost… — we can aspire to shape future approaches and change the behavior of data collectors. We may not always be able to combine all these criteria, but the framework provides a means to articulate our aspirational goals in this context.

    Q: What impact do you hope this research will have?

    A: I hope to expand the notion of who can participate in design, and whose actions are seen as legitimate expressions of design input. I think a lot of work so far in the conversation around data ethics prioritizes the perspective of computer scientists who are trying to design better systems, at the expense of the perspective of people for whom the systems are not currently working. So, I hope designers and computer scientists can embrace the concept of refusal as a legitimate form of design, and a source of inspiration. There’s a vital conversation happening, one that should influence the design of future systems, even if expressed through unconventional means.

    One of the things I want to underscore in the paper is that design extends beyond software. Taking a socio-technical perspective, the act of designing encompasses software, institutions, relationships, and governance structures surrounding data use. I want people who aren’t software engineers, like policymakers or activists, to view themselves as integral to the technology design process.

  • in

    AI generates high-quality images 30 times faster in a single step

    In our current age of artificial intelligence, computers can generate their own “art” by way of diffusion models, iteratively adding structure to a noisy initial state until a clear image or video emerges. Diffusion models have suddenly grabbed a seat at everyone’s table: Enter a few words and experience instantaneous, dopamine-spiking dreamscapes at the intersection of reality and fantasy. Behind the scenes, it involves a complex, time-intensive process requiring numerous iterations for the algorithm to perfect the image.

    MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) researchers have introduced a new framework that simplifies the multi-step process of traditional diffusion models into a single step, addressing previous limitations. This is done through a type of teacher-student model: teaching a new computer model to mimic the behavior of more complicated, original models that generate images. The approach, known as distribution matching distillation (DMD), retains the quality of the generated images and allows for much faster generation. 

    “Our work is a novel method that accelerates current diffusion models such as Stable Diffusion and DALLE-3 by 30 times,” says Tianwei Yin, an MIT PhD student in electrical engineering and computer science, CSAIL affiliate, and the lead researcher on the DMD framework. “This advancement not only significantly reduces computational time but also retains, if not surpasses, the quality of the generated visual content. Theoretically, the approach marries the principles of generative adversarial networks (GANs) with those of diffusion models, achieving visual content generation in a single step — a stark contrast to the hundred steps of iterative refinement required by current diffusion models. It could potentially be a new generative modeling method that excels in speed and quality.”

    This single-step diffusion model could enhance design tools, enabling quicker content creation and potentially supporting advancements in drug discovery and 3D modeling, where promptness and efficacy are key.

    Distribution dreams

    DMD cleverly has two components. First, it uses a regression loss, which anchors the mapping to ensure a coarse organization of the space of images to make training more stable. Next, it uses a distribution matching loss, which ensures that the probability to generate a given image with the student model corresponds to its real-world occurrence frequency. To do this, it leverages two diffusion models that act as guides, helping the system understand the difference between real and generated images and making training the speedy one-step generator possible.

    The system achieves faster generation by training a new network to minimize the distribution divergence between its generated images and those from the training dataset used by traditional diffusion models. “Our key insight is to approximate gradients that guide the improvement of the new model using two diffusion models,” says Yin. “In this way, we distill the knowledge of the original, more complex model into the simpler, faster one, while bypassing the notorious instability and mode collapse issues in GANs.” 
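
    The idea of guiding a one-step generator with two score estimates can be sketched on a toy problem. The example below is an assumption-laden illustration, not the DMD implementation: both “diffusion models” are replaced by exact scores of one-dimensional Gaussians, the generator has a single parameter `theta`, and the regression loss is left out. It only shows how the difference between a “fake” score and a “real” score gives a gradient that pulls the generator's output distribution toward the target.

```python
import random


def real_score(x, mu_real):
    # Score (gradient of the log-density) of the target N(mu_real, 1) at x.
    return -(x - mu_real)


def fake_score(x, theta):
    # Score of the generator's own output distribution N(theta, 1) at x.
    return -(x - theta)


def train_one_step_generator(mu_real=3.0, theta=0.0, lr=0.1,
                             steps=200, batch=64, seed=0):
    """Fit G(z) = z + theta so its outputs match N(mu_real, 1)."""
    rng = random.Random(seed)
    for _ in range(steps):
        grad = 0.0
        for _ in range(batch):
            x = rng.gauss(0.0, 1.0) + theta  # one-step generation from noise z
            # Gradient of KL(fake || real) with respect to theta:
            # (fake score - real score) * dG/dtheta, where dG/dtheta = 1 here.
            grad += fake_score(x, theta) - real_score(x, mu_real)
        theta -= lr * grad / batch
    return theta


theta = train_one_step_generator()
```

    In this toy setting the score difference reduces to `theta - mu_real`, so each update moves the generator a fixed fraction of the way toward the target. In the method described above, both scores would instead be approximated by diffusion models.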

    Yin and colleagues used pre-trained networks for the new student model, simplifying the process. By copying and fine-tuning parameters from the original models, the team achieved fast training convergence of the new model, which is capable of producing high-quality images with the same architectural foundation. “This enables combining with other system optimizations based on the original architecture to further accelerate the creation process,” adds Yin. 

    When put to the test against the usual methods, using a wide range of benchmarks, DMD showed consistent performance. On the popular benchmark of generating images based on specific classes on ImageNet, DMD is the first one-step diffusion technique that churns out pictures pretty much on par with those from the original, more complex models, rocking a super-close Fréchet inception distance (FID) score of just 0.3, which is impressive, since FID is all about judging the quality and diversity of generated images. Furthermore, DMD excels in industrial-scale text-to-image generation and achieves state-of-the-art one-step generation performance. There’s still a slight quality gap when tackling trickier text-to-image applications, suggesting there’s a bit of room for improvement down the line. 
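
    For readers unfamiliar with the metric, FID is the Fréchet distance between two Gaussians fitted to image features, one for real images and one for generated images. The sketch below computes that distance under a simplifying assumption that is not part of the standard metric: the covariances are diagonal, so the matrix square root reduces to per-dimension square roots. The function name and inputs are hypothetical, and a real FID computation would use full covariance matrices of Inception-network features.

```python
import math


def frechet_distance_diag(mu1, var1, mu2, var2):
    """Squared Fréchet distance between Gaussians with diagonal covariances.

    General form: |mu1 - mu2|^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)).
    With diagonal S1, S2 the trace term becomes sum((sqrt(v1) - sqrt(v2))^2).
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var1, var2))
    return mean_term + cov_term


# Identical statistics give a distance of zero; lower is better.
same = frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
shifted = frechet_distance_diag([0.0], [1.0], [3.0], [4.0])
```

    A smaller value means the generated images' feature statistics sit closer to those of the reference images, which is why a small gap between a student model and its teacher is read as near-equal quality.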

    Additionally, the performance of the DMD-generated images is intrinsically linked to the capabilities of the teacher model used during the distillation process. In the current form, which uses Stable Diffusion v1.5 as the teacher model, the student inherits limitations such as rendering detailed depictions of text and small faces, suggesting that DMD-generated images could be further enhanced by more advanced teacher models. 

    “Decreasing the number of iterations has been the Holy Grail in diffusion models since their inception,” says Fredo Durand, MIT professor of electrical engineering and computer science, CSAIL principal investigator, and a lead author on the paper. “We are very excited to finally enable single-step image generation, which will dramatically reduce compute costs and accelerate the process.” 

    “Finally, a paper that successfully combines the versatility and high visual quality of diffusion models with the real-time performance of GANs,” says Alexei Efros, a professor of electrical engineering and computer science at the University of California at Berkeley who was not involved in this study. “I expect this work to open up fantastic possibilities for high-quality real-time visual editing.” 

    Yin and Durand’s fellow authors are MIT electrical engineering and computer science professor and CSAIL principal investigator William T. Freeman, as well as Adobe research scientists Michaël Gharbi SM ’15, PhD ’18; Richard Zhang; Eli Shechtman; and Taesung Park. Their work was supported, in part, by U.S. National Science Foundation grants (including one for the Institute for Artificial Intelligence and Fundamental Interactions), the Singapore Defense Science and Technology Agency, and by funding from Gwangju Institute of Science and Technology and Amazon. Their work will be presented at the Conference on Computer Vision and Pattern Recognition in June.

  • in

    Using generative AI to improve software testing

    Generative AI is getting plenty of attention for its ability to create text and images. But those media represent only a fraction of the data that proliferate in our society today. Data are generated every time a patient goes through a medical system, a storm impacts a flight, or a person interacts with a software application.

    Using generative AI to create realistic synthetic data around those scenarios can help organizations more effectively treat patients, reroute planes, or improve software platforms — especially in scenarios where real-world data are limited or sensitive.

    For the last three years, the MIT spinout DataCebo has offered a generative software system called the Synthetic Data Vault to help organizations create synthetic data to do things like test software applications and train machine learning models.

    The Synthetic Data Vault, or SDV, has been downloaded more than 1 million times, with more than 10,000 data scientists using the open-source library for generating synthetic tabular data. The founders — Principal Research Scientist Kalyan Veeramachaneni and alumna Neha Patki ’15, SM ’16 — believe the company’s success is due to SDV’s ability to revolutionize software testing.

    SDV goes viral

    In 2016, Veeramachaneni’s group in the Data to AI Lab unveiled a suite of open-source generative AI tools to help organizations create synthetic data that matched the statistical properties of real data.

    Companies can use synthetic data instead of sensitive information in programs while still preserving the statistical relationships between datapoints. Companies can also use synthetic data to run new software through simulations to see how it performs before releasing it to the public.

    Veeramachaneni’s group came across the problem because it was working with companies that wanted to share their data for research.

    “MIT helps you see all these different use cases,” Patki explains. “You work with finance companies and health care companies, and all those projects are useful to formulate solutions across industries.”

    In 2020, the researchers founded DataCebo to build more SDV features for larger organizations. Since then, the use cases have been as impressive as they’ve been varied.

    With DataCebo’s new flight simulator, for instance, airlines can plan for rare weather events in a way that would be impossible using only historic data. In another application, SDV users synthesized medical records to predict health outcomes for patients with cystic fibrosis. A team from Norway recently used SDV to create synthetic student data to evaluate whether various admissions policies were meritocratic and free from bias.

    In 2021, the data science platform Kaggle hosted a competition for data scientists that used SDV to create synthetic data sets to avoid using proprietary data. Roughly 30,000 data scientists participated, building solutions and predicting outcomes based on the company’s realistic data.

    And as DataCebo has grown, it’s stayed true to its MIT roots: All of the company’s current employees are MIT alumni.

    Supercharging software testing

    Although their open-source tools are being used for a variety of use cases, the company is focused on growing its traction in software testing.

    “You need data to test these software applications,” Veeramachaneni says. “Traditionally, developers manually write scripts to create synthetic data. With generative models, created using SDV, you can learn from a sample of data collected and then sample a large volume of synthetic data (which has the same properties as real data), or create specific scenarios and edge cases, and use the data to test your application.”

    For example, if a bank wanted to test a program designed to reject transfers from accounts with no money in them, it would have to simulate many accounts simultaneously transacting. Doing that with data created manually would take a lot of time. With DataCebo’s generative models, customers can create any edge case they want to test.
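
    A minimal sketch of that workflow follows, using only Python's standard library rather than SDV itself. Everything here is an assumption for illustration: `is_transfer_allowed` stands in for the bank's program under test, the log-normal fit is a deliberately simple substitute for a learned generative model, and the sample balances are made up.

```python
import math
import random
import statistics


def is_transfer_allowed(balance, amount):
    """Hypothetical program under test: allow only transfers the account can cover."""
    return balance > 0 and 0 < amount <= balance


def fit_log_balances(sample):
    """Learn the mean and spread of log-balances from a small real sample."""
    logs = [math.log(b) for b in sample if b > 0]
    return statistics.mean(logs), statistics.stdev(logs)


def synthesize_balances(sample, n, empty_fraction=0.2, seed=0):
    """Sample n synthetic balances, forcing a share of empty-account edge cases."""
    rng = random.Random(seed)
    mu, sigma = fit_log_balances(sample)
    balances = []
    for _ in range(n):
        if rng.random() < empty_fraction:
            balances.append(0.0)  # edge case: account with no money in it
        else:
            balances.append(rng.lognormvariate(mu, sigma))
    return balances


real_sample = [120.5, 980.0, 45.25, 3100.0, 510.75]  # made-up illustrative values
balances = synthesize_balances(real_sample, n=1000)
rejected = [b for b in balances if not is_transfer_allowed(b, 10.0)]
```

    The `empty_fraction` parameter is the point of the exercise: the tester chooses how often the rare case appears instead of waiting for it to show up in collected data.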

    “It’s common for industries to have data that is sensitive in some capacity,” Patki says. “Often when you’re in a domain with sensitive data you’re dealing with regulations, and even if there aren’t legal regulations, it’s in companies’ best interest to be diligent about who gets access to what at which time. So, synthetic data is always better from a privacy perspective.”

    Scaling synthetic data

    Veeramachaneni believes DataCebo is advancing the field of what it calls synthetic enterprise data, or data generated from user behavior on large companies’ software applications.

    “Enterprise data of this kind is complex, and there is no universal availability of it, unlike language data,” Veeramachaneni says. “When folks use our publicly available software and report back if it works on a certain pattern, we learn a lot of these unique patterns, and it allows us to improve our algorithms. From one perspective, we are building a corpus of these complex patterns, which for language and images is readily available.”

    DataCebo also recently released features to improve SDV’s usefulness, including tools to assess the “realism” of the generated data, called the SDMetrics library, as well as a way to compare models’ performances, called SDGym.

    “It’s about ensuring organizations trust this new data,” Veeramachaneni says. “[Our tools offer] programmable synthetic data, which means we allow enterprises to insert their specific insight and intuition to build more transparent models.”

    As companies in every industry rush to adopt AI and other data science tools, DataCebo is ultimately helping them do so in a way that is more transparent and responsible.

    “In the next few years, synthetic data from generative models will transform all data work,” Veeramachaneni says. “We believe 90 percent of enterprise operations can be done with synthetic data.”

  • in

    Dealing with the limitations of our noisy world

    Tamara Broderick first set foot on MIT’s campus when she was a high school student, as a participant in the inaugural Women’s Technology Program. The monthlong summer academic experience gives young women a hands-on introduction to engineering and computer science.

    What is the probability that she would return to MIT years later, this time as a faculty member?

    That’s a question Broderick could probably answer quantitatively using Bayesian inference, a statistical approach to probability that tries to quantify uncertainty by continuously updating one’s assumptions as new data are obtained.
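
    The updating step at the heart of Bayesian inference can be shown with a small textbook example. The sketch below is a generic Beta-Binomial update, not anything from Broderick's research: a belief about an unknown probability starts as a uniform prior and is revised as each batch of made-up observations arrives.

```python
def update_beta(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing new successes and failures."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution: the current best estimate."""
    return alpha / (alpha + beta)


# Start from a uniform prior, Beta(1, 1), over the unknown probability.
posterior = (1.0, 1.0)
estimates = []
# Made-up batches of (successes, failures) arriving over time.
for batch in [(3, 1), (2, 4)]:
    posterior = update_beta(*posterior, *batch)
    estimates.append(beta_mean(*posterior))
```

    Each batch shifts the estimate, and the growing parameter totals narrow the distribution, so the result carries both a best guess and a measure of how uncertain that guess still is.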

    In her lab at MIT, the newly tenured associate professor in the Department of Electrical Engineering and Computer Science (EECS) uses Bayesian inference to quantify uncertainty and measure the robustness of data analysis techniques.

    “I’ve always been really interested in understanding not just ‘What do we know from data analysis,’ but ‘How well do we know it?’” says Broderick, who is also a member of the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society. “The reality is that we live in a noisy world, and we can’t always get exactly the data that we want. How do we learn from data but at the same time recognize that there are limitations and deal appropriately with them?”

    Broadly, her focus is on helping people understand the confines of the statistical tools available to them and, sometimes, working with them to craft better tools for a particular situation.

    For instance, her group recently collaborated with oceanographers to develop a machine-learning model that can make more accurate predictions about ocean currents. In another project, she and others worked with degenerative disease specialists on a tool that helps severely motor-impaired individuals utilize a computer’s graphical user interface by manipulating a single switch.

    A common thread woven through her work is an emphasis on collaboration.

    “Working in data analysis, you get to hang out in everybody’s backyard, so to speak. You really can’t get bored because you can always be learning about some other field and thinking about how we can apply machine learning there,” she says.

    Hanging out in many academic “backyards” is especially appealing to Broderick, who struggled even from a young age to narrow down her interests.

    A math mindset

    Growing up in a suburb of Cleveland, Ohio, Broderick had an interest in math for as long as she can remember. She recalls being fascinated by the idea of what would happen if you kept adding a number to itself, starting with 1+1=2 and then 2+2=4.

    “I was maybe 5 years old, so I didn’t know what ‘powers of two’ were or anything like that. I was just really into math,” she says.

    Her father recognized her interest in the subject and enrolled her in a Johns Hopkins program called the Center for Talented Youth, which gave Broderick the opportunity to take three-week summer classes on a range of subjects, from astronomy to number theory to computer science.

    Later, in high school, she conducted astrophysics research with a postdoc at Case Western Reserve University. In the summer of 2002, she spent four weeks at MIT as a member of the first class of the Women’s Technology Program.

    She especially enjoyed the freedom offered by the program, and its focus on using intuition and ingenuity to achieve high-level goals. For instance, the cohort was tasked with building a device with LEGOs that they could use to biopsy a grape suspended in Jell-O.

    The program showed her how much creativity is involved in engineering and computer science, and piqued her interest in pursuing an academic career.

    “But when I got into college at Princeton, I could not decide — math, physics, computer science — they all seemed super-cool. I wanted to do all of it,” she says.

    She settled on pursuing an undergraduate math degree but took all the physics and computer science courses she could cram into her schedule.

    Digging into data analysis

    After receiving a Marshall Scholarship, Broderick spent two years at Cambridge University in the United Kingdom, earning a master of advanced study in mathematics and a master of philosophy in physics.

    In the U.K., she took a number of statistics and data analysis classes, including her first class on Bayesian data analysis in the field of machine learning.

    It was a transformative experience, she recalls.

    “During my time in the U.K., I realized that I really like solving real-world problems that matter to people, and Bayesian inference was being used in some of the most important problems out there,” she says.

    Back in the U.S., Broderick headed to the University of California at Berkeley, where she joined the lab of Professor Michael I. Jordan as a grad student. She earned a PhD in statistics with a focus on Bayesian data analysis. 

    She decided to pursue a career in academia and was drawn to MIT by the collaborative nature of the EECS department and by how passionate and friendly her would-be colleagues were.

    Her first impressions panned out, and Broderick says she has found a community at MIT that helps her be creative and explore hard, impactful problems with wide-ranging applications.

    “I’ve been lucky to work with a really amazing set of students and postdocs in my lab — brilliant and hard-working people whose hearts are in the right place,” she says.

    One of her team’s recent projects involves a collaboration with an economist who studies the use of microcredit, or the lending of small amounts of money at very low interest rates, in impoverished areas.

    The goal of microcredit programs is to raise people out of poverty. Economists run randomized controlled trials in which some villages in a region receive microcredit and others do not. They want to generalize the study results, predicting the expected outcome if microcredit were applied to villages outside their study.

    But Broderick and her collaborators have found that results of some microcredit studies can be very brittle. Removing one or a few data points from the dataset can completely change the results. One issue is that researchers often use empirical averages, where a few very high or low data points can skew the results.

    Using machine learning, she and her collaborators developed a method that can determine how many data points must be dropped to change the substantive conclusion of the study. With their tool, a scientist can see how brittle the results are.
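
    The idea of counting how many observations must be removed to change a conclusion can be illustrated with a minimal Python sketch. It is a toy, not the researchers’ method: it assumes the “conclusion” is simply the sign of an empirical average, and the function name `min_drops_to_flip_mean` and the sample data are hypothetical.

```python
from typing import Optional, Sequence


def min_drops_to_flip_mean(values: Sequence[float]) -> Optional[int]:
    """Return the fewest points to drop so the mean loses its original sign.

    Toy assumption: the study's "conclusion" is just the sign of the average.
    Returns 0 if the mean is already zero, and None if no proper subset of
    removals changes the sign.
    """
    total = sum(values)
    if total == 0:
        return 0
    sign = 1 if total > 0 else -1
    # Orient the data so the mean is positive, then list the points that
    # support the conclusion most strongly first.
    oriented = sorted((sign * v for v in values), reverse=True)
    remaining = sign * total
    for dropped, value in enumerate(oriented, start=1):
        if dropped == len(oriented):
            return None  # dropping every point leaves no data to average
        remaining -= value
        if remaining <= 0:
            return dropped
    return None


# Hypothetical outcome values; one extreme value drives the positive mean.
profits = [-1, -1, -1, -1, 0, 0, 1, 1, 2, 50]
drops_needed = min_drops_to_flip_mean(profits)
```

    In this made-up dataset, one value out of ten accounts for the positive average, so `drops_needed` is 1. Dropping the largest supporting values first is enough for a plain average; other estimators would need a different search.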

    “Sometimes dropping a very small fraction of data can change the major results of a data analysis, and then we might worry how far those conclusions generalize to new scenarios. Are there ways we can flag that for people? That is what we are getting at with this work,” she explains.

    At the same time, she is continuing to collaborate with researchers in a range of fields, such as genetics, to understand the pros and cons of different machine-learning techniques and other data analysis tools.

    Happy trails

    Exploration is what drives Broderick as a researcher, and it also fuels one of her passions outside the lab. She and her husband enjoy collecting patches they earn by hiking all the trails in a park or trail system.

    “I think my hobby really combines my interests of being outdoors and spreadsheets,” she says. “With these hiking patches, you have to explore everything and then you see areas you wouldn’t normally see. It is adventurous, in that way.”

    They’ve discovered some amazing hikes they would never have known about, but also embarked on more than a few “total disaster hikes,” she says. But each hike, whether a hidden gem or an overgrown mess, offers its own rewards.

    And just like in her research, curiosity, open-mindedness, and a passion for problem-solving have never led her astray.

  • in

    New AI model could streamline operations in a robotic warehouse

    Hundreds of robots zip back and forth across the floor of a colossal robotic warehouse, grabbing items and delivering them to human workers for packing and shipping. Such warehouses are increasingly becoming part of the supply chain in many industries, from e-commerce to automotive production.

    However, getting 800 robots to and from their destinations efficiently while keeping them from crashing into each other is no easy task. It is such a complex problem that even the best path-finding algorithms struggle to keep up with the breakneck pace of e-commerce or manufacturing. 

    In a sense, these robots are like cars trying to navigate a crowded city center. So, a group of MIT researchers who use AI to mitigate traffic congestion applied ideas from that domain to tackle this problem.

    They built a deep-learning model that encodes important information about the warehouse, including the robots, planned paths, tasks, and obstacles, and uses it to predict the best areas of the warehouse to decongest to improve overall efficiency.

    Their technique divides the warehouse robots into groups, so these smaller groups of robots can be decongested faster with traditional algorithms used to coordinate robots. In the end, their method decongests the robots nearly four times faster than a strong random search method.

    In addition to streamlining warehouse operations, this deep learning approach could be used in other complex planning tasks, like computer chip design or pipe routing in large buildings.

    “We devised a new neural network architecture that is actually suitable for real-time operations at the scale and complexity of these warehouses. It can encode hundreds of robots in terms of their trajectories, origins, destinations, and relationships with other robots, and it can do this in an efficient manner that reuses computation across groups of robots,” says Cathy Wu, the Gilbert W. Winslow Career Development Assistant Professor in Civil and Environmental Engineering (CEE), and a member of the Laboratory for Information and Decision Systems (LIDS) and the Institute for Data, Systems, and Society (IDSS).

    Wu, senior author of a paper on this technique, is joined by lead author Zhongxia Yan, a graduate student in electrical engineering and computer science. The work will be presented at the International Conference on Learning Representations.

    Robotic Tetris

    From a bird’s eye view, the floor of a robotic e-commerce warehouse looks a bit like a fast-paced game of “Tetris.”

    When a customer order comes in, a robot travels to an area of the warehouse, grabs the shelf that holds the requested item, and delivers it to a human operator who picks and packs the item. Hundreds of robots do this simultaneously, and if two robots’ paths conflict as they cross the massive warehouse, they might crash.

    Traditional search-based algorithms avoid potential crashes by keeping one robot on its course and replanning a trajectory for the other. But with so many robots and potential collisions, the size of the problem grows exponentially.

    “Because the warehouse is operating online, the robots are replanned about every 100 milliseconds. That means that every second, a robot is replanned 10 times. So, these operations need to be very fast,” Wu says.

    Because time is so critical during replanning, the MIT researchers use machine learning to focus the replanning on the most actionable areas of congestion — where there exists the most potential to reduce the total travel time of robots.

    Wu and Yan built a neural network architecture that considers smaller groups of robots at the same time. For instance, in a warehouse with 800 robots, the network might cut the warehouse floor into smaller groups that contain 40 robots each.

    Then, it predicts which group has the most potential to improve the overall solution if a search-based solver were used to coordinate trajectories of robots in that group.

    The overall algorithm is iterative: it picks the most promising robot group with the neural network, decongests that group with the search-based solver, then picks the next most promising group with the neural network, and so on.
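
    The pick-then-solve loop described here can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the researchers’ implementation: each robot’s plan is reduced to a single delay value, the groups are fixed in advance, and `toy_score` and `toy_solver` are hypothetical stand-ins for the neural network and the search-based solver.

```python
from typing import Callable, Dict, Sequence, Tuple

Delays = Dict[int, float]  # robot id -> delay in seconds (toy stand-in for a path plan)
Group = Tuple[int, ...]    # robot ids that are replanned together


def iterative_decongest(
    delays: Delays,
    groups: Sequence[Group],
    score_group: Callable[[Delays, Group], float],
    solve_group: Callable[[Delays, Group], Delays],
    iterations: int,
) -> Delays:
    """Repeatedly pick the most promising group and replan only that group."""
    current = dict(delays)  # copy so the caller's data is not modified
    for _ in range(iterations):
        # Stand-in for the neural network: rank groups by predicted benefit.
        best = max(groups, key=lambda group: score_group(current, group))
        # Stand-in for the search-based solver: new delays for that group only.
        current.update(solve_group(current, best))
    return current


def toy_score(delays: Delays, group: Group) -> float:
    """Hypothetical predictor: groups with more total delay look more promising."""
    return sum(delays[robot] for robot in group)


def toy_solver(delays: Delays, group: Group) -> Delays:
    """Hypothetical solver: replanning halves each delay in the group."""
    return {robot: delays[robot] / 2 for robot in group}


start = {0: 8.0, 1: 2.0, 2: 1.0, 3: 1.0}
result = iterative_decongest(start, [(0, 1), (2, 3)], toy_score, toy_solver, iterations=2)
```

    With these toy inputs, the group containing robots 0 and 1 is chosen in both iterations because it carries the most delay, while the other group is left untouched.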

    Considering relationships

    The neural network can reason about groups of robots efficiently because it captures complicated relationships that exist between individual robots. For example, even though one robot may be far away from another initially, their paths could still cross during their trips.

    The technique also streamlines computation by encoding constraints only once, rather than repeating the process for each subproblem. For instance, in a warehouse with 800 robots, decongesting a group of 40 robots requires holding the other 760 robots as constraints. Other approaches require reasoning about all 800 robots once per group in each iteration.

    Instead, the researchers’ approach only requires reasoning about the 800 robots once across all groups in each iteration.
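
    A rough count makes the saving concrete. The Python sketch below assumes the groups split the robots evenly with no overlap and treats encoding one robot as one unit of work; both are simplifications for illustration, and the function name is hypothetical.

```python
def robot_encodings_per_iteration(num_robots: int, group_size: int) -> dict:
    """Count robot encodings per iteration under two strategies.

    Assumes groups partition the robots evenly and that encoding one robot
    costs one unit; both are simplifications for illustration.
    """
    if group_size <= 0 or num_robots % group_size != 0:
        raise ValueError("group_size must evenly divide num_robots")
    num_groups = num_robots // group_size
    return {
        "groups": num_groups,
        "constraints_per_group": num_robots - group_size,
        "encode_per_group": num_groups * num_robots,  # all robots, once per group
        "encode_once": num_robots,                    # all robots, shared by every group
    }


counts = robot_encodings_per_iteration(num_robots=800, group_size=40)
```

    Under these assumptions, 800 robots in groups of 40 give 20 groups, so encoding per group touches 16,000 robot encodings in each iteration, compared with 800 when the encoding is shared.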

    “The warehouse is one big setting, so a lot of these robot groups will have some shared aspects of the larger problem. We designed our architecture to make use of this common information,” she adds.

    They tested their technique in several simulated environments, including some set up like warehouses, some with random obstacles, and even maze-like settings that emulate building interiors.

    By identifying more effective groups to decongest, their learning-based approach decongests the warehouse up to four times faster than strong, non-learning-based approaches. Even when they factored in the additional computational overhead of running the neural network, their approach still solved the problem 3.5 times faster.

    In the future, the researchers want to derive simple, rule-based insights from their neural model, since the decisions of the neural network can be opaque and difficult to interpret. Simpler, rule-based methods could also be easier to implement and maintain in actual robotic warehouse settings.

    “This approach is based on a novel architecture where convolution and attention mechanisms interact effectively and efficiently. Impressively, this leads to being able to take into account the spatiotemporal component of the constructed paths without the need of problem-specific feature engineering. The results are outstanding: Not only is it possible to improve on state-of-the-art large neighborhood search methods in terms of quality of the solution and speed, but the model generalizes to unseen cases wonderfully,” says Andrea Lodi, the Andrew H. and Ann R. Tisch Professor at Cornell Tech, who was not involved with this research.

    This work was supported by Amazon and the MIT Amazon Science Hub.

  • in

    Automated method helps researchers quantify uncertainty in their predictions

    Pollsters trying to predict presidential election results and physicists searching for distant exoplanets have at least one thing in common: They often use a tried-and-true scientific technique called Bayesian inference.

    Bayesian inference allows these scientists to effectively estimate some unknown parameter — like the winner of an election — from data such as poll results. But Bayesian inference can be slow, sometimes consuming weeks or even months of computation time or requiring a researcher to spend hours deriving tedious equations by hand. 

    Researchers from MIT and elsewhere have introduced an optimization technique that speeds things up without requiring a scientist to do a lot of additional work. Their method can achieve more accurate results faster than another popular approach for accelerating Bayesian inference.

    Using this new automated technique, a scientist could simply input their model and then the optimization method does all the calculations under the hood to provide an approximation of some unknown parameter. The method also offers reliable uncertainty estimates that can help a researcher understand when to trust its predictions.

    This versatile technique could be applied to a wide array of scientific quandaries that incorporate Bayesian inference. For instance, it could be used by economists studying the impact of microcredit loans in developing nations or sports analysts using a model to rank top tennis players.

    “When you actually dig into what people are doing in the social sciences, physics, chemistry, or biology, they are often using a lot of the same tools under the hood. There are so many Bayesian analyses out there. If we can build a really great tool that makes these researchers’ lives easier, then we can really make a difference to a lot of people in many different research areas,” says senior author Tamara Broderick, an associate professor in MIT’s Department of Electrical Engineering and Computer Science (EECS) and a member of the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society.

    Broderick is joined on the paper by co-lead authors Ryan Giordano, an assistant professor of statistics at the University of California at Berkeley; and Martin Ingram, a data scientist at the AI company KONUX. The paper was recently published in the Journal of Machine Learning Research.

    Faster results

    When researchers seek a faster form of Bayesian inference, they often turn to a technique called automatic differentiation variational inference (ADVI), which is often both fast to run and easy to use.

    But Broderick and her collaborators have found a number of practical issues with ADVI. It has to solve an optimization problem and can do so only approximately. So, ADVI can still require a lot of computation time and user effort to determine whether the approximate solution is good enough. And once it arrives at a solution, it tends to provide poor uncertainty estimates.

    Rather than reinventing the wheel, the team took many ideas from ADVI but turned them around to create a technique called deterministic ADVI (DADVI) that doesn’t have these downsides.

    With DADVI, it is very clear when the optimization is finished, so a user won’t need to spend extra computation time to ensure that the best solution has been found. DADVI also permits the incorporation of more powerful optimization methods that give it an additional speed and performance boost.

    Once it reaches a result, DADVI is set up to allow the use of uncertainty corrections. These corrections make its uncertainty estimates much more accurate than those of ADVI.

    DADVI also enables the user to clearly see how much error they have incurred in the approximation to the optimization problem. This prevents a user from needlessly running the optimization again and again with more and more resources to try and reduce the error.

    “We wanted to see if we could live up to the promise of black-box inference in the sense of, once the user makes their model, they can just run Bayesian inference and don’t have to derive everything by hand, they don’t need to figure out when to stop their algorithm, and they have a sense of how accurate their approximate solution is,” Broderick says.

    Defying conventional wisdom

    DADVI can be more effective than ADVI because it uses an efficient approximation method, called sample average approximation, which estimates an unknown quantity by taking a series of exact steps.

    Because the steps along the way are exact, it is clear when the objective has been reached. Plus, getting to that objective typically requires fewer steps.

    Often, researchers expect sample average approximation to be more computationally intensive than a more popular method, known as stochastic gradient, which is used by ADVI. But Broderick and her collaborators showed that, in many applications, this is not the case.
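
    The contrast between the two strategies can be sketched on a toy problem. This is an illustration only, not the DADVI algorithm: the quadratic objective, the step size, and the names `saa_minimize` and `stochastic_gradient` are assumptions. Sample average approximation fixes a set of random draws up front and optimizes the resulting deterministic objective, whereas a stochastic gradient method takes a fresh draw at every step.

```python
import random
from typing import List, Tuple


def saa_minimize(draws: List[float], mu: float, step: float = 0.25,
                 tol: float = 1e-10, max_steps: int = 1000) -> Tuple[float, int]:
    """Minimize the average of (theta - (mu + z))**2 over a FIXED list of draws z.

    Because the draws never change, every gradient is exact for this objective
    and a small gradient is an unambiguous stopping signal.
    """
    target = mu + sum(draws) / len(draws)
    theta = 0.0
    for steps_taken in range(max_steps):
        grad = 2.0 * (theta - target)  # exact gradient of the fixed objective
        if abs(grad) < tol:
            return theta, steps_taken
        theta -= step * grad
    return theta, max_steps


def stochastic_gradient(mu: float, rng: random.Random, step: float = 0.25,
                        steps: int = 200) -> float:
    """Same toy problem, but each step uses one fresh draw, so gradients stay noisy."""
    theta = 0.0
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)
        theta -= step * 2.0 * (theta - (mu + z))
    return theta


rng = random.Random(0)
fixed_draws = [rng.gauss(0.0, 1.0) for _ in range(30)]
theta_saa, steps_used = saa_minimize(fixed_draws, mu=3.0)
theta_sgd = stochastic_gradient(mu=3.0, rng=rng)
```

    In this sketch the fixed-draw version stops as soon as its exact gradient falls below the tolerance and returns the same answer every time it is run on the same draws. The stochastic version has no comparable stopping signal, because its gradient keeps fluctuating with each new draw.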

    “A lot of problems really do have special structure, and you can be so much more efficient and get better performance by taking advantage of that special structure. That is something we have really seen in this paper,” she adds.

    They tested DADVI on a number of real-world models and datasets, including a model used by economists to evaluate the effectiveness of microcredit loans and one used in ecology to determine whether a species is present at a particular site.

    Across the board, they found that DADVI can estimate unknown parameters faster and more reliably than other methods, and achieves as good or better accuracy than ADVI. Because it is easier to use than other techniques, DADVI could offer a boost to scientists in a wide variety of fields.

    In the future, the researchers want to dig deeper into correction methods for uncertainty estimates so they can better understand why these corrections can produce such accurate uncertainties, and when they could fall short.

    “In applied statistics, we often have to use approximate algorithms for problems that are too complex or high-dimensional to allow exact solutions to be computed in reasonable time. This new paper offers an interesting set of theory and empirical results that point to an improvement in a popular existing approximate algorithm for Bayesian inference,” says Andrew Gelman ’85, ’86, a professor of statistics and political science at Columbia University, who was not involved with the study. “As one of the team involved in the creation of that earlier work, I’m happy to see our algorithm superseded by something more stable.”

    This research was supported by a National Science Foundation CAREER Award and the U.S. Office of Naval Research.

  • in

    Study: Global deforestation leads to more mercury pollution

    About 10 percent of human-made mercury emissions into the atmosphere each year are the result of global deforestation, according to a new MIT study.

    The world’s vegetation, from the Amazon rainforest to the savannahs of sub-Saharan Africa, acts as a sink that removes the toxic pollutant from the air. However, if the current rate of deforestation remains unchanged or accelerates, the researchers estimate that net mercury emissions will keep increasing.

    “We’ve been overlooking a significant source of mercury, especially in tropical regions,” says Ari Feinberg, a former postdoc in the Institute for Data, Systems, and Society (IDSS) and lead author of the study.

    The researchers’ model shows that the Amazon rainforest plays a particularly important role as a mercury sink, contributing about 30 percent of the global land sink. Curbing Amazon deforestation could thus have a substantial impact on reducing mercury pollution.

    The team also estimates that global reforestation efforts could increase annual mercury uptake by about 5 percent. While this is significant, the researchers emphasize that reforestation alone should not be a substitute for worldwide pollution control efforts.

    “Countries have put a lot of effort into reducing mercury emissions, especially northern industrialized countries, and for very good reason. But 10 percent of the global anthropogenic source is substantial, and there is a potential for that to be even greater in the future. [Addressing these deforestation-related emissions] needs to be part of the solution,” says senior author Noelle Selin, a professor in IDSS and MIT’s Department of Earth, Atmospheric and Planetary Sciences.

    Feinberg and Selin are joined on the paper by co-authors Martin Jiskra, a former Swiss National Science Foundation Ambizione Fellow at the University of Basel; Pasquale Borrelli, a professor at Roma Tre University in Italy; and Jagannath Biswakarma, a postdoc at the Swiss Federal Institute of Aquatic Science and Technology. The paper appears today in Environmental Science and Technology.

    Modeling mercury

    Over the past few decades, scientists have generally focused on studying deforestation as a source of global carbon dioxide emissions. Mercury, a trace element, hasn’t received the same attention, partly because the terrestrial biosphere’s role in the global mercury cycle has only recently been better quantified.

    Plant leaves take up mercury from the atmosphere, much as they take up carbon dioxide. But unlike carbon dioxide, mercury doesn’t serve an essential biological function for plants. Mercury largely stays within a leaf until it falls to the forest floor, where the mercury is absorbed by the soil.

    Mercury becomes a serious concern for humans if it ends up in water bodies, where it can become methylated by microorganisms. Methylmercury, a potent neurotoxin, can be taken up by fish and bioaccumulated through the food chain. This can lead to risky levels of methylmercury in the fish humans eat.

    “In soils, mercury is much more tightly bound than it would be if it were deposited in the ocean. The forests are doing a sort of ecosystem service, in that they are sequestering mercury for longer timescales,” says Feinberg, who is now a postdoc in the Blas Cabrera Institute of Physical Chemistry in Spain.

    In this way, forests reduce the amount of toxic methylmercury in oceans.

    Many studies of mercury focus on industrial sources, like burning fossil fuels, small-scale gold mining, and metal smelting. A global treaty, the 2013 Minamata Convention, calls on nations to reduce human-made emissions. However, it doesn’t directly consider impacts of deforestation.

    The researchers launched their study to fill in that missing piece.

    In past work, they had built a model to probe the role vegetation plays in mercury uptake. Using a series of land use change scenarios, they adjusted the model to quantify the role of deforestation.

    Evaluating emissions

    This chemical transport model tracks mercury from its emissions sources to where it is chemically transformed in the atmosphere and then ultimately to where it is deposited, mainly through rainfall or uptake into forest ecosystems.

    They divided the Earth into eight regions and performed simulations to calculate deforestation emissions factors for each, considering elements like type and density of vegetation, mercury content in soils, and historical land use.

    However, good data for some regions were hard to come by.

    They lacked measurements from tropical Africa or Southeast Asia — two areas that experience heavy deforestation. To get around this gap, they used simpler, offline models to simulate hundreds of scenarios, which helped them improve their estimations of potential uncertainties.

    They also developed a new formulation for mercury emissions from soil. This formulation captures the fact that deforestation reduces leaf area, which increases the amount of sunlight that hits the ground and accelerates the outgassing of mercury from soils.

    The model divides the world into grid squares, each of which is a few hundred square kilometers. By changing land surface and vegetation parameters in certain squares to represent deforestation and reforestation scenarios, the researchers can capture impacts on the mercury cycle.

    Overall, they found that about 200 tons of mercury are emitted to the atmosphere each year as a result of deforestation, or about 10 percent of total human-made emissions. But in tropical and sub-tropical countries, deforestation emissions represent a higher percentage of total emissions. For example, in Brazil deforestation emissions are 40 percent of total human-made emissions.

    In addition, people often light fires to prepare tropical forested areas for agricultural activities, which causes more emissions by releasing mercury stored by vegetation.

    “If deforestation was a country, it would be the second highest emitting country, after China, which emits around 500 tons of mercury a year,” Feinberg adds.

    And since the Minamata Convention is now addressing primary mercury emissions, scientists can expect deforestation to become a larger fraction of human-made emissions in the future.

    “Policies to protect forests or cut them down have unintended effects beyond their target. It is important to consider the fact that these are systems, and they involve human activities, and we need to understand them better in order to actually solve the problems that we know are out there,” Selin says.

    By providing this first estimate, the team hopes to inspire more research in this area.

    In the future, they want to incorporate more dynamic Earth system models into their analysis, which would enable them to interactively track mercury uptake and better model the timescale of vegetation regrowth.

    “This paper represents an important advance in our understanding of global mercury cycling by quantifying a pathway that has long been suggested but not yet quantified. Much of our research to date has focused on primary anthropogenic emissions — those directly resulting from human activity via coal combustion or mercury-gold amalgam burning in artisanal and small-scale gold mining,” says Jackie Gerson, an assistant professor in the Department of Earth and Environmental Sciences at Michigan State University, who was not involved with this research. “This research shows that deforestation can also result in substantial mercury emissions and needs to be considered both in terms of global mercury models and land management policies. It therefore has the potential to advance our field scientifically as well as to promote policies that reduce mercury emissions via deforestation.”

    This work was funded, in part, by the U.S. National Science Foundation, the Swiss National Science Foundation, and the Swiss Federal Institute of Aquatic Science and Technology.