More stories

  • in

    Taming the data deluge

    An oncoming tsunami of data threatens to overwhelm huge data-rich research projects on such areas that range from the tiny neutrino to an exploding supernova, as well as the mysteries deep within the brain. 

    When LIGO picks up a gravitational-wave signal from a distant collision of black holes and neutron stars, a clock starts ticking for capturing the earliest possible light that may accompany them: time is of the essence in this race. Data collected from electrical sensors monitoring brain activity are outpacing computing capacity. Information from the Large Hadron Collider (LHC)’s smashed particle beams will soon exceed 1 petabit per second. 

    To tackle this approaching data bottleneck in real-time, a team of researchers from nine institutions led by the University of Washington, including MIT, has received $15 million in funding to establish the Accelerated AI Algorithms for Data-Driven Discovery (A3D3) Institute. From MIT, the research team includes Philip Harris, assistant professor of physics, who will serve as the deputy director of the A3D3 Institute; Song Han, assistant professor of electrical engineering and computer science, who will serve as the A3D3’s co-PI; and Erik Katsavounidis, senior research scientist with the MIT Kavli Institute for Astrophysics and Space Research.

    Infused with this five-year Harnessing the Data Revolution Big Idea grant, and jointly funded by the Office of Advanced Cyberinfrastructure, A3D3 will focus on three data-rich fields: multi-messenger astrophysics, high-energy particle physics, and brain imaging neuroscience. By enriching AI algorithms with new processors, A3D3 seeks to speed up AI algorithms for solving fundamental problems in collider physics, neutrino physics, astronomy, gravitational-wave physics, computer science, and neuroscience. 

    “I am very excited about the new Institute’s opportunities for research in nuclear and particle physics,” says Laboratory for Nuclear Science Director Boleslaw Wyslouch. “Modern particle detectors produce an enormous amount of data, and we are looking for extraordinarily rare signatures. The application of extremely fast processors to sift through these mountains of data will make a huge difference in what we will measure and discover.”

    The seeds of A3D3 were planted in 2017, when Harris and his colleagues at Fermilab and CERN decided to integrate real-time AI algorithms to process the incredible rates of data at the LHC. Through email correspondence with Han, Harris’ team built a compiler, HLS4ML, that could run an AI algorithm in nanoseconds.

    “Before the development of HLS4ML, the fastest processing that we knew of was roughly a millisecond per AI inference, maybe a little faster,” says Harris. “We realized all the AI algorithms were designed to solve much slower problems, such as image and voice recognition. To get to nanosecond inference timescales, we recognized we could make smaller algorithms and rely on custom implementations with Field Programmable Gate Array (FPGA) processors in an approach that was largely different from what others were doing.”

    A few months later, Harris presented their research at a physics faculty meeting, where Katsavounidis became intrigued. Over coffee in Building 7, they discussed combining Harris’ FPGA with Katsavounidis’s use of machine learning for finding gravitational waves. FPGAs and other new processor types, such as graphics processing units (GPUs), accelerate AI algorithms to more quickly analyze huge amounts of data.

    “I had worked with the first FPGAs that were out in the market in the early ’90s and have witnessed first-hand how they revolutionized front-end electronics and data acquisition in big high-energy physics experiments I was working on back then,” recalls Katsavounidis. “The ability to have them crunch gravitational-wave data has been in the back of my mind since joining LIGO over 20 years ago.”

    Two years ago they received their first grant, and the University of Washington’s Shih-Chieh Hsu joined in. The team initiated the Fast Machine Lab, published about 40 papers on the subject, built the group to about 50 researchers, and “launched a whole industry of how to explore a region of AI that has not been explored in the past,” says Harris. “We basically started this without any funding. We’ve been getting small grants for various projects over the years. A3D3 represents our first large grant to support this effort.”  

    “What makes A3D3 so special and suited to MIT is its exploration of a technical frontier, where AI is implemented not in high-level software, but rather in lower-level firmware, reconfiguring individual gates to address the scientific question at hand,” says Rob Simcoe, director of MIT Kavli Institute for Astrophysics and Space Research and the Francis Friedman Professor of Physics. “We are in an era where experiments generate torrents of data. The acceleration gained from tailoring reprogrammable, bespoke computers at the processor level can advance real-time analysis of these data to new levels of speed and sophistication.”

    The Huge Data from the Large Hadron Collider 

    With data rates already exceeding 500 terabits per second, the LHC processes more data than any other scientific instrument on earth. Its future aggregate data rates will soon exceed 1 petabit per second, the biggest data rate in the world. 

    “Through the use of AI, A3D3 aims to perform advanced analyses, such as anomaly detection, and particle reconstruction on all collisions happening 40 million times per second,” says Harris.

    The goal is to find within all of this data a way to identify the few collisions out of the 3.2 billion collisions per second that could reveal new forces, explain how dark matter is formed, and complete the picture of how fundamental forces interact with matter. Processing all of this information requires a customized computing system capable of interpreting the collider information within ultra-low latencies.  

    “The challenge of running this on all of the 100s of terabits per second in real-time is daunting and requires a complete overhaul of how we design and implement AI algorithms,” says Harris. “With large increases in the detector resolution leading to data rates that are even larger the challenge of finding the one collision, among many, will become even more daunting.” 

    The Brain and the Universe

    Thanks to advances in techniques such as medical imaging and electrical recordings from implanted electrodes, neuroscience is also gathering larger amounts of data on how the brain’s neural networks process responses to stimuli and perform motor information. A3D3 plans to develop and implement high-throughput and low-latency AI algorithms to process, organize, and analyze massive neural datasets in real time, to probe brain function in order to enable new experiments and therapies.   

    With Multi-Messenger Astrophysics (MMA), A3D3 aims to quickly identify astronomical events by efficiently processing data from gravitational waves, gamma-ray bursts, and neutrinos picked up by telescopes and detectors. 

    The A3D3 researchers also include a multi-disciplinary group of 15 other researchers, including project lead the University of Washington, along with Caltech, Duke University, Purdue University, UC San Diego, University of Illinois Urbana-Champaign, University of Minnesota, and the University of Wisconsin-Madison. It will include neutrinos research at Icecube and DUNE, and visible astronomy at Zwicky Transient Facility, and will organize deep-learning workshops and boot camps to train students and researchers on how to contribute to the framework and widen the use of fast AI strategies.

    “We have reached a point where detector network growth will be transformative, both in terms of event rates and in terms of astrophysical reach and ultimately, discoveries,” says Katsavounidis. “‘Fast’ and ‘efficient’ is the only way to fight the ‘faint’ and ‘fuzzy’ that is out there in the universe, and the path for getting the most out of our detectors. A3D3 on one hand is going to bring production-scale AI to gravitational-wave physics and multi-messenger astronomy; but on the other hand, we aspire to go beyond our immediate domains and become the go-to place across the country for applications of accelerated AI to data-driven disciplines.” More

  • in

    Making machine learning more useful to high-stakes decision makers

    The U.S. Centers for Disease Control and Prevention estimates that one in seven children in the United States experienced abuse or neglect in the past year. Child protective services agencies around the nation receive a high number of reports each year (about 4.4 million in 2019) of alleged neglect or abuse. With so many cases, some agencies are implementing machine learning models to help child welfare specialists screen cases and determine which to recommend for further investigation.

    But these models don’t do any good if the humans they are intended to help don’t understand or trust their outputs.

    Researchers at MIT and elsewhere launched a research project to identify and tackle machine learning usability challenges in child welfare screening. In collaboration with a child welfare department in Colorado, the researchers studied how call screeners assess cases, with and without the help of machine learning predictions. Based on feedback from the call screeners, they designed a visual analytics tool that uses bar graphs to show how specific factors of a case contribute to the predicted risk that a child will be removed from their home within two years.

    The researchers found that screeners are more interested in seeing how each factor, like the child’s age, influences a prediction, rather than understanding the computational basis of how the model works. Their results also show that even a simple model can cause confusion if its features are not described with straightforward language.

    These findings could be applied to other high-risk fields where humans use machine learning models to help them make decisions, but lack data science experience, says senior author Kalyan Veeramachaneni, principal research scientist in the Laboratory for Information and Decision Systems (LIDS) and senior author of the paper.

    “Researchers who study explainable AI, they often try to dig deeper into the model itself to explain what the model did. But a big takeaway from this project is that these domain experts don’t necessarily want to learn what machine learning actually does. They are more interested in understanding why the model is making a different prediction than what their intuition is saying, or what factors it is using to make this prediction. They want information that helps them reconcile their agreements or disagreements with the model, or confirms their intuition,” he says.

    Co-authors include electrical engineering and computer science PhD student Alexandra Zytek, who is the lead author; postdoc Dongyu Liu; and Rhema Vaithianathan, professor of economics and director of the Center for Social Data Analytics at the Auckland University of Technology and professor of social data analytics at the University of Queensland. The research will be presented later this month at the IEEE Visualization Conference.

    Real-world research

    The researchers began the study more than two years ago by identifying seven factors that make a machine learning model less usable, including lack of trust in where predictions come from and disagreements between user opinions and the model’s output.

    With these factors in mind, Zytek and Liu flew to Colorado in the winter of 2019 to learn firsthand from call screeners in a child welfare department. This department is implementing a machine learning system developed by Vaithianathan that generates a risk score for each report, predicting the likelihood the child will be removed from their home. That risk score is based on more than 100 demographic and historic factors, such as the parents’ ages and past court involvements.

    “As you can imagine, just getting a number between one and 20 and being told to integrate this into your workflow can be a bit challenging,” Zytek says.

    They observed how teams of screeners process cases in about 10 minutes and spend most of that time discussing the risk factors associated with the case. That inspired the researchers to develop a case-specific details interface, which shows how each factor influenced the overall risk score using color-coded, horizontal bar graphs that indicate the magnitude of the contribution in a positive or negative direction.

    Based on observations and detailed interviews, the researchers built four additional interfaces that provide explanations of the model, including one that compares a current case to past cases with similar risk scores. Then they ran a series of user studies.

    The studies revealed that more than 90 percent of the screeners found the case-specific details interface to be useful, and it generally increased their trust in the model’s predictions. On the other hand, the screeners did not like the case comparison interface. While the researchers thought this interface would increase trust in the model, screeners were concerned it could lead to decisions based on past cases rather than the current report.   

    “The most interesting result to me was that, the features we showed them — the information that the model uses — had to be really interpretable to start. The model uses more than 100 different features in order to make its prediction, and a lot of those were a bit confusing,” Zytek says.

    Keeping the screeners in the loop throughout the iterative process helped the researchers make decisions about what elements to include in the machine learning explanation tool, called Sibyl.

    As they refined the Sibyl interfaces, the researchers were careful to consider how providing explanations could contribute to some cognitive biases, and even undermine screeners’ trust in the model.

    For instance, since explanations are based on averages in a database of child abuse and neglect cases, having three past abuse referrals may actually decrease the risk score of a child, since averages in this database may be far higher. A screener may see that explanation and decide not to trust the model, even though it is working correctly, Zytek explains. And because humans tend to put more emphasis on recent information, the order in which the factors are listed could also influence decisions.

    Improving interpretability

    Based on feedback from call screeners, the researchers are working to tweak the explanation model so the features that it uses are easier to explain.

    Moving forward, they plan to enhance the interfaces they’ve created based on additional feedback and then run a quantitative user study to track the effects on decision making with real cases. Once those evaluations are complete, they can prepare to deploy Sibyl, Zytek says.

    “It was especially valuable to be able to work so actively with these screeners. We got to really understand the problems they faced. While we saw some reservations on their part, what we saw more of was excitement about how useful these explanations were in certain cases. That was really rewarding,” she says.

    This work is supported, in part, by the National Science Foundation. More

  • in

    One autonomous taxi, please

    If you don’t get seasick, an autonomous boat might be the right mode of transportation for you. 

    Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Senseable City Laboratory, together with Amsterdam Institute for Advanced Metropolitan Solutions (AMS Institute) in the Netherlands, have now created the final project in their self-navigating trilogy: a full-scale, fully autonomous robotic boat that’s ready to be deployed along the canals of Amsterdam. 

    “Roboat” has come a long way since the team first started prototyping small vessels in the MIT pool in late 2015. Last year, the team released their half-scale, medium model that was 2 meters long and demonstrated promising navigational prowess. 

    This year, two full-scale Roboats were launched, proving more than just proof-of-concept: these craft can comfortably carry up to five people, collect waste, deliver goods, and provide on-demand infrastructure. 

    The boat looks futuristic — it’s a sleek combination of black and gray with two seats that face each other, with orange block letters on the sides that illustrate the makers’ namesakes. It’s a fully electrical boat with a battery that’s the size of a small chest, enabling up to 10 hours of operation and wireless charging capabilities. 

    Play video

    Autonomous Roboats set sea in the Amsterdam canals and can comfortably carry up to five people, collect waste, deliver goods, and provide on-demand infrastructure.

    “We now have higher precision and robustness in the perception, navigation, and control systems, including new functions, such as close-proximity approach mode for latching capabilities, and improved dynamic positioning, so the boat can navigate real-world waters,” says Daniela Rus, MIT professor of electrical engineering and computer science and director of CSAIL. “Roboat’s control system is adaptive to the number of people in the boat.” 

    To swiftly navigate the bustling waters of Amsterdam, Roboat needs a meticulous fusion of proper navigation, perception, and control software. 

    Using GPS, the boat autonomously decides on a safe route from A to B, while continuously scanning the environment to  avoid collisions with objects, such as bridges, pillars, and other boats.

    To autonomously determine a free path and avoid crashing into objects, Roboat uses lidar and a number of cameras to enable a 360-degree view. This bundle of sensors is referred to as the “perception kit” and lets Roboat understand its surroundings. When the perception picks up an unseen object, like a canoe, for example, the algorithm flags the item as “unknown.” When the team later looks at the collected data from the day, the object is manually selected and can be tagged as “canoe.” 

    The control algorithms — similar to ones used for self-driving cars — function a little like a coxswain giving orders to rowers, by translating a given path into instructions toward the “thrusters,” which are the propellers that help the boat move.  

    If you think the boat feels slightly futuristic, its latching mechanism is one of its most impressive feats: small cameras on the boat guide it to the docking station, or other boats, when they detect specific QR codes. “The system allows Roboat to connect to other boats, and to the docking station, to form temporary bridges to alleviate traffic, as well as floating stages and squares, which wasn’t possible with the last iteration,” says Carlo Ratti, professor of the practice in the MIT Department of Urban Studies and Planning (DUSP) and director of the Senseable City Lab. 

    Roboat, by design, is also versatile. The team created a universal “hull” design — that’s the part of the boat that rides both in and on top of the water. While regular boats have unique hulls, designed for specific purposes, Roboat has a universal hull design where the base is the same, but the top decks can be switched out depending on the use case.

    “As Roboat can perform its tasks 24/7, and without a skipper on board, it adds great value for a city. However, for safety reasons it is questionable if reaching level A autonomy is desirable,” says Fabio Duarte, a principal research scientist in DUSP and lead scientist on the project. “Just like a bridge keeper, an onshore operator will monitor Roboat remotely from a control center. One operator can monitor over 50 Roboat units, ensuring smooth operations.”

    The next step for Roboat is to pilot the technology in the public domain. “The historic center of Amsterdam is the perfect place to start, with its capillary network of canals suffering from contemporary challenges, such as mobility and logistics,” says Stephan van Dijk, director of innovation at AMS Institute. 

    Previous iterations of Roboat have been presented at the IEEE International Conference on Robotics and Automation. The boats will be unveiled on Oct. 28 in the waters of Amsterdam. 

    Ratti, Rus, Duarte, and Dijk worked on the project alongside Andrew Whittle, MIT’s Edmund K Turner Professor in civil and environmental engineering; Dennis Frenchman, professor at MIT’s Department of Urban Studies and Planning; and Ynse Deinema of AMS Institute. The full team can be found at Roboat’s website. The project is a joint collaboration with AMS Institute. The City of Amsterdam is a project partner. More

  • in

    Deep learning helps predict traffic crashes before they happen

    Today’s world is one big maze, connected by layers of concrete and asphalt that afford us the luxury of navigation by vehicle. For many of our road-related advancements — GPS lets us fire fewer neurons thanks to map apps, cameras alert us to potentially costly scrapes and scratches, and electric autonomous cars have lower fuel costs — our safety measures haven’t quite caught up. We still rely on a steady diet of traffic signals, trust, and the steel surrounding us to safely get from point A to point B. 

    To get ahead of the uncertainty inherent to crashes, scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Qatar Center for Artificial Intelligence developed a deep learning model that predicts very high-resolution crash risk maps. Fed on a combination of historical crash data, road maps, satellite imagery, and GPS traces, the risk maps describe the expected number of crashes over a period of time in the future, to identify high-risk areas and predict future crashes. 

    Typically, these types of risk maps are captured at much lower resolutions that hover around hundreds of meters, which means glossing over crucial details since the roads become blurred together. These maps, though, are 5×5 meter grid cells, and the higher resolution brings newfound clarity: The scientists found that a highway road, for example, has a higher risk than nearby residential roads, and ramps merging and exiting the highway have an even higher risk than other roads. 

    “By capturing the underlying risk distribution that determines the probability of future crashes at all places, and without any historical data, we can find safer routes, enable auto insurance companies to provide customized insurance plans based on driving trajectories of customers, help city planners design safer roads, and even predict future crashes,” says MIT CSAIL PhD student Songtao He, a lead author on a new paper about the research. 

    Even though car crashes are sparse, they cost about 3 percent of the world’s GDP and are the leading cause of death in children and young adults. This sparsity makes inferring maps at such a high resolution a tricky task. Crashes at this level are thinly scattered — the average annual odds of a crash in a 5×5 grid cell is about one-in-1,000 — and they rarely happen at the same location twice. Previous attempts to predict crash risk have been largely “historical,” as an area would only be considered high-risk if there was a previous nearby crash. 

    The team’s approach casts a wider net to capture critical data. It identifies high-risk locations using GPS trajectory patterns, which give information about density, speed, and direction of traffic, and satellite imagery that describes road structures, such as the number of lanes, whether there’s a shoulder, or if there’s a large number of pedestrians. Then, even if a high-risk area has no recorded crashes, it can still be identified as high-risk, based on its traffic patterns and topology alone. 

    To evaluate the model, the scientists used crashes and data from 2017 and 2018, and tested its performance at predicting crashes in 2019 and 2020. Many locations were identified as high-risk, even though they had no recorded crashes, and also experienced crashes during the follow-up years.

    “Our model can generalize from one city to another by combining multiple clues from seemingly unrelated data sources. This is a step toward general AI, because our model can predict crash maps in uncharted territories,” says Amin Sadeghi, a lead scientist at Qatar Computing Research Institute (QCRI) and an author on the paper. “The model can be used to infer a useful crash map even in the absence of historical crash data, which could translate to positive use for city planning and policymaking by comparing imaginary scenarios.” 

    The dataset covered 7,500 square kilometers from Los Angeles, New York City, Chicago and Boston. Among the four cities, L.A. was the most unsafe, since it had the highest crash density, followed by New York City, Chicago, and Boston. 

    “If people can use the risk map to identify potentially high-risk road segments, they can take action in advance to reduce the risk of trips they take. Apps like Waze and Apple Maps have incident feature tools, but we’re trying to get ahead of the crashes — before they happen,” says He. 

    He and Sadeghi wrote the paper alongside Sanjay Chawla, research director at QCRI, and MIT professors of electrical engineering and computer science Mohammad Alizadeh, ​​Hari Balakrishnan, and Sam Madden. They will present the paper at the 2021 International Conference on Computer Vision. More

  • in

    Making data visualizations more accessible

    In the early days of the Covid-19 pandemic, the Centers for Disease Control and Prevention produced a simple chart to illustrate how measures like mask wearing and social distancing could “flatten the curve” and reduce the peak of infections.

    The chart was amplified by news sites and shared on social media platforms, but it often lacked a corresponding text description to make it accessible for blind individuals who use a screen reader to navigate the web, shutting out many of the 253 million people worldwide who have visual disabilities.

    This alternative text is often missing from online charts, and even when it is included, it is frequently uninformative or even incorrect, according to qualitative data gathered by scientists at MIT.

    These researchers conducted a study with blind and sighted readers to determine which text is useful to include in a chart description, which text is not, and why. Ultimately, they found that captions for blind readers should focus on the overall trends and statistics in the chart, not its design elements or higher-level insights.

    They also created a conceptual model that can be used to evaluate a chart description, whether the text was generated automatically by software or manually by a human author. Their work could help journalists, academics, and communicators create descriptions that are more effective for blind individuals and guide researchers as they develop better tools to automatically generate captions.

    “Ninety-nine-point-nine percent of images on Twitter lack any kind of description — and that is not hyperbole, that is the actual statistic,” says Alan Lundgard, a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author of the paper. “Having people manually author those descriptions seems to be difficult for a variety of reasons. Perhaps semiautonomous tools could help with that. But it is crucial to do this preliminary participatory design work to figure out what is the target for these tools, so we are not generating content that is either not useful to its intended audience or, in the worst case, erroneous.”

    Lundgard wrote the paper with senior author Arvind Satyanarayan, an assistant professor of computer science who leads the Visualization Group in CSAIL. The research will be presented at the Institute of Electrical and Electronics Engineers Visualization Conference in October.

    Evaluating visualizations

    To develop the conceptual model, the researchers planned to begin by studying graphs featured by popular online publications such as FiveThirtyEight and NYTimes.com, but they ran into a problem — those charts mostly lacked any textual descriptions. So instead, they collected descriptions for these charts from graduate students in an MIT data visualization class and through an online survey, then grouped the captions into four categories.

    Level 1 descriptions focus on the elements of the chart, such as its title, legend, and colors. Level 2 descriptions describe statistical content, like the minimum, maximum, or correlations. Level 3 descriptions cover perceptual interpretations of the data, like complex trends or clusters. Level 4 descriptions include subjective interpretations that go beyond the data and draw on the author’s knowledge.

    In a study with blind and sighted readers, the researchers presented visualizations with descriptions at different levels and asked participants to rate how useful they were. While both groups agreed that level 1 content on its own was not very helpful, sighted readers gave level 4 content the highest marks while blind readers ranked that content among the least useful.

    Survey results revealed that a majority of blind readers were emphatic that descriptions should not contain an author’s editorialization, but rather stick to straight facts about the data. On the other hand, most sighted readers preferred a description that told a story about the data.

    “For me, a surprising finding about the lack of utility for the highest-level content is that it ties very closely to feelings about agency and control as a disabled person. In our research, blind readers specifically didn’t want the descriptions to tell them what to think about the data. They want the data to be accessible in a way that allows them to interpret it for themselves, and they want to have the agency to do that interpretation,” Lundgard says.

    A more inclusive future

    This work could have implications as data scientists continue to develop and refine machine learning methods for autogenerating captions and alternative text.

    “We are not able to do it yet, but it is not inconceivable to imagine that in the future we would be able to automate the creation of some of this higher-level content and build models that target level 2 or level 3 in our framework. And now we know what the research questions are. If we want to produce these automated captions, what should those captions say? We are able to be a bit more directed in our future research because we have these four levels,” Satyanarayan says.

    In the future, the four-level framework could also help researchers develop machine learning models that can automatically suggest effective visualizations as part of the data analysis process, or models that can extract the most useful information from a chart.

    This research could also inform future work in Satyanarayan’s group that seeks to make interactive visualizations more accessible for blind readers who use a screen reader to access and interpret the information. 

    “The question of how to ensure that charts and graphs are accessible to screen reader users is both a socially important equity issue and a challenge that can advance the state-of-the-art in AI,” says Meredith Ringel Morris, director and principal scientist of the People + AI Research team at Google Research, who was not involved with this study. “By introducing a framework for conceptualizing natural language descriptions of information graphics that is grounded in end-user needs, this work helps ensure that future AI researchers will focus their efforts on problems aligned with end-users’ values.”

    Morris adds: “Rich natural-language descriptions of data graphics will not only expand access to critical information for people who are blind, but will also benefit a much wider audience as eyes-free interactions via smart speakers, chatbots, and other AI-powered agents become increasingly commonplace.”

    This research was supported by the National Science Foundation. More

  • in

    Enabling AI-driven health advances without sacrificing patient privacy

    There’s a lot of excitement at the intersection of artificial intelligence and health care. AI has already been used to improve disease treatment and detection, discover promising new drugs, identify links between genes and diseases, and more.

    By analyzing large datasets and finding patterns, virtually any new algorithm has the potential to help patients — AI researchers just need access to the right data to train and test those algorithms. Hospitals, understandably, are hesitant to share sensitive patient information with research teams. When they do share data, it’s difficult to verify that researchers are only using the data they need and deleting it after they’re done.

    Secure AI Labs (SAIL) is addressing those problems with a technology that lets AI algorithms run on encrypted datasets that never leave the data owner’s system. Health care organizations can control how their datasets are used, while researchers can protect the confidentiality of their models and search queries. Neither party needs to see the data or the model to collaborate.

    SAIL’s platform can also combine data from multiple sources, creating rich insights that fuel more effective algorithms.

    “You shouldn’t have to schmooze with hospital executives for five years before you can run your machine learning algorithm,” says SAIL co-founder and MIT Professor Manolis Kellis, who co-founded the company with CEO Anne Kim ’16, SM ’17. “Our goal is to help patients, to help machine learning scientists, and to create new therapeutics. We want new algorithms — the best algorithms — to be applied to the biggest possible data set.”

    SAIL has already partnered with hospitals and life science companies to unlock anonymized data for researchers. In the next year, the company hopes to be working with about half of the top 50 academic medical centers in the country.

    Unleashing AI’s full potential

    As an undergraduate at MIT studying computer science and molecular biology, Kim worked with researchers in the Computer Science and Artificial Intelligence Laboratory (CSAIL) to analyze data from clinical trials, gene association studies, hospital intensive care units, and more.

    “I realized there is something severely broken in data sharing, whether it was hospitals using hard drives, ancient file transfer protocol, or even sending stuff in the mail,” Kim says. “It was all just not well-tracked.”

    Kellis, who is also a member of the Broad Institute of MIT and Harvard, has spent years establishing partnerships with hospitals and consortia across a range of diseases including cancers, heart disease, schizophrenia, and obesity. He knew that smaller research teams would struggle to get access to the same data his lab was working with.

    In 2017, Kellis and Kim decided to commercialize technology they were developing to allow AI algorithms to run on encrypted data.

    In the summer of 2018, Kim participated in the delta v startup accelerator run by the Martin Trust Center for MIT Entrepreneurship. The founders also received support from the Sandbox Innovation Fund and the Venture Mentoring Service, and made various early connections through their MIT network.

    To participate in SAIL’s program, hospitals and other health care organizations make parts of their data available to researchers by setting up a node behind their firewall. SAIL then sends encrypted algorithms to the servers where the datasets reside in a process called federated learning. The algorithms crunch the data locally in each server and transmit the results back to a central model, which updates itself. No one — not the researchers, the data owners, or even SAIL —has access to the models or the datasets.

    The approach allows a much broader set of researchers to apply their models to large datasets. To further engage the research community, Kellis’ lab at MIT has begun holding competitions in which it gives access to datasets in areas like protein function and gene expression, and challenges researchers to predict results.

    “We invite machine learning researchers to come and train on last year’s data and predict this year’s data,” says Kellis. “If we see there’s a new type of algorithm that is performing best in these community-level assessments, people can adopt it locally at many different institutions and level the playing field. So, the only thing that matters is the quality of your algorithm rather than the power of your connections.”

    By enabling a large number of datasets to be anonymized into aggregate insights, SAIL’s technology also allows researchers to study rare diseases, in which small pools of relevant patient data are often spread out among many institutions. That has historically made the data difficult to apply AI models to.

    “We’re hoping that all of these datasets will eventually be open,” Kellis says. “We can cut across all the silos and enable a new era where every patient with every rare disorder across the entire world can come together in a single keystroke to analyze data.”

    Enabling the medicine of the future

    To work with large amounts of data around specific diseases, SAIL has increasingly sought to partner with patient associations and consortia of health care groups, including an international health care consulting company and the Kidney Cancer Association. The partnerships also align SAIL with patients, the group they’re most trying to help.

    Overall, the founders are happy to see SAIL solving problems they faced in their labs for researchers around the world.

    “The right place to solve this is not an academic project. The right place to solve this is in industry, where we can provide a platform not just for my lab but for any researcher,” Kellis says. “It’s about creating an ecosystem of academia, researchers, pharma, biotech, and hospital partners. I think it’s the blending all of these different areas that will make that vision of medicine of the future become a reality.” More

  • in

    3 Questions: Kalyan Veeramachaneni on hurdles preventing fully automated machine learning

    The proliferation of big data across domains, from banking to health care to environmental monitoring, has spurred increasing demand for machine learning tools that help organizations make decisions based on the data they gather.

    That growing industry demand has driven researchers to explore the possibilities of automated machine learning (AutoML), which seeks to automate the development of machine learning solutions in order to make them accessible for nonexperts, improve their efficiency, and accelerate machine learning research. For example, an AutoML system might enable doctors to use their expertise interpreting electroencephalography (EEG) results to build a model that can predict which patients are at higher risk for epilepsy — without requiring the doctors to have a background in data science.

    Yet, despite more than a decade of work, researchers have been unable to fully automate all steps in the machine learning development process. Even the most efficient commercial AutoML systems still require a prolonged back-and-forth between a domain expert, like a marketing manager or mechanical engineer, and a data scientist, making the process inefficient.

    Kalyan Veeramachaneni, a principal research scientist in the MIT Laboratory for Information and Decision Systems who has been studying AutoML since 2010, has co-authored a paper in the journal ACM Computing Surveys that details a seven-tiered schematic to evaluate AutoML tools based on their level of autonomy.

    A system at level zero has no automation and requires a data scientist to start from scratch and build models by hand, while a tool at level six is completely automated and can be easily and effectively used by a nonexpert. Most commercial systems fall somewhere in the middle.

    Veeramachaneni spoke with MIT News about the current state of AutoML, the hurdles that prevent truly automatic machine learning systems, and the road ahead for AutoML researchers.

    Q: How has automatic machine learning evolved over the past decade, and what is the current state of AutoML systems?

    A: In 2010, we started to see a shift, with enterprises wanting to invest in getting value out of their data beyond just business intelligence. So then came the question, maybe there are certain things in the development of machine learning-based solutions that we can automate? The first iteration of AutoML was to make our own jobs as data scientists more efficient. Can we take away the grunt work that we do on a day-to-day basis and automate that by using a software system? That area of research ran its course until about 2015, when we realized we still weren’t able to speed up this development process.

    Then another thread emerged. There are a lot of problems that could be solved with data, and they come from experts who know those problems, who live with them on a daily basis. These individuals have very little to do with machine learning or software engineering. How do we bring them into the fold? That is really the next frontier.

    There are three areas where these domain experts have strong input in a machine learning system. The first is defining the problem itself and then helping to formulate it as a prediction task to be solved by a machine learning model. Second, they know how the data have been collected, so they also know intuitively how to process that data. And then third, at the end, machine learning models only give you a very tiny part of a solution — they just give you a prediction. The output of a machine learning model is just one input to help a domain expert get to a decision or action.

    Q: What steps of the machine learning pipeline are the most difficult to automate, and why has automating them been so challenging?

    A: The problem-formulation part is extremely difficult to automate. For example, if I am a researcher who wants to get more government funding, and I have a lot of data about the content of the research proposals that I write and whether or not I receive funding, can machine learning help there? We don’t know yet. In problem formulation, I use my domain expertise to translate the problem into something that is more tangible to predict, and that requires somebody who knows the domain very well. And he or she also knows how to use that information post-prediction. That problem is refusing to be automated.

    There is one part of problem-formulation that could be automated. It turns out that we can look at the data and mathematically express several possible prediction tasks automatically. Then we can share those prediction tasks with the domain expert to see if any of them would help in the larger problem they are trying to tackle. Then once you pick the prediction task, there are a lot of intermediate steps you do, including feature engineering, modeling, etc., that are very mechanical steps and easy to automate.

    But defining the prediction tasks has typically been a collaborative effort between data scientists and domain experts because, unless you know the domain, you can’t translate the domain problem into a prediction task. And then sometimes domain experts don’t know what is meant by “prediction.” That leads to the major, significant back and forth in the process. If you automate that step, then machine learning penetration and the use of data to create meaningful predictions will increase tremendously.

    Then what happens after the machine learning model gives a prediction? We can automate the software and technology part of it, but at the end of the day, it is root cause analysis and human intuition and decision making. We can augment them with a lot of tools, but we can’t fully automate that.

    Q: What do you hope to achieve with the seven-tiered framework for evaluating AutoML systems that you outlined in your paper?

    A: My hope is that people start to recognize that some levels of automation have already been achieved and some still need to be tackled. In the research community, we tend to focus on what we are comfortable with. We have gotten used to automating certain steps, and then we just stick to it. Automating these other parts of the machine learning solution development is very important, and that is where the biggest bottlenecks remain.

    My second hope is that researchers will very clearly understand what domain expertise means. A lot of this AutoML work is still being conducted by academics, and the problem is that we often don’t do applied work. There is not a crystal-clear definition of what a domain expert is and in itself, “domain expert,” is a very nebulous phrase. What we mean by domain expert is the expert in the problem you are trying to solve with machine learning. And I am hoping that everyone unifies around that because that would make things so much clearer.

    I still believe that we are not able to build that many models for that many problems, but even for the ones that we are building, the majority of them are not getting deployed and used in day-to-day life. The output of machine learning is just going to be another data point, an augmented data point, in someone’s decision making. How they make those decisions, based on that input, how that will change their behavior, and how they will adapt their style of working, that is still a big, open question. Once we automate everything, that is what’s next.

    We have to determine what has to fundamentally change in the day-to-day workflow of someone giving loans at a bank, or an educator trying to decide whether he or she should change the assignments in an online class. How are they going to use machine learning’s outputs? We need to focus on the fundamental things we have to build out to make machine learning more usable. More

  • in

    Study: Global cancer risk from burning organic matter comes from unregulated chemicals

    Whenever organic matter is burned, such as in a wildfire, a power plant, a car’s exhaust, or in daily cooking, the combustion releases polycyclic aromatic hydrocarbons (PAHs) — a class of pollutants that is known to cause lung cancer.

    There are more than 100 known types of PAH compounds emitted daily into the atmosphere. Regulators, however, have historically relied on measurements of a single compound, benzo(a)pyrene, to gauge a community’s risk of developing cancer from PAH exposure. Now MIT scientists have found that benzo(a)pyrene may be a poor indicator of this type of cancer risk.

    In a modeling study appearing today in the journal GeoHealth, the team reports that benzo(a)pyrene plays a small part — about 11 percent — in the global risk of developing PAH-associated cancer. Instead, 89 percent of that cancer risk comes from other PAH compounds, many of which are not directly regulated.

    Interestingly, about 17 percent of PAH-associated cancer risk comes from “degradation products” — chemicals that are formed when emitted PAHs react in the atmosphere. Many of these degradation products can in fact be more toxic than the emitted PAH from which they formed.

    The team hopes the results will encourage scientists and regulators to look beyond benzo(a)pyrene, to consider a broader class of PAHs when assessing a community’s cancer risk.

    “Most of the regulatory science and standards for PAHs are based on benzo(a)pyrene levels. But that is a big blind spot that could lead you down a very wrong path in terms of assessing whether cancer risk is improving or not, and whether it’s relatively worse in one place than another,” says study author Noelle Selin, a professor in MIT’s Institute for Data, Systems and Society, and the Department of Earth, Atmospheric and Planetary Sciences.

    Selin’s MIT co-authors include Jesse Kroll, Amy Hrdina, Ishwar Kohale, Forest White, and Bevin Engelward, and Jamie Kelly (who is now at University College London). Peter Ivatt and Mathew Evans at the University of York are also co-authors.

    Chemical pixels

    Benzo(a)pyrene has historically been the poster chemical for PAH exposure. The compound’s indicator status is largely based on early toxicology studies. But recent research suggests the chemical may not be the PAH representative that regulators have long relied upon.   

    “There has been a bit of evidence suggesting benzo(a)pyrene may not be very important, but this was from just a few field studies,” says Kelly, a former postdoc in Selin’s group and the study’s lead author.

    Kelly and his colleagues instead took a systematic approach to evaluate benzo(a)pyrene’s suitability as a PAH indicator. The team began by using GEOS-Chem, a global, three-dimensional chemical transport model that breaks the world into individual grid boxes and simulates within each box the reactions and concentrations of chemicals in the atmosphere.

    They extended this model to include chemical descriptions of how various PAH compounds, including benzo(a)pyrene, would react in the atmosphere. The team then plugged in recent data from emissions inventories and meteorological observations, and ran the model forward to simulate the concentrations of various PAH chemicals around the world over time.

    Risky reactions

    In their simulations, the researchers started with 16 relatively well-studied PAH chemicals, including benzo(a)pyrene, and traced the concentrations of these chemicals, plus the concentration of their degradation products over two generations, or chemical transformations. In total, the team evaluated 48 PAH species.

    They then compared these concentrations with actual concentrations of the same chemicals, recorded by monitoring stations around the world. This comparison was close enough to show that the model’s concentration predictions were realistic.

    Then within each model’s grid box, the researchers related the concentration of each PAH chemical to its associated cancer risk; to do this, they had to develop a new method based on previous studies in the literature to avoid double-counting risk from the different chemicals. Finally, they overlaid population density maps to predict the number of cancer cases globally, based on the concentration and toxicity of a specific PAH chemical in each location.

    Dividing the cancer cases by population produced the cancer risk associated with that chemical. In this way, the team calculated the cancer risk for each of the 48 compounds, then determined each chemical’s individual contribution to the total risk.

    This analysis revealed that benzo(a)pyrene had a surprisingly small contribution, of about 11 percent, to the overall risk of developing cancer from PAH exposure globally. Eighty-nine percent of cancer risk came from other chemicals. And 17 percent of this risk arose from degradation products.

    “We see places where you can find concentrations of benzo(a)pyrene are lower, but the risk is higher because of these degradation products,” Selin says. “These products can be orders of magnitude more toxic, so the fact that they’re at tiny concentrations doesn’t mean you can write them off.”

    When the researchers compared calculated PAH-associated cancer risks around the world, they found significant differences depending on whether that risk calculation was based solely on concentrations of benzo(a)pyrene or on a region’s broader mix of PAH compounds.

    “If you use the old method, you would find the lifetime cancer risk is 3.5 times higher in Hong Kong versus southern India, but taking into account the differences in PAH mixtures, you get a difference of 12 times,” Kelly says. “So, there’s a big difference in the relative cancer risk between the two places. And we think it’s important to expand the group of compounds that regulators are thinking about, beyond just a single chemical.”

    The team’s study “provides an excellent contribution to better understanding these ubiquitous pollutants,” says Elisabeth Galarneau, an air quality expert and PhD research scientist in Canada’s Department of the Environment. “It will be interesting to see how these results compare to work being done elsewhere … to pin down which (compounds) need to be tracked and considered for the protection of human and environmental health.”

    This research was conducted in MIT’s Superfund Research Center and is supported in part by the National Institute of Environmental Health Sciences Superfund Basic Research Program, and the National Institutes of Health. More