More stories

  • in

    Computational modeling guides development of new materials

    Metal-organic frameworks, a class of materials with porous molecular structures, have a variety of possible applications, such as capturing harmful gases and catalyzing chemical reactions. Made of metal atoms linked by organic molecules, they can be configured in hundreds of thousands of different ways.

    To help researchers sift through all of the possible metal-organic framework (MOF) structures and help identify the ones that would be most practical for a particular application, a team of MIT computational chemists has developed a model that can analyze the features of a MOF structure and predict if it will be stable enough to be useful.

    The researchers hope that these computational predictions will help cut the development time of new MOFs.

    “This will allow researchers to test the promise of specific materials before they go through the trouble of synthesizing them,” says Heather Kulik, an associate professor of chemical engineering at MIT.

    The MIT team is now working to develop MOFs that could be used to capture methane gas and convert it to useful compounds such as fuels.

    The researchers described their new model in two papers, one in the Journal of the American Chemical Society and one in Scientific Data. Graduate students Aditya Nandy and Gianmarco Terrones are the lead authors of the Scientific Data paper, and Nandy is also the lead author of the JACS paper. Kulik is the senior author of both papers.

    Modeling structure

    MOFs consist of metal atoms joined by organic molecules called linkers to create a rigid, cage-like structure. The materials also have many pores, which makes them useful for catalyzing reactions involving gases but can also make them less structurally stable.

    “The limitation in seeing MOFs realized at industrial scale is that although we can control their properties by controlling where each atom is in the structure, they’re not necessarily that stable, as far as materials go,” Kulik says. “They’re very porous and they can degrade under realistic conditions that we need for catalysis.”

    Scientists have been working on designing MOFs for more than 20 years, and thousands of possible structures have been published. A centralized repository contains about 10,000 of these structures but is not linked to any of the published findings on the properties of those structures.

    Kulik, who specializes in using computational modeling to discover structure-property relationships of materials, wanted to take a more systematic approach to analyzing and classifying the properties of MOFs.

    “When people make these now, it’s mostly trial and error. The MOF dataset is really promising because there are so many people excited about MOFs, so there’s so much to learn from what everyone’s been working on, but at the same time, it’s very noisy and it’s not systematic the way it’s reported,” she says.

    Kulik and her colleagues set out to analyze published reports of MOF structures and properties using a natural-language-processing algorithm. Using this algorithm, they scoured nearly 4,000 published papers, extracting information on the temperature at which a given MOF would break down. They also pulled out data on whether particular MOFs can withstand the conditions needed to remove solvents used to synthesize them and make sure they become porous.

    Once the researchers had this information, they used it to train two neural networks to predict MOFs’ thermal stability and stability during solvent removal, based on the molecules’ structure.

    “Before you start working with a material and thinking about scaling it up for different applications, you want to know will it hold up, or is it going to degrade in the conditions I would want to use it in?” Kulik says. “Our goal was to get better at predicting what makes a stable MOF.”

    Better stability

    Using the model, the researchers were able to identify certain features that influence stability. In general, simpler linkers with fewer chemical groups attached to them are more stable. Pore size is also important: Before the researchers did their analysis, it had been thought that MOFs with larger pores might be too unstable. However, the MIT team found that large-pore MOFs can be stable if other aspects of their structure counteract the large pore size.

    “Since MOFs have so many things that can vary at the same time, such as the metal, the linkers, the connectivity, and the pore size, it is difficult to nail down what governs stability across different families of MOFs,” Nandy says. “Our models enable researchers to make predictions on existing or new materials, many of which have yet to be made.”

    The researchers have made their data and models available online. Scientists interested in using the models can get recommendations for strategies to make an existing MOF more stable, and they can also add their own data and feedback on the predictions of the models.

    The MIT team is now using the model to try to identify MOFs that could be used to catalyze the conversion of methane gas to methanol, which could be used as fuel. Kulik also plans to use the model to create a new dataset of hypothetical MOFs that haven’t been built before but are predicted to have high stability. Researchers could then screen this dataset for a variety of properties.

    “People are interested in MOFs for things like quantum sensing and quantum computing, all sorts of different applications where you need metals distributed in this atomically precise way,” Kulik says.

    The research was funded by DARPA, the U.S. Office of Naval Research, the U.S. Department of Energy, a National Science Foundation Graduate Research Fellowship, a Career Award at the Scientific Interface from the Burroughs Wellcome Fund, and an AAAS Marion Milligan Mason Award. More

  • in

    MIT ReACT welcomes first Afghan cohort to its largest-yet certificate program

    Through the championing support of the faculty and leadership of the MIT Afghan Working Group convened last September by Provost Martin Schmidt and chaired by Associate Provost for International Activities Richard Lester, MIT has come together to support displaced Afghan learners and scholars in a time of crisis. The MIT Refugee Action Hub (ReACT) has opened opportunities for 25 talented Afghan learners to participate in the hub’s certificate program in computer and data science (CDS), now in its fourth year, welcoming its largest and most diverse cohort to date — 136 learners from 29 countries.

    ”Even in the face of extreme disruption, education and scholarship must continue, and MIT is committed to providing resources and safe forums for displaced scholars,” says Lester. “We greatly appreciate MIT ReACT’s work to create learning opportunities for Afghan students whose lives have been upended by the crisis in their homeland.”

    Currently, more than 3.5 million Afghans are internally displaced, while 2.5 million are registered refugees residing in other parts of the world. With millions in Afghanistan facing famine, poverty, and civil unrest in what has become the world’s largest humanitarian crisis, the United Nations predicts the number of Afghans forced to flee their homes will continue to rise. 

    “Forced displacement is on the rise, fueled not only by constant political, economical, and social turmoil worldwide, but also by the ongoing climate change crisis, which threatens costly disruptions to society and has potential to create unprecedented displacement internationally,” says associate professor of civil and environmental engineering and ReACT’s faculty founder Admir Masic. During the orientation for the new CDS cohort in January, Masic emphasized the great need for educational programs like ReACT’s that address the specific challenges refugees and displaced learners face.

    A former Bosnian refugee, Masic spent his teenage years in Croatia, where educational opportunities were limited for young people with refugee status. His experience motivated him to found ReACT, which launched in 2017. Housed within Open Learning, ReACT is an MIT-wide effort to deliver global education and professional development programs to underserved communities, including refugees and migrants. ReACT’s signature program, CDS is a year-long, online program that combines MITx courses in programming and data science, personal and professional development workshops including MIT Bootcamps, and opportunities for practical experience.

    ReACT’s group of 25 learners from Afghanistan, 52 percent of whom are women, joins the larger CDS cohort in the program. They will receive support from their new colleagues as well as members of ReACT’s mentor and alumni network. While the majority of the group are residing around the world, including in Europe, North America, and neighboring countries, several still remain in Afghanistan. With the support of the Afghan Working Group, ReACT is working to connect with communities from the region to provide safe and inclusive learning environments for the cohort. ​​

    Building community and confidence

    Selected from more than 1,000 applicants, the new CDS cohort reflected on their personal and professional goals during a weeklong orientation.

    “I am here because I want to change my career and learn basics in this field to then obtain networks that I wouldn’t have got if it weren’t for this program,” said Samiullah Ajmal, who is joining the program from Afghanistan.

    Interactive workshops on topics such as leadership development and virtual networking rounded out the week’s events. Members of ReACT’s greater community — which has grown in recent years to include a network of external collaborators including nonprofits, philanthropic supporters, universities, and alumni — helped facilitate these workshops and other orientation activities.

    For instance, Na’amal, a social enterprise that connects refugees to remote work opportunities, introduced the CDS learners to strategies for making career connections remotely. “We build confidence while doing,” says Susan Mulholland, a leadership and development coach with Na’amal who led the networking workshop.

    Along with the CDS program’s cohort-based model, ReACT also uses platforms that encourage regular communication between participants and with the larger ReACT network — making connections a critical component of the program.

    “I not only want to meet new people and make connections for my professional career, but I also want to test my communication and social skills,” says Pablo Andrés Uribe, a learner who lives in Colombia, describing ReACT’s emphasis on community-building. 

    Over the last two years, ReACT has expanded its geographic presence, growing from a hub in Jordan into a robust global community of many hubs, including in Colombia and Uganda. These regional sites connect talented refugees and displaced learners to internships and employment, startup networks and accelerators, and pathways to formal undergraduate and graduate education.

    This expansion is thanks to the generous support internally from the MIT Office of the Provost and Associate Provost Richard Lester and external organizations including the Western Union Foundation. ReACT will build new hubs this year in Greece, Uruguay, and Afghanistan, as a result of gifts from the Hatsopoulos family and the Pfeffer family.

    Holding space to learn from each other

    In addition to establishing new global hubs, ReACT plans to expand its network of internship and experiential learning opportunities, increasing outreach to new collaborators such as nongovernmental organizations (NGOs), companies, and universities. Jointly with Na’amal and Paper Airplanes, a nonprofit that connects conflict-affected individuals with personal language tutors, ReACT will host the first Migration Summit. Scheduled for April 2022, the month-long global convening invites a broad range of participants, including displaced learners, universities, companies, nonprofits and NGOs, social enterprises, foundations, philanthropists, researchers, policymakers, employers, and governments, to address the key challenges and opportunities for refugee and migrant communities. The theme of the summit is “Education and Workforce Development in Displacement.”

    “The MIT Migration Summit offers a platform to discuss how new educational models, such as those employed in ReACT, can help solve emerging challenges in providing quality education and career opportunities to forcibly displaced and marginalized people around the world,” says Masic. 

    A key goal of the convening is to center the voices of those most directly impacted by displacement, such as ReACT’s learners from Afghanistan and elsewhere, in solution-making. More

  • in

    Devavrat Shah appointed faculty director of the Deshpande Center

    Devavrat Shah, the Andrew (1956) and Erna Viterbi Professor in the Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society, has been named faculty director of the MIT Deshpande Center for Technological Innovations. The new role took effect on Feb. 1.

    Shah replaces Tim Swager, the John D. MacArthur Professor of Chemistry, who has held the position of faculty director since 2014. Working alongside Executive Director to the Deshpande Center Leon Sandler, Swager helped the Deshpande Center build an inclusive environment where innovation and entrepreneurship could thrive. By examining new models for directing, seeding, and fostering the commercialization of inventions and technology, Swager helped students and faculty breathe life into research, propelling it out of the lab and into the world as successful ventures.

    The MIT Deshpande Center for Technological Innovations is an interdepartmental center working to empower MIT’s most talented students and faculty by helping them bring new innovative technologies from the lab to the marketplace in the form of breakthrough products and new companies. Desh Deshpande founded the center with his wife, in 2002.

    “Professor Shah’s deep entrepreneurial experience coupled with his research on large complex networks will be tremendous assets to the center,” says Deshpande. “Devavrat is an impactful educator and inspiring mentor who will play a key role in the center’s mission to foster innovation and accelerate the impact of new discoveries.”

    Shah joined the Department of Electrical Engineering and Computer Science in 2005. With research focusing on statistical inference and stochastic networks, his research contributions span a variety of areas including resource allocation in communications networks, inference and learning on graphical models, algorithms for social data processing including ranking, recommendations and crowdsourcing, and more recently, causal inference using observational and experimental data.  

    While Shah’s work spans a range of areas across electrical engineering, computer science, and operations research, they are all tied together with the singular focus on developing algorithmic solutions for practical, challenging problems. He’s also authored two books, one on gossip algorithms in 2006 and the other on prediction methods of nearest neighbors in 2018. 

    A highly regarded teacher, Shah has been very active in curriculum development — most notably class 6.438 (Algorithms for Inference) and class 6.401 (Introduction to Statistical Data Analysis) — and has taken a leading role in developing educational programs in the statistics and data science at MIT as part of the Statistics and Data Science Center within the Institute for Data, Systems, and Society.

    “With his experience and contributions as a researcher, educator, and innovator, I have no doubt that Devavrat will excel as the next faculty director of the Deshpande Center and help usher in the next era of innovation for MIT,” says Anantha P. Chandrakasan, dean of the School of Engineering and Vannevar Bush Professor of Electrical Engineering and Computer Science. “I am grateful to Tim for the tremendous work he has done during his eight years as faculty director of the Deshpande Center. His commitment to building an inclusive environment for innovation and entrepreneurship to thrive was particularly impressive.” 

    A practiced entrepreneur, Shah co-founded Celect, Inc. — now part of Nike — in 2013, to help retailers accurately predict demand using omnichannel data. In 2019, he helped start IkigaiLabs, where he serves as CTO, with the mission to build self-driving organizations by enabling data-driven operations with human-in-the-loop with the ease of spreadsheet.

    Among his many achievements and accolades, Shah was named a Kavli Fellow of the National Academy of Science in 2014 and was just recently announced as an Institute of Electrical and Electronics Engineers (IEEE) Fellow for 2022. He’s also received a number of awards for his papers from INFORMS Applied Probability Society, INFORMS Management Science and Operations Management, NeurIPS, ACM Sigmetrics, and IEEE Infocom. His career prizes include the Erlang Prize from INFORMS Applied Probability Society and the Rising Star Award from ACM Sigmetrics. Shah has also received multiple Test of Time paper awards from ACM Sigmetrics and is recognized as a distinguished alumnus of his alma mater, the Indian Institute of Technology Bombay.

    “The Deshpande Center thanks Tim for his years of service as faculty director,” says the center’s executive director, Leon Sandler. “Tim’s commitment to innovation played an integral role in our success, and the center’s programs have thrived under his leadership. I look forward to working with Devavrat in the continuing effort to fulfill the mission of our center.”

    As part of his new post, Shah will work closely with Sandler, who has held the executive director position at the Deshpande Center since 2006. More

  • in

    MIT Center for Real Estate launches the Asia Real Estate Initiative

    To appreciate the explosive urbanization taking place in Asia, consider this analogy: Every 40 days, a city the equivalent size of Boston is built in Asia. Of the $24.7 trillion real estate investment opportunities predicted by 2030 in emerging cities, $17.8 trillion (72 percent) will be in Asia. While this growth is exciting to the real estate industry, it brings with it the attendant social and environmental issues.

    To promote a sustainable and innovative approach to this growth, leadership at the MIT Center for Real Estate (MIT CRE) recently established the Asia Real Estate Initiative (AREI), which aims to become a platform for industry leaders, entrepreneurs, and the academic community to find solutions to the practical concerns of real estate development across these countries.

    “Behind the creation of this initiative is the understanding that Asia is a living lab for the study of future global urban development,” says Hashim Sarkis, dean of the MIT School of Architecture and Planning.

    An investment in cities of the future

    One of the areas in AREI’s scope of focus is connecting sustainability and technology in real estate.

    “We believe the real estate sector should work cooperatively with the energy, science, and technology sectors to solve the climate challenges,” says Richard Lester, the Institute’s associate provost for international activities. “AREI will engage academics and industry leaders, nongovernment organizations, and civic leaders globally and in Asia, to advance sharing knowledge and research.”

    In its effort to understand how trends and new technologies will impact the future of real estate, AREI has received initial support from a prominent alumnus of MIT CRE who wishes to remain anonymous. The gift will support a cohort of researchers working on innovative technologies applicable to advancing real estate sustainability goals, with a special focus on the global and Asia markets. The call for applications is already under way, with AREI seeking to collaborate with scholars who have backgrounds in economics, finance, urban planning, technology, engineering, and other disciplines.

    “The research on real estate sustainability and technology could transform this industry and help invent global real estate of the future,” says Professor Siqi Zheng, faculty director of MIT CRE and AREI faculty chair. “The pairing of real estate and technology often leads to innovative and differential real estate development strategies such as buildings that are green, smart, and healthy.”

    The initiative arrives at a key time to make a significant impact and cement a leadership role in real estate development across Asia. MIT CRE is positioned to help the industry increase its efficiency and social responsibility, with nearly 40 years of pioneering research in the field. Zheng, an established scholar with expertise on urban growth in fast-urbanizing regions, is the former president of the Asia Real Estate Society and sits on the Board of American Real Estate and Urban Economics Association. Her research has been supported by international institutions including the World Bank, the Asian Development Bank, and the Lincoln Institute of Land Policy.

    “The researchers in AREI are now working on three interrelated themes: the future of real estate and live-work-play dynamics; connecting sustainability and technology in real estate; and innovations in real estate finance and business,” says Zheng.

    The first theme has already yielded a book — “Toward Urban Economic Vibrancy: Patterns and Practices in Asia’s New Cities” — recently published by SA+P Press.

    Engaging thought leaders and global stakeholders

    AREI also plans to collaborate with counterparts in Asia to contribute to research, education, and industry dialogue to meet the challenges of sustainable city-making across the continent and identify areas for innovation. Traditionally, real estate has been a very local business with a lengthy value chain, according to Zhengzhen Tan, director of AREI. Most developers focused their career on one particular product type in one particular regional market. AREI is working to change that dynamic.

    “We want to create a cross-border dialogue within Asia and among Asia, North America, and European leaders to exchange knowledge and practices,” says Tan. “The real estate industry’s learning costs are very high compared to other sectors. Collective learning will reduce the cost of failure and have a significant impact on these global issues.”

    The 2021 United Nations Climate Change Conference in Glasgow shed additional light on environmental commitments being made by governments in Asia. With real estate representing 40 percent of global greenhouse gas emissions, the Asian real estate market is undergoing an urgent transformation to deliver on this commitment.

    “One of the most pressing calls is to get to net-zero emissions for real estate development and operation,” says Tan. “Real estate investors and developers are making short- and long-term choices that are locking in environmental footprints for the ‘decisive decade.’ We hope to inspire developers and investors to think differently and get out of their comfort zone.” More

  • in

    Unlocking new doors to artificial intelligence

    Artificial intelligence research is constantly developing new hypotheses that have the potential to benefit society and industry; however, sometimes these benefits are not fully realized due to a lack of engineering tools. To help bridge this gap, graduate students in the MIT Department of Electrical Engineering and Computer Science’s 6-A Master of Engineering (MEng) Thesis Program work with some of the most innovative companies in the world and collaborate on cutting-edge projects, while contributing to and completing their MEng thesis.

    During a portion of the last year, four 6-A MEng students teamed up and completed an internship with IBM Research’s advanced prototyping team through the MIT-IBM Watson AI Lab on AI projects, often developing web applications to solve a real-world issue or business use cases. Here, the students worked alongside AI engineers, user experience engineers, full-stack researchers, and generalists to accommodate project requests and receive thesis advice, says Lee Martie, IBM research staff member and 6-A manager. The students’ projects ranged from generating synthetic data to allow for privacy-sensitive data analysis to using computer vision to identify actions in video that allows for monitoring human safety and tracking build progress on a construction site.

    “I appreciated all of the expertise from the team and the feedback,” says 6-A graduate Violetta Jusiega ’21, who participated in the program. “I think that working in industry gives the lens of making sure that the project’s needs are satisfied and [provides the opportunity] to ground research and make sure that it is helpful for some use case in the future.”

    Jusiega’s research intersected the fields of computer vision and design to focus on data visualization and user interfaces for the medical field. Working with IBM, she built an application programming interface (API) that let clinicians interact with a medical treatment strategy AI model, which was deployed in the cloud. Her interface provided a medical decision tree, as well as some prescribed treatment plans. After receiving feedback on her design from physicians at a local hospital, Jusiega developed iterations of the API and how the results where displayed, visually, so that it would be user-friendly and understandable for clinicians, who don’t usually code. She says that, “these tools are often not acquired into the field because they lack some of these API principles which become more important in an industry where everything is already very fast paced, so there’s little time to incorporate a new technology.” But this project might eventually allow for industry deployment. “I think this application has a bunch of potential, whether it does get picked up by clinicians or whether it’s simply used in research. It’s very promising and very exciting to see how technology can help us modify, or I can improve, the health-care field to be even more custom-tailored towards patients and giving them the best care possible,” she says.

    Another 6-A graduate student, Spencer Compton, was also considering aiding professionals to make more informed decisions, for use in settings including health care, but he was tackling it from a causal perspective. When given a set of related variables, Compton was investigating if there was a way to determine not just correlation, but the cause-and-effect relationship between them (the direction of the interaction) from the data alone. For this, he and his collaborators from IBM Research and Purdue University turned to a field of math called information theory. With the goal of designing an algorithm to learn complex networks of causal relationships, Compton used ideas relating to entropy, the randomness in a system, to help determine if a causal relationship is present and how variables might be interacting. “When judging an explanation, people often default to Occam’s razor” says Compton. “We’re more inclined to believe a simpler explanation than a more complex one.” In many cases, he says, it seemed to perform well. For instance, they were able to consider variables such as lung cancer, pollution, and X-ray findings. He was pleased that his research allowed him to help create a framework of “entropic causal inference” that could aid in safe and smart decisions in the future, in a satisfying way. “The math is really surprisingly deep, interesting, and complex,” says Compton. “We’re basically asking, ‘when is the simplest explanation correct?’ but as a math question.”

    Determining relationships within data can sometimes require large volumes of it to suss out patterns, but for data that may contain sensitive information, this may not be available. For her master’s work, Ivy Huang worked with IBM Research to generate synthetic tabular data using a natural language processing tool called a transformer model, which can learn and predict future values from past values. Trained on real data, the model can produce new data with similar patterns, properties, and relationships without restrictions like privacy, availability, and access that might come with real data in financial transactions and electronic medical records. Further, she created an API and deployed the model in an IBM cluster, which allowed users increased access to the model and abilities to query it without compromising the original data.

    Working with the advanced prototyping team, MEng candidate Brandon Perez also considered how to gather and investigate data with restrictions, but in his case it was to use computer vision frameworks, centered on an action recognition model, to identify construction site happenings. The team based their work on the Moments in Time dataset, which contains over a million three-second video clips with about 300 attached classification labels, and has performed well during AI training. However, the group needed more construction-based video data. For this, they used YouTube-8M. Perez built a framework for testing and fine-tuning existing object detection models and action recognition models that could plug into an automatic spatial and temporal localization tool — how they would identify and label particular actions in a video timeline. “I was satisfied that I was able to explore what made me curious, and I was grateful for the autonomy that I was given with this project,” says Perez. “I felt like I was always supported, and my mentor was a great support to the project.”

    “The kind of collaborations that we have seen between our MEng students and IBM researchers are exactly what the 6-A MEng Thesis program at MIT is all about,” says Tomas Palacios, professor of electrical engineering and faculty director of the MIT 6-A MEng Thesis program. “For more than 100 years, 6-A has been connecting MIT students with industry to solve together some of the most important problems in the world.” More

  • in

    3 Questions: Fotini Christia on racial equity and data science

    Fotini Christia is the Ford International Professor in the Social Sciences in the Department of Political Science, associate director of the Institute for Data, Systems, and Society (IDSS), and director of the Sociotechnical Systems Research Center (SSRC). Her research interests include issues of conflict and cooperation in the Muslim world, and she has conducted fieldwork in Afghanistan, Bosnia, Iran, the Palestinian Territories, Syria, and Yemen. She has co-organized the IDSS Research Initiative on Combatting Systemic Racism (ICSR), which works to bridge the social sciences, data science, and computation by bringing researchers from these disciplines together to address systemic racism across housing, health care, policing, education, employment, and other sectors of society.

    Q: What is the IDSS/ICSR approach to systemic racism research?

    A: The Research Initiative on Combatting Systemic Racism (ICSR) aims to seed and coordinate cross-disciplinary research to identify and overcome racially discriminatory processes and outcomes across a range of U.S. institutions and policy domains.

    Building off the extensive social science literature on systemic racism, the focus of this research initiative is to use big data to develop and harness computational tools that can help effect structural and normative change toward racial equity.

    The initiative aims to create a visible presence at MIT for cutting-edge computational research with a racial equity lens, across societal domains that will attract and train students and scholars.

    The steering committee for this research initiative is composed of underrepresented minority faculty members from across MIT’s five schools and the MIT Schwarzman College of Computing. Members will serve as close advisors to the initiative as well as share the findings of our work beyond MIT’s campus. MIT Chancellor Melissa Nobles heads this committee.

    Q: What role can data science play in helping to effect change toward racial equity?

    A: Existing work has shown racial discrimination in the job market, in the criminal justice system, as well as in education, health care, and access to housing, among other places. It has also underlined how algorithms could further entrench such bias — be it in training data or in the people who build them. Data science tools can not only help identify, but also contribute to, proposing fixes on racially inequitable outcomes that result from implicit or explicit biases in governing institutional practices in the public and private sector, and more recently from the use of AI and algorithmic methods in decision-making.

    To that effect, this initiative will produce research that explores and collects the relevant big data across domains, while paying attention to the ways such data are collected, and focus on improving and developing data-driven computational tools to address racial disparities in structures and institutions that have reproduced racially discriminatory outcomes in American society.

    The strong correlation between race, class, educational attainment, and various attitudes and behaviors in the American context can make it extremely difficult to rule out the influence of confounding factors. Thus, a key motivation for our research initiative is to highlight the importance of causal analysis using computational methods, and focus on understanding the opportunities of big data and algorithmic decision-making to address racial inequities and promote racial justice — beyond de-biasing algorithms. The intent is to also codify methodologies on equity-informed research practices and produce tools that are clear on the quantifiable expected social costs and benefits, as well as on the downstream effects on systemic racism more broadly.

    Q: What are some ways that the ICSR might conduct or follow-up on research seeking real-world impact or policy change?

    A: This type of research has ethical and societal considerations at its core, especially as they pertain to historically disadvantaged groups in the U.S., and will be coordinated with and communicated to local stakeholders to drive relevant policy decisions. This initiative intends to establish connections to URM [underrepresented minority] researchers and students at underrepresented universities and to directly collaborate with them on these research efforts. To that effect, we are leveraging existing programs such as the MIT Summer Research Program (MSRP).

    To ensure that our research targets the right problems bringing a racial equity lens with an interest to effect policy change, we will also connect with community organizations in minority neighborhoods who often bear the brunt of the direct and indirect effects of systemic racism, as well as with local government offices that work to address inequity in service provision in these communities. Our intent is to directly engage IDSS students with these organizations to help develop and test algorithmic tools for racial equity. More

  • in

    Injecting fairness into machine-learning models

    If a machine-learning model is trained using an unbalanced dataset, such as one that contains far more images of people with lighter skin than people with darker skin, there is serious risk the model’s predictions will be unfair when it is deployed in the real world.

    But this is only one part of the problem. MIT researchers have found that machine-learning models that are popular for image recognition tasks actually encode bias when trained on unbalanced data. This bias within the model is impossible to fix later on, even with state-of-the-art fairness-boosting techniques, and even when retraining the model with a balanced dataset.      

    So, the researchers came up with a technique to introduce fairness directly into the model’s internal representation itself. This enables the model to produce fair outputs even if it is trained on unfair data, which is especially important because there are very few well-balanced datasets for machine learning.

    The solution they developed not only leads to models that make more balanced predictions, but also improves their performance on downstream tasks like facial recognition and animal species classification.

    “In machine learning, it is common to blame the data for bias in models. But we don’t always have balanced data. So, we need to come up with methods that actually fix the problem with imbalanced data,” says lead author Natalie Dullerud, a graduate student in the Healthy ML Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT.

    Dullerud’s co-authors include Kimia Hamidieh, a graduate student in the Healthy ML Group; Karsten Roth, a former visiting researcher who is now a graduate student at the University of Tubingen; Nicolas Papernot, an assistant professor in the University of Toronto’s Department of Electrical Engineering and Computer Science; and senior author Marzyeh Ghassemi, an assistant professor and head of the Healthy ML Group. The research will be presented at the International Conference on Learning Representations.

    Defining fairness

    The machine-learning technique the researchers studied is known as deep metric learning, which is a broad form of representation learning. In deep metric learning, a neural network learns the similarity between objects by mapping similar photos close together and dissimilar photos far apart. During training, this neural network maps images in an “embedding space” where a similarity metric between photos corresponds to the distance between them.

    For example, if a deep metric learning model is being used to classify bird species, it will map photos of golden finches together in one part of the embedding space and cardinals together in another part of the embedding space. Once trained, the model can effectively measure the similarity of new images it hasn’t seen before. It would learn to cluster images of an unseen bird species close together, but farther from cardinals or golden finches within the embedding space.

    The similarity metrics the model learns are very robust, which is why deep metric learning is so often employed for facial recognition, Dullerud says. But she and her colleagues wondered how to determine if a similarity metric is biased.

    “We know that data reflect the biases of processes in society. This means we have to shift our focus to designing methods that are better suited to reality,” says Ghassemi.

    The researchers defined two ways that a similarity metric can be unfair. Using the example of facial recognition, the metric will be unfair if it is more likely to embed individuals with darker-skinned faces closer to each other, even if they are not the same person, than it would if those images were people with lighter-skinned faces. Second, it will be unfair if the features it learns for measuring similarity are better for the majority group than for the minority group.

    The researchers ran a number of experiments on models with unfair similarity metrics and were unable to overcome the bias the model had learned in its embedding space.

    “This is quite scary because it is a very common practice for companies to release these embedding models and then people finetune them for some downstream classification task. But no matter what you do downstream, you simply can’t fix the fairness problems that were induced in the embedding space,” Dullerud says.

    Even if a user retrains the model on a balanced dataset for the downstream task, which is the best-case scenario for fixing the fairness problem, there are still performance gaps of at least 20 percent, she says.

    The only way to solve this problem is to ensure the embedding space is fair to begin with.

    Learning separate metrics

    The researchers’ solution, called Partial Attribute Decorrelation (PARADE), involves training the model to learn a separate similarity metric for a sensitive attribute, like skin tone, and then decorrelating the skin tone similarity metric from the targeted similarity metric. If the model is learning the similarity metrics of different human faces, it will learn to map similar faces close together and dissimilar faces far apart using features other than skin tone.

    Any number of sensitive attributes can be decorrelated from the targeted similarity metric in this way. And because the similarity metric for the sensitive attribute is learned in a separate embedding space, it is discarded after training so only the targeted similarity metric remains in the model.

    Their method is applicable to many situations because the user can control the amount of decorrelation between similarity metrics. For instance, if the model will be diagnosing breast cancer from mammogram images, a clinician likely wants some information about biological sex to remain in the final embedding space because it is much more likely that women will have breast cancer than men, Dullerud explains.

    They tested their method on two tasks, facial recognition and classifying bird species, and found that it reduced performance gaps caused by bias, both in the embedding space and in the downstream task, regardless of the dataset they used.

    Moving forward, Dullerud is interested in studying how to force a deep metric learning model to learn good features in the first place.

    “How do you properly audit fairness? That is an open question right now. How can you tell that a model is going to be fair, or that it is only going to be fair in certain situations, and what are those situations? Those are questions I am really interested in moving forward,” she says. More

  • in

    Using artificial intelligence to find anomalies hiding in massive datasets

    Identifying a malfunction in the nation’s power grid can be like trying to find a needle in an enormous haystack. Hundreds of thousands of interrelated sensors spread across the U.S. capture data on electric current, voltage, and other critical information in real time, often taking multiple recordings per second.

    Researchers at the MIT-IBM Watson AI Lab have devised a computationally efficient method that can automatically pinpoint anomalies in those data streams in real time. They demonstrated that their artificial intelligence method, which learns to model the interconnectedness of the power grid, is much better at detecting these glitches than some other popular techniques.

    Because the machine-learning model they developed does not require annotated data on power grid anomalies for training, it would be easier to apply in real-world situations where high-quality, labeled datasets are often hard to come by. The model is also flexible and can be applied to other situations where a vast number of interconnected sensors collect and report data, like traffic monitoring systems. It could, for example, identify traffic bottlenecks or reveal how traffic jams cascade.

    “In the case of a power grid, people have tried to capture the data using statistics and then define detection rules with domain knowledge to say that, for example, if the voltage surges by a certain percentage, then the grid operator should be alerted. Such rule-based systems, even empowered by statistical data analysis, require a lot of labor and expertise. We show that we can automate this process and also learn patterns from the data using advanced machine-learning techniques,” says senior author Jie Chen, a research staff member and manager of the MIT-IBM Watson AI Lab.

    The co-author is Enyan Dai, an MIT-IBM Watson AI Lab intern and graduate student at the Pennsylvania State University. This research will be presented at the International Conference on Learning Representations.

    Probing probabilities

    The researchers began by defining an anomaly as an event that has a low probability of occurring, like a sudden spike in voltage. They treat the power grid data as a probability distribution, so if they can estimate the probability densities, they can identify the low-density values in the dataset. Those data points which are least likely to occur correspond to anomalies.

    Estimating those probabilities is no easy task, especially since each sample captures multiple time series, and each time series is a set of multidimensional data points recorded over time. Plus, the sensors that capture all that data are conditional on one another, meaning they are connected in a certain configuration and one sensor can sometimes impact others.

    To learn the complex conditional probability distribution of the data, the researchers used a special type of deep-learning model called a normalizing flow, which is particularly effective at estimating the probability density of a sample.

    They augmented that normalizing flow model using a type of graph, known as a Bayesian network, which can learn the complex, causal relationship structure between different sensors. This graph structure enables the researchers to see patterns in the data and estimate anomalies more accurately, Chen explains.

    “The sensors are interacting with each other, and they have causal relationships and depend on each other. So, we have to be able to inject this dependency information into the way that we compute the probabilities,” he says.

    This Bayesian network factorizes, or breaks down, the joint probability of the multiple time series data into less complex, conditional probabilities that are much easier to parameterize, learn, and evaluate. This allows the researchers to estimate the likelihood of observing certain sensor readings, and to identify those readings that have a low probability of occurring, meaning they are anomalies.

    Their method is especially powerful because this complex graph structure does not need to be defined in advance — the model can learn the graph on its own, in an unsupervised manner.

    A powerful technique

    They tested this framework by seeing how well it could identify anomalies in power grid data, traffic data, and water system data. The datasets they used for testing contained anomalies that had been identified by humans, so the researchers were able to compare the anomalies their model identified with real glitches in each system.

    Their model outperformed all the baselines by detecting a higher percentage of true anomalies in each dataset.

    “For the baselines, a lot of them don’t incorporate graph structure. That perfectly corroborates our hypothesis. Figuring out the dependency relationships between the different nodes in the graph is definitely helping us,” Chen says.

    Their methodology is also flexible. Armed with a large, unlabeled dataset, they can tune the model to make effective anomaly predictions in other situations, like traffic patterns.

    Once the model is deployed, it would continue to learn from a steady stream of new sensor data, adapting to possible drift of the data distribution and maintaining accuracy over time, says Chen.

    Though this particular project is close to its end, he looks forward to applying the lessons he learned to other areas of deep-learning research, particularly on graphs.

    Chen and his colleagues could use this approach to develop models that map other complex, conditional relationships. They also want to explore how they can efficiently learn these models when the graphs become enormous, perhaps with millions or billions of interconnected nodes. And rather than finding anomalies, they could also use this approach to improve the accuracy of forecasts based on datasets or streamline other classification techniques.

    This work was funded by the MIT-IBM Watson AI Lab and the U.S. Department of Energy. More