More stories

    Empowering Cambridge youth through data activism

    For over 40 years, the Mayor’s Summer Youth Employment Program (MSYEP, or the Mayor’s Program) in Cambridge, Massachusetts, has been providing teenagers with their first work experience, but 2022 brought a new offering. Collaborating with MIT’s Personal Robots research group (PRG) and Responsible AI for Social Empowerment and Education (RAISE) this summer, MSYEP created a STEAM-focused learning site at the Institute. Eleven students joined the program to learn coding and programming skills through the lens of “Data Activism.”

    MSYEP’s partnership with MIT provides an opportunity for Cambridge high schoolers to gain exposure to more pathways for their future careers and education. The Mayor’s Program aims to respect students’ time and show the value of their work, so participants are compensated with an hourly wage as they learn workforce skills at MSYEP worksites. In conjunction with two ongoing research studies at MIT, PRG and RAISE developed the six-week Data Activism curriculum to equip students with critical-thinking skills so they feel prepared to utilize data science to challenge social injustice and empower their community.

    Rohan Kundargi, K-12 Community Outreach Administrator for MIT Office of Government and Community Relations (OGCR), says, “I see this as a model for a new type of partnership between MIT and Cambridge MSYEP. Specifically, an MIT research project that involves students from Cambridge getting paid to learn, research, and develop their own skills!”

    Cross-Cambridge collaboration

    Cambridge’s Office of Workforce Development initially contacted MIT OGCR about hosting a potential MSYEP worksite that taught Cambridge teens how to code. When Kundargi reached out to MIT pK-12 collaborators, MIT PRG’s graduate research assistant Raechel Walker proposed the Data Activism curriculum. Walker defines “data activism” as utilizing data, computing, and art to analyze how power operates in the world, challenge power, and empathize with people who are oppressed.

    Walker says, “I wanted students to feel empowered to incorporate their own expertise, talents, and interests into every activity. In order for students to fully embrace their academic abilities, they must remain comfortable with bringing their full selves into data activism.”

    As Kundargi and Walker recruited students for the Data Activism learning site, they wanted to make sure the cohort of students — the majority of whom are individuals of color — felt represented at MIT and felt they had the agency for their voice to be heard. “The pioneers in this field are people who look like them,” Walker says, speaking of well-known data activists Timnit Gebru, Rediet Abebe, and Joy Buolamwini.

    When the program began this summer, some of the students were not aware of the ways data science and artificial intelligence exacerbate systemic oppression in society, or some of the tools currently being used to mitigate those societal harms. As a result, Walker says, the students wanted to learn more about discriminatory design in every aspect of life. They were also interested in creating responsible machine learning algorithms and AI fairness metrics.

    A different side of STEAM

    The development and execution of the Data Activism curriculum contributed to Walker’s and postdoc Xiaoxue Du’s respective research at PRG. Walker is studying AI education, specifically creating and teaching data activism curricula for minoritized communities. Du’s research explores processes, assessments, and curriculum design that prepare educators to use, adapt, and integrate AI literacy curricula. Additionally, her research examines how to create more opportunities for students with diverse learning needs.

    The Data Activism curriculum utilizes a “liberatory computing” framework, a term Walker coined in her position paper with Professor Cynthia Breazeal, director of MIT RAISE, dean for digital learning, and head of PRG, and Eman Sherif, a then-undergraduate researcher from the University of California at San Diego, titled “Liberatory Computing Education for African American Students.” This framework ensures that students, especially minoritized students, acquire a sound racial identity, critical consciousness, collective obligation, liberation-centered academic/achievement identity, as well as the activism skills to use computing to transform a multi-layered system of barriers in which racism persists. Walker says, “We encouraged students to demonstrate competency in every pillar because all of the pillars are interconnected and build upon each other.”

    Walker developed a series of interactive coding and project-based activities that focused on understanding systemic racism, utilizing data science to analyze systemic oppression, data drawing, responsible machine learning, how racism can be embedded into AI, and different AI fairness metrics.

    This was the students’ first time learning how to create data visualizations using the programming language Python and the data analysis tool Pandas. In one project meant to examine how different systems of oppression can affect different aspects of students’ own identities, students created datasets with data from their respective intersectional identities. Another activity highlighted African American achievements, where students analyzed two datasets about African American scientists, activists, artists, scholars, and athletes. Using the data visualizations, students then created zines about the African Americans who inspired them.

    RAISE hired Olivia Dias, Sophia Brady, Lina Henriquez, and Zeynep Yalcin through the MIT Undergraduate Research Opportunity Program (UROP) and PRG hired freelancer Matt Taylor to work with Walker on developing the curriculum and designing interdisciplinary experience projects. Walker and the four undergraduate researchers constructed an intersectional data analysis activity about different examples of systemic oppression. PRG also hired three high school students to test activities and offer insights about making the curriculum engaging for program participants. Throughout the program, the Data Activism team taught students in small groups, continually asked students how to improve each activity, and structured each lesson based on the students’ interests. Walker says Dias, Brady, Henriquez, and Yalcin were invaluable to cultivating a supportive classroom environment and helping students complete their projects.

    Cambridge Rindge and Latin School senior Nina works on her rubber block stamp that depicts the importance of representation in media and greater representation in the tech industry.

    Photo: Katherine Ouellette

    Student Nina says, “It’s opened my eyes to a different side of STEM. I didn’t know what ‘data’ meant before this program, or how intersectionality can affect AI and data.” Before MSYEP, Nina took Intro to Computer Science and AP Computer Science, but she has been coding since Girls Who Code first sparked her interest in middle school. “The community was really nice. I could talk with other girls. I saw there needs to be more women in STEM, especially in coding.” Now she’s interested in applying to colleges with strong computer science programs so she can pursue a coding-related career.

    From MSYEP to the mayor’s office

    Mayor Sumbul Siddiqui visited the Data Activism learning site on Aug. 9, accompanied by Breazeal. A graduate of MSYEP herself, Siddiqui says, “Through hands-on learning through computer programming, Cambridge high school students have the unique opportunity to see themselves as data scientists. Students were able to learn ways to combat discrimination that occurs through artificial intelligence.” In an Instagram post, Siddiqui also said, “I had a blast visiting the students and learning about their projects.”

    Students worked on an activity that asked them to envision how data science might be used to support marginalized communities. They transformed their answers into block-printed T-shirt designs, carving pictures of their hopes into rubber block stamps. Some students focused on the importance of data privacy, like Jacob T., who drew a birdcage to represent data stored and locked away by third party apps. He says, “I want to open that cage and restore my data to myself and see what can be done with it.”

    The subject of Cambridge Community Charter School student Jacob T.’s project was the importance of data privacy. For his T-shirt design, he drew a birdcage to represent data stored and locked away by third party apps. From right to left: Breazeal, Jacob T., Kiki, Raechel Walker, and Zeynep Yalcin.

    Photo: Katherine Ouellette

    Many students wanted to see more representation in both the media they consume and across various professional fields. Nina talked about the importance of representation in media and how that could contribute to greater representation in the tech industry, while Kiki talked about encouraging more women to pursue STEM fields. Jesmin said, “I wanted to show that data science is accessible to everyone, no matter their origin or language you speak. I wrote ‘hello’ in Bangla, Arabic, and English, because I speak all three languages and they all resonate with me.”

    Student Jesmin (left) explains the concept of her T-shirt design to Mayor Siddiqui. She wants data science to be accessible to everyone, no matter their origin or language, so she drew a globe and wrote ‘hello’ in the three languages she speaks: Bangla, Arabic, and English.

    Photo: Katherine Ouellette

    “Overall, I hope the students continue to use their data activism skills to re-envision a society that supports marginalized groups,” says Walker. “Moreover, I hope they are empowered to become data scientists and understand how their race can be a positive part of their identity.”

    Computing for the health of the planet

    The health of the planet is one of the most important challenges facing humankind today. From climate change to unsafe levels of air and water pollution to coastal and agricultural land erosion, a number of serious challenges threaten human and ecosystem health.

    Ensuring the health and safety of our planet necessitates approaches that connect scientific, engineering, social, economic, and political aspects. New computational methods can play a critical role by providing data-driven models and solutions for cleaner air, usable water, resilient food, efficient transportation systems, better-preserved biodiversity, and sustainable sources of energy.

    The MIT Schwarzman College of Computing is committed to hiring multiple new faculty in computing for climate and the environment, as part of MIT’s plan to recruit 20 climate-focused faculty under its climate action plan. This year the college undertook searches with several departments in the schools of Engineering and Science for shared faculty in computing for health of the planet, one of the six strategic areas of inquiry identified in an MIT-wide planning process to help focus shared hiring efforts. The college also undertook searches for core computing faculty in the Department of Electrical Engineering and Computer Science (EECS).

    The searches are part of an ongoing effort by the MIT Schwarzman College of Computing to hire 50 new faculty: 25 shared with other academic departments and 25 in computer science and artificial intelligence and decision-making. The goal is to build capacity at MIT to integrate computing more deeply with other disciplines across departments.

    Four interdisciplinary scholars were hired in these searches. They will join the MIT faculty in the coming year to engage in research and teaching that will advance physical understanding of low-carbon energy solutions, Earth-climate modeling, biodiversity monitoring and conservation, and agricultural management through high-performance computing, transformational numerical methods, and machine-learning techniques.

    “By coordinating hiring efforts with multiple departments and schools, we were able to attract a cohort of exceptional scholars in this area to MIT. Each of them is developing and using advanced computational methods and tools to help find solutions for a range of climate and environmental issues,” says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing and the Henry Warren Ellis Professor of Electrical Engineering and Computer Science. “They will also help strengthen cross-departmental ties in computing across an important, critical area for MIT and the world.”

    “These strategic hires in the area of computing for climate and the environment are an incredible opportunity for the college to deepen its academic offerings and create new opportunity for collaboration across MIT,” says Anantha P. Chandrakasan, dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “The college plays a pivotal role in MIT’s overarching effort to hire climate-focused faculty — introducing the critical role of computing to address the health of the planet through innovative research and curriculum.”

    The four new faculty members are:

    Sara Beery will join MIT as an assistant professor in the Faculty of Artificial Intelligence and Decision-Making in EECS in September 2023. Beery received her PhD in computing and mathematical sciences at Caltech in 2022, where she was advised by Pietro Perona. Her research focuses on building computer vision methods that enable global-scale environmental and biodiversity monitoring across data modalities, tackling real-world challenges including strong spatiotemporal correlations, imperfect data quality, fine-grained categories, and long-tailed distributions. She partners with nongovernmental organizations and government agencies to deploy her methods in the wild worldwide and works toward increasing the diversity and accessibility of academic research in artificial intelligence through interdisciplinary capacity building and education.

    Priya Donti will join MIT as an assistant professor in the faculties of Electrical Engineering and Artificial Intelligence and Decision-Making in EECS in academic year 2023-24. Donti recently finished her PhD in the Computer Science Department and the Department of Engineering and Public Policy at Carnegie Mellon University, co-advised by Zico Kolter and Inês Azevedo. Her work focuses on machine learning for forecasting, optimization, and control in high-renewables power grids. Specifically, her research explores methods to incorporate the physics and hard constraints associated with electric power systems into deep learning models. Donti is also co-founder and chair of Climate Change AI, a nonprofit initiative to catalyze impactful work at the intersection of climate change and machine learning that is currently running through the Cornell Tech Runway Startup Postdoc Program.

    Ericmoore Jossou will join MIT as an assistant professor in a shared position between the Department of Nuclear Science and Engineering and the faculty of electrical engineering in EECS in July 2023. He is currently an assistant scientist at the Brookhaven National Laboratory, a U.S. Department of Energy-affiliated lab that conducts research in nuclear and high energy physics, energy science and technology, environmental and bioscience, nanoscience, and national security. His research at MIT will focus on understanding the processing-structure-properties correlation of materials for nuclear energy applications through advanced experiments, multiscale simulations, and data science. Jossou obtained his PhD in mechanical engineering in 2019 from the University of Saskatchewan.

    Sherrie Wang will join MIT as an assistant professor in a shared position between the Department of Mechanical Engineering and the Institute for Data, Systems, and Society in academic year 2023-24. Wang is currently a Ciriacy-Wantrup Postdoctoral Fellow at the University of California at Berkeley, hosted by Solomon Hsiang and the Global Policy Lab. She develops machine learning for Earth observation data. Her primary application areas are improving agricultural management and forecasting climate phenomena. She obtained her PhD in computational and mathematical engineering from Stanford University in 2021, where she was advised by David Lobell.

    AI that can learn the patterns of human language

    Human languages are notoriously complex, and linguists have long thought it would be impossible to teach a machine how to analyze speech sounds and word structures in the way human investigators do.

    But researchers at MIT, Cornell University, and McGill University have taken a step in this direction. They have demonstrated an artificial intelligence system that can learn the rules and patterns of human languages on its own.

    When given words and examples of how those words change to express different grammatical functions (like tense, case, or gender) in one language, this machine-learning model comes up with rules that explain why the forms of those words change. For instance, it might learn that the letter “a” must be added to the end of a word to make the masculine form feminine in Serbo-Croatian.
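
    To make the idea of a word-form rule concrete, here is a minimal Python sketch. It is not the researchers' model, which relies on Bayesian Program Learning; it only checks whether one suffix explains every pair in a small list. The function names `infer_suffix_rule` and `apply_rule` and the three adjective pairs are illustrative assumptions, not data from the paper.

```python
def infer_suffix_rule(pairs):
    """Return the one suffix that maps every base form to its inflected form.

    Returns None when the pairs cannot be explained by a single shared suffix.
    """
    suffixes = set()
    for base, inflected in pairs:
        if not inflected.startswith(base):
            return None  # the change is not a pure suffix addition
        suffixes.add(inflected[len(base):])
    return suffixes.pop() if len(suffixes) == 1 else None


def apply_rule(base, suffix):
    """Apply a learned suffix rule to a new base form."""
    return base + suffix


# Illustrative masculine/feminine adjective pairs (assumed for this sketch).
pairs = [("mlad", "mlada"), ("star", "stara"), ("nov", "nova")]
rule = infer_suffix_rule(pairs)
```

    A real system has to handle far more than suffixes (sound changes, prefixes, interacting rules), which is why the researchers frame the task as program synthesis rather than simple string matching.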

    This model can also automatically learn higher-level language patterns that can apply to many languages, enabling it to achieve better results.

    The researchers trained and tested the model using problems from linguistics textbooks that featured 58 different languages. Each problem had a set of words and corresponding word-form changes. The model was able to come up with a correct set of rules to describe those word-form changes for 60 percent of the problems.

    This system could be used to study language hypotheses and investigate subtle similarities in the way diverse languages transform words. It stands out because the system discovers models that can be readily understood by humans, and it acquires these models from small amounts of data, such as a few dozen words. And instead of using one massive dataset for a single task, the system utilizes many small datasets, which is closer to how scientists propose hypotheses: they look at multiple related datasets and come up with models to explain phenomena across those datasets.

    “One of the motivations of this work was our desire to study systems that learn models of datasets that is represented in a way that humans can understand. Instead of learning weights, can the model learn expressions or rules? And we wanted to see if we could build this system so it would learn on a whole battery of interrelated datasets, to make the system learn a little bit about how to better model each one,” says Kevin Ellis ’14, PhD ’20, an assistant professor of computer science at Cornell University and lead author of the paper.

    Joining Ellis on the paper are MIT faculty members Adam Albright, a professor of linguistics; Armando Solar-Lezama, a professor and associate director of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Joshua B. Tenenbaum, the Paul E. Newton Career Development Professor of Cognitive Science and Computation in the Department of Brain and Cognitive Sciences and a member of CSAIL; as well as senior author Timothy J. O’Donnell, assistant professor in the Department of Linguistics at McGill University, and Canada CIFAR AI Chair at the Mila – Quebec Artificial Intelligence Institute.

    The research is published today in Nature Communications.

    Looking at language 

    In their quest to develop an AI system that could automatically learn a model from multiple related datasets, the researchers chose to explore the interaction of phonology (the study of sound patterns) and morphology (the study of word structure).

    Data from linguistics textbooks offered an ideal testbed because many languages share core features, and textbook problems showcase specific linguistic phenomena. Textbook problems can also be solved by college students in a fairly straightforward way, but those students typically have prior knowledge about phonology from past lessons they use to reason about new problems.

    Ellis, who earned his PhD at MIT and was jointly advised by Tenenbaum and Solar-Lezama, first learned about morphology and phonology in an MIT class co-taught by O’Donnell, who was a postdoc at the time, and Albright.

    “Linguists have thought that in order to really understand the rules of a human language, to empathize with what it is that makes the system tick, you have to be human. We wanted to see if we can emulate the kinds of knowledge and reasoning that humans (linguists) bring to the task,” says Albright.

    To build a model that could learn a set of rules for assembling words, which is called a grammar, the researchers used a machine-learning technique known as Bayesian Program Learning. With this technique, the model solves a problem by writing a computer program.

    In this case, the program is the grammar the model thinks is the most likely explanation of the words and meanings in a linguistics problem. They built the model using Sketch, a popular program synthesizer which was developed at MIT by Solar-Lezama.

    But Sketch can take a lot of time to reason about the most likely program. To get around this, the researchers had the model work one piece at a time, writing a small program to explain some data, then writing a larger program that modifies that small program to cover more data, and so on.

    They also designed the model so it learns what “good” programs tend to look like. For instance, it might learn some general rules on simple Russian problems that it would apply to a more complex problem in Polish because the languages are similar. This makes it easier for the model to solve the Polish problem.

    Tackling textbook problems

    When they tested the model using 70 textbook problems, it was able to find a grammar that matched the entire set of words in the problem in 60 percent of cases, and correctly matched most of the word-form changes in 79 percent of problems.

    The researchers also tried pre-programming the model with some knowledge it “should” have learned if it were taking a linguistics course, and showed that it could then solve all problems better.

    “One challenge of this work was figuring out whether what the model was doing was reasonable. This isn’t a situation where there is one number that is the single right answer. There is a range of possible solutions which you might accept as right, close to right, etc.,” Albright says.

    The model often came up with unexpected solutions. In one instance, it discovered the expected answer to a Polish language problem, but also another correct answer that exploited a mistake in the textbook. This shows that the model could “debug” linguistics analyses, Ellis says.

    The researchers also conducted tests that showed the model was able to learn some general templates of phonological rules that could be applied across all problems.

    “One of the things that was most surprising is that we could learn across languages, but it didn’t seem to make a huge difference,” says Ellis. “That suggests two things. Maybe we need better methods for learning across problems. And maybe, if we can’t come up with those methods, this work can help us probe different ideas we have about what knowledge to share across problems.”

    In the future, the researchers want to use their model to find unexpected solutions to problems in other domains. They could also apply the technique to more situations where higher-level knowledge can be applied across interrelated datasets. For instance, perhaps they could develop a system to infer differential equations from datasets on the motion of different objects, says Ellis.

    “This work shows that we have some methods which can, to some extent, learn inductive biases. But I don’t think we’ve quite figured out, even for these textbook problems, the inductive bias that lets a linguist accept the plausible grammars and reject the ridiculous ones,” he adds.

    “This work opens up many exciting venues for future research. I am particularly intrigued by the possibility that the approach explored by Ellis and colleagues (Bayesian Program Learning, BPL) might speak to how infants acquire language,” says T. Florian Jaeger, a professor of brain and cognitive sciences and computer science at the University of Rochester, who was not an author of this paper. “Future work might ask, for example, under what additional induction biases (assumptions about universal grammar) the BPL approach can successfully achieve human-like learning behavior on the type of data infants observe during language acquisition. I think it would be fascinating to see whether inductive biases that are even more abstract than those considered by Ellis and his team — such as biases originating in the limits of human information processing (e.g., memory constraints on dependency length or capacity limits in the amount of information that can be processed per time) — would be sufficient to induce some of the patterns observed in human languages.”

    This work was funded, in part, by the Air Force Office of Scientific Research, the Center for Brains, Minds, and Machines, the MIT-IBM Watson AI Lab, the Natural Sciences and Engineering Research Council of Canada, the Fonds de Recherche du Québec – Société et Culture, the Canada CIFAR AI Chairs Program, the National Science Foundation (NSF), and an NSF graduate fellowship.

    Taking a magnifying glass to data center operations

    When the MIT Lincoln Laboratory Supercomputing Center (LLSC) unveiled its TX-GAIA supercomputer in 2019, it provided the MIT community a powerful new resource for applying artificial intelligence to their research. Anyone at MIT can submit a job to the system, which churns through trillions of operations per second to train models for diverse applications, such as spotting tumors in medical images, discovering new drugs, or modeling climate effects. But with this great power comes the great responsibility of managing and operating it in a sustainable manner — and the team is looking for ways to improve.

    “We have these powerful computational tools that let researchers build intricate models to solve problems, but they can essentially be used as black boxes. What gets lost in there is whether we are actually using the hardware as effectively as we can,” says Siddharth Samsi, a research scientist in the LLSC. 

    To gain insight into this challenge, the LLSC has been collecting detailed data on TX-GAIA usage over the past year. More than a million user jobs later, the team has released the dataset open source to the computing community.

    Their goal is to empower computer scientists and data center operators to better understand avenues for data center optimization — an important task as processing needs continue to grow. They also see potential for leveraging AI in the data center itself, by using the data to develop models for predicting failure points, optimizing job scheduling, and improving energy efficiency. While cloud providers are actively working on optimizing their data centers, they do not often make their data or models available for the broader high-performance computing (HPC) community to leverage. The release of this dataset and associated code seeks to fill this space.

    “Data centers are changing. We have an explosion of hardware platforms, the types of workloads are evolving, and the types of people who are using data centers is changing,” says Vijay Gadepally, a senior researcher at the LLSC. “Until now, there hasn’t been a great way to analyze the impact to data centers. We see this research and dataset as a big step toward coming up with a principled approach to understanding how these variables interact with each other and then applying AI for insights and improvements.”

    Papers describing the dataset and potential applications have been accepted to a number of venues, including the IEEE International Symposium on High-Performance Computer Architecture, the IEEE International Parallel and Distributed Processing Symposium, the Annual Conference of the North American Chapter of the Association for Computational Linguistics, the IEEE High-Performance and Embedded Computing Conference, and the International Conference for High Performance Computing, Networking, Storage and Analysis.

    Workload classification

    Ranked among the world’s TOP500 supercomputers, TX-GAIA combines traditional computing hardware (central processing units, or CPUs) with nearly 900 graphics processing unit (GPU) accelerators. These NVIDIA GPUs are specialized for deep learning, the class of AI that has given rise to speech recognition and computer vision.

    The dataset covers CPU, GPU, and memory usage by job; scheduling logs; and physical monitoring data. Compared to similar datasets, such as those from Google and Microsoft, the LLSC dataset offers “labeled data, a variety of known AI workloads, and more detailed time series data compared with prior datasets. To our knowledge, it’s one of the most comprehensive and fine-grained datasets available,” Gadepally says. 

    Notably, the team collected time-series data at an unprecedented level of detail: 100-millisecond intervals on every GPU and 10-second intervals on every CPU, as the machines processed more than 3,000 known deep-learning jobs. One of the first goals is to use this labeled dataset to characterize the workloads that different types of deep-learning jobs place on the system. This process would extract features that reveal differences in how the hardware processes natural language models versus image classification or materials design models, for example.   
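
    As a rough illustration of what those sampling intervals mean for data volume, the Python sketch below counts samples per device. The two intervals come from the text above; the one-hour job duration and the function name `samples_per_device` are assumptions made only for this example.

```python
# Sampling intervals stated in the article, expressed in milliseconds.
GPU_INTERVAL_MS = 100      # one GPU sample every 100 milliseconds
CPU_INTERVAL_MS = 10_000   # one CPU sample every 10 seconds


def samples_per_device(duration_ms, interval_ms):
    """Count how many full sampling intervals fit in a job of the given length."""
    return duration_ms // interval_ms


# Hypothetical job length chosen for illustration: one hour.
ONE_HOUR_MS = 60 * 60 * 1000

gpu_samples = samples_per_device(ONE_HOUR_MS, GPU_INTERVAL_MS)
cpu_samples = samples_per_device(ONE_HOUR_MS, CPU_INTERVAL_MS)
ratio = gpu_samples // cpu_samples
```

    Under these assumptions each GPU yields 100 times as many samples as each CPU, which is the granularity that makes fine-grained workload characterization possible.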

    The team has now launched the MIT Datacenter Challenge to mobilize this research. The challenge invites researchers to use AI techniques to identify with 95 percent accuracy the type of job that was run, using their labeled time-series data as ground truth.

    Such insights could enable data centers to better match a user’s job request with the hardware best suited for it, potentially conserving energy and improving system performance. Classifying workloads could also allow operators to quickly notice discrepancies resulting from hardware failures, inefficient data access patterns, or unauthorized usage.
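
    The classification task can be sketched in a few lines of Python. This is a toy nearest-centroid classifier on made-up utilization traces, not the method or data used in the Datacenter Challenge; the job labels, trace values, and function names are all hypothetical and chosen only to show the shape of the problem: summarize each time series as features, learn from labeled jobs, then score predictions against held-out labels.

```python
import math
import statistics


def extract_features(trace):
    """Summarize a utilization trace (percent values) as (mean, stdev, max)."""
    return (statistics.fmean(trace), statistics.pstdev(trace), max(trace))


def fit_centroids(labeled_traces):
    """Average the feature vectors of each label to get one centroid per label."""
    grouped = {}
    for label, trace in labeled_traces:
        grouped.setdefault(label, []).append(extract_features(trace))
    return {
        label: tuple(statistics.fmean(column) for column in zip(*features))
        for label, features in grouped.items()
    }


def classify(trace, centroids):
    """Predict the label whose centroid is closest to the trace's features."""
    features = extract_features(trace)
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


def accuracy(labeled_traces, centroids):
    """Fraction of traces whose predicted label matches the ground-truth label."""
    hits = sum(classify(trace, centroids) == label for label, trace in labeled_traces)
    return hits / len(labeled_traces)


# Synthetic GPU-utilization traces with hypothetical job labels.
train = [
    ("language_model", [90, 92, 95, 91, 93]),
    ("language_model", [88, 94, 96, 90, 92]),
    ("image_classifier", [40, 80, 35, 85, 45]),
    ("image_classifier", [38, 82, 42, 78, 50]),
]
held_out = [
    ("language_model", [91, 93, 94, 92, 90]),
    ("image_classifier", [45, 79, 41, 83, 39]),
]

centroids = fit_centroids(train)
score = accuracy(held_out, centroids)
```

    Real traces are much longer, noisier, and multivariate (CPU, GPU, and memory together), so reaching the challenge's 95 percent target would likely call for richer features and models than this sketch uses.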

    Too many choices

    Today, the LLSC offers tools that let users submit their job and select the processors they want to use, “but it’s a lot of guesswork on the part of users,” Samsi says. “Somebody might want to use the latest GPU, but maybe their computation doesn’t actually need it and they could get just as impressive results on CPUs, or lower-powered machines.”

    Professor Devesh Tiwari at Northeastern University is working with the LLSC team to develop techniques that can help users match their workloads to appropriate hardware. Tiwari explains that the emergence of different types of AI accelerators, GPUs, and CPUs has left users suffering from too many choices. Without the right tools to take advantage of this heterogeneity, they are missing out on the benefits: better performance, lower costs, and greater productivity.

    “We are fixing this very capability gap — making users more productive and helping users do science better and faster without worrying about managing heterogeneous hardware,” says Tiwari. “My PhD student, Baolin Li, is building new capabilities and tools to help HPC users leverage heterogeneity near-optimally without user intervention, using techniques grounded in Bayesian optimization and other learning-based optimization methods. But, this is just the beginning. We are looking into ways to introduce heterogeneity in our data centers in a principled approach to help our users achieve the maximum advantage of heterogeneity autonomously and cost-effectively.”

    Workload classification is the first of many problems to be posed through the Datacenter Challenge. Others include developing AI techniques to predict job failures, conserve energy, or create job scheduling approaches that improve data center cooling efficiencies.

    Energy conservation 

    To mobilize research into greener computing, the team is also planning to release an environmental dataset of TX-GAIA operations, containing rack temperature, power consumption, and other relevant data.

    According to the researchers, huge opportunities exist to improve the power efficiency of HPC systems being used for AI processing. As one example, recent work in the LLSC determined that simple hardware tuning, such as limiting the amount of power an individual GPU can draw, could reduce the energy cost of training an AI model by 20 percent, with only modest increases in computing time. “This reduction translates to approximately an entire week’s worth of household energy for a mere three-hour time increase,” Gadepally says.

    They have also been developing techniques to predict model accuracy, so that users can quickly terminate experiments that are unlikely to yield meaningful results, saving energy. The Datacenter Challenge will share relevant data to enable researchers to explore other opportunities to conserve energy.

    The team expects that lessons learned from this research can be applied to the thousands of data centers operated by the U.S. Department of Defense. The U.S. Air Force is a sponsor of this work, which is being conducted under the USAF-MIT AI Accelerator.

    Other collaborators include researchers at MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). Professor Charles Leiserson’s Supertech Research Group is investigating performance-enhancing techniques for parallel computing, and research scientist Neil Thompson is designing studies on ways to nudge data center users toward climate-friendly behavior.

    Samsi presented this work at the inaugural AI for Datacenter Optimization (ADOPT’22) workshop last spring as part of the IEEE International Parallel and Distributed Processing Symposium. The workshop officially introduced their Datacenter Challenge to the HPC community.

    “We hope this research will allow us and others who run supercomputing centers to be more responsive to user needs while also reducing the energy consumption at the center level,” Samsi says.

  • in

    New leadership at MIT’s Center for Biomedical Innovation

    As it continues in its mission to improve global health through the development and implementation of biomedical innovation, the MIT Center for Biomedical Innovation (CBI) today announced changes to its leadership team: Stacy Springs has been named executive director, and Professor Richard Braatz has joined as the center’s new associate faculty director.

    The change in leadership comes at a time of rapid development in new therapeutic modalities, growing concern over global access to biologic medicines and healthy food, and widespread interest in applying computational tools and multi-disciplinary approaches to address long-standing biomedical challenges.

    “This marks an exciting new chapter for the CBI,” says faculty director Anthony J. Sinskey, professor of biology, who cofounded CBI in 2005. “As I look back at almost 20 years of CBI history, I see an exponential growth in our activities, educational offerings, and impact.”

    The center’s collaborative research model accelerates innovation in biotechnology and biomedical research, drawing on the expertise of faculty and researchers in MIT’s schools of Engineering and Science, the MIT Schwarzman College of Computing, and the MIT Sloan School of Management.

    Springs steps into the role of executive director having previously served as senior director of programs for CBI and as executive director of CBI’s Biomanufacturing Program and its Consortium on Adventitious Agent Contamination in Biomanufacturing (CAACB). She succeeds Gigi Hirsch, who founded the NEW Drug Development ParadIGmS (NEWDIGS) Initiative at CBI in 2009. Hirsch and NEWDIGS have now moved to Tufts Medical Center, establishing a headquarters at the new Center for Biomedical System Design within the Institute for Clinical Research and Health Policy Studies there.

    Braatz, a chemical engineer whose work is informed by mathematical modeling and computational techniques, conducts research in process data analytics, design, and control of advanced manufacturing systems.

    “It’s been great to interact with faculty from across the Institute who have complementary expertise,” says Braatz, the Edwin R. Gilliland Professor in the Department of Chemical Engineering. “Participating in CBI’s workshops has led to fruitful partnerships with companies in tackling industry-wide challenges.”

    CBI is housed under the Institute for Data Systems and Society and, specifically, the Sociotechnical Systems Research Center in the MIT Schwarzman College of Computing. CBI is home to two biomanufacturing consortia: the CAACB and the Biomanufacturing Consortium (BioMAN). Through these precompetitive collaborations, CBI researchers work with biomanufacturers and regulators to advance shared interests in biomanufacturing.

    In addition, CBI researchers are engaged in several sponsored research programs focused on integrated continuous biomanufacturing capabilities for monoclonal antibodies and vaccines, analytical technologies to measure quality and safety attributes of a variety of biologics, including gene and cell therapies, and rapid-cycle development of virus-like particle vaccines for SARS-CoV-2.

    In another significant initiative, CBI researchers are applying data analytics strategies to biomanufacturing problems. “In our smart data analytics project, we are creating new decision support tools and algorithms for biomanufacturing process control and plant-level decision-making. Further, we are leveraging machine learning and natural language processing to improve post-market surveillance studies,” says Springs.

    CBI is also working on advanced manufacturing for cell and gene therapies, among other new modalities, and is a part of the Singapore-MIT Alliance for Research and Technology – Critical Analytics for Manufacturing Personalized-Medicine (SMART CAMP). SMART CAMP is an international research effort focused on developing the analytical tools and biological understanding of critical quality attributes that will enable the manufacture and delivery of improved cell therapies to patients.

    “This is a crucial time for biomanufacturing and for innovation across the health-care value chain. The collaborative efforts of MIT researchers and consortia members will drive fundamental discovery and inform much-needed progress in industry,” says MIT Vice President for Research Maria Zuber.

    “CBI has a track record of engaging with health-care ecosystem challenges. I am confident that under the new leadership, it will continue to inspire MIT, the United States, and the entire world to improve the health of all people,” adds Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing.

  • in

    Caspar Hare, Georgia Perakis named associate deans of Social and Ethical Responsibilities of Computing

    Caspar Hare and Georgia Perakis have been appointed the new associate deans of the Social and Ethical Responsibilities of Computing (SERC), a cross-cutting initiative in the MIT Stephen A. Schwarzman College of Computing. Their new roles will take effect on Sept. 1.

    “Infusing social and ethical aspects of computing in academic research and education is a critical component of the college mission,” says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing and the Henry Ellis Warren Professor of Electrical Engineering and Computer Science. “I look forward to working with Caspar and Georgia on continuing to develop and advance SERC and its reach across MIT. Their complementary backgrounds and their broad connections across MIT will be invaluable to this next chapter of SERC.”

    Caspar Hare

    Hare is a professor of philosophy in the Department of Linguistics and Philosophy. A member of the MIT faculty since 2003, his main interests are in ethics, metaphysics, and epistemology. The general theme of his recent work has been to bring ideas about practical rationality and metaphysics to bear on issues in normative ethics and epistemology. He is the author of two books: “On Myself, and Other, Less Important Subjects” (Princeton University Press 2009), about the metaphysics of perspective, and “The Limits of Kindness” (Oxford University Press 2013), about normative ethics.

    Georgia Perakis

    Perakis is the William F. Pounds Professor of Management and professor of operations research, statistics, and operations management at the MIT Sloan School of Management, where she has been a faculty member since 1998. She investigates the theory and practice of analytics and its role in operations problems and is particularly interested in how to solve complex and practical problems in pricing, revenue management, supply chains, health care, transportation, and energy applications, among other areas. Since 2019, she has been the co-director of the Operations Research Center, an interdepartmental PhD program that jointly reports to MIT Sloan and the MIT Schwarzman College of Computing, a role in which she will remain. Perakis will also assume an associate dean role at MIT Sloan in recognition of her leadership.

    Hare and Perakis succeed David Kaiser, the Germeshausen Professor of the History of Science and professor of physics, and Julie Shah, the H.N. Slater Professor of Aeronautics and Astronautics, who will be stepping down from their roles at the conclusion of their three-year term on Aug. 31.

    “My deepest thanks to Dave and Julie for their tremendous leadership of SERC and contributions to the college as associate deans,” says Huttenlocher.

    SERC impact

    As the inaugural associate deans of SERC, Kaiser and Shah have been responsible for advancing a mission to incorporate humanist, social science, social responsibility, and civic perspectives into MIT’s teaching, research, and implementation of computing. In doing so, they have engaged dozens of faculty members and thousands of students from across MIT during these first three years of the initiative.

    They have brought together people from a broad array of disciplines to collaborate on crafting original materials such as active learning projects, homework assignments, and in-class demonstrations. A collection of these materials was recently published and is now freely available to the world via MIT OpenCourseWare.

    In February 2021, they launched the MIT Case Studies in Social and Ethical Responsibilities of Computing for undergraduate instruction across a range of classes and fields of study. The specially commissioned and peer-reviewed cases are based on original research and are brief by design. Three issues have been published to date and a fourth will be released later this summer. Kaiser will continue to oversee the successful new series as editor.

    Last year, 60 undergraduates, graduate students, and postdocs joined a community of SERC Scholars to help advance SERC efforts in the college. The scholars participate in unique opportunities throughout the year, such as the summer Experiential Ethics program. Last winter, through SERC, a multidisciplinary team of graduate students worked with the instructors and teaching assistants of class 6.036 (Introduction to Machine Learning), MIT’s largest machine learning course, to infuse weekly labs with material covering ethical computing, data and model bias, and fairness in machine learning.

    Through efforts such as these, SERC has had a substantial impact at MIT and beyond. Over the course of their tenure, Kaiser and Shah have engaged about 80 faculty members, and more than 2,100 students took courses that included new SERC content in the last year alone. SERC’s reach extended well beyond engineering students, with about 500 students exposed to SERC content through courses offered in the School of Humanities, Arts, and Social Sciences, the MIT Sloan School of Management, and the School of Architecture and Planning.

  • in

    A technique to improve both fairness and accuracy in artificial intelligence

    For workers who use machine-learning models to help them make decisions, knowing when to trust a model’s predictions is not always an easy task, especially since these models are often so complex that their inner workings remain a mystery.

    Users sometimes employ a technique known as selective regression, in which the model estimates its confidence level for each prediction and rejects predictions when its confidence is too low. Then a human can examine those cases, gather additional information, and make a decision about each one manually.

    But while selective regression has been shown to improve the overall performance of a model, researchers at MIT and the MIT-IBM Watson AI Lab have discovered that the technique can have the opposite effect for underrepresented groups of people in a dataset. As the model’s confidence increases with selective regression, its chance of making the right prediction also increases, but this does not always happen for all subgroups.

    For instance, a model suggesting loan approvals might make fewer errors on average, but it may actually make more wrong predictions for Black or female applicants. One reason this can occur is that the model’s confidence measure is trained using overrepresented groups and may not be accurate for these underrepresented groups.

    Once they had identified this problem, the MIT researchers developed two algorithms that can remedy the issue. Using real-world datasets, they show that the algorithms reduce performance disparities that had affected marginalized subgroups.

    “Ultimately, this is about being more intelligent about which samples you hand off to a human to deal with. Rather than just minimizing some broad error rate for the model, we want to make sure the error rate across groups is taken into account in a smart way,” says senior MIT author Greg Wornell, the Sumitomo Professor in Engineering in the Department of Electrical Engineering and Computer Science (EECS) who leads the Signals, Information, and Algorithms Laboratory in the Research Laboratory of Electronics (RLE) and is a member of the MIT-IBM Watson AI Lab.

    Joining Wornell on the paper are co-lead authors Abhin Shah, an EECS graduate student, and Yuheng Bu, a postdoc in RLE; as well as Joshua Ka-Wing Lee SM ’17, ScD ’21 and Subhro Das, Rameswar Panda, and Prasanna Sattigeri, research staff members at the MIT-IBM Watson AI Lab. The paper will be presented this month at the International Conference on Machine Learning.

    To predict or not to predict

    Regression is a technique that estimates the relationship between a dependent variable and independent variables. In machine learning, regression analysis is commonly used for prediction tasks, such as predicting the price of a home given its features (number of bedrooms, square footage, etc.). With selective regression, the machine-learning model can make one of two choices for each input: it can make a prediction, or it can abstain if it doesn’t have enough confidence in its decision.

    When the model abstains, it reduces the fraction of samples it is making predictions on, which is known as coverage. By only making predictions on inputs that it is highly confident about, the overall performance of the model should improve. But this can also amplify biases that exist in a dataset, which occur when the model does not have sufficient data from certain subgroups. This can lead to errors or bad predictions for underrepresented individuals.
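
    The trade-off between coverage and error can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the function names, the use of a per-prediction uncertainty score as the confidence measure, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np

def select_by_confidence(uncertainty, coverage):
    """Accept the `coverage` fraction of inputs with the lowest uncertainty.

    Returns a boolean mask; rejected inputs would be deferred to a human.
    (Hypothetical helper: using an uncertainty score as the confidence
    measure is an assumption, not the method from the paper.)
    """
    uncertainty = np.asarray(uncertainty, dtype=float)
    n_accept = int(round(coverage * len(uncertainty)))
    order = np.argsort(uncertainty, kind="stable")  # most confident first
    mask = np.zeros(len(uncertainty), dtype=bool)
    mask[order[:n_accept]] = True
    return mask

def selective_risk(y_true, y_pred, mask):
    """Mean squared error computed over the accepted predictions only."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if not mask.any():
        return float("nan")
    return float(np.mean((y_true[mask] - y_pred[mask]) ** 2))

# Toy values, made up for illustration only.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 2.0, 2.0, 6.0])
uncertainty = np.array([0.1, 0.2, 0.5, 0.9])

mask = select_by_confidence(uncertainty, coverage=0.5)
print("accepted:", mask.tolist())
print("risk at 50% coverage:", selective_risk(y_true, y_pred, mask))
print("risk at full coverage:",
      selective_risk(y_true, y_pred, select_by_confidence(uncertainty, 1.0)))
```

    In this toy case the two least certain predictions are also the least accurate, so abstaining on them lowers the error on the predictions that remain.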

    The MIT researchers aimed to ensure that, as the overall error rate for the model improves with selective regression, the performance for every subgroup also improves. They call this monotonic selective risk.

    “It was challenging to come up with the right notion of fairness for this particular problem. But by enforcing this criteria, monotonic selective risk, we can make sure the model performance is actually getting better across all subgroups when you reduce the coverage,” says Shah.
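
    One way to picture monotonic selective risk is as a check that each subgroup's error does not rise as coverage is reduced. The sketch below only tests for that property on made-up data; it does not implement the researchers' algorithms, and the function names, group labels, and numbers are hypothetical.

```python
import numpy as np

def subgroup_risk_curve(y_true, y_pred, uncertainty, groups, coverages):
    """Per-subgroup mean squared error on accepted samples at each coverage.

    `coverages` should be listed from high to low. The uncertainty score
    used as the confidence measure is an assumption for this sketch.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    groups = np.asarray(groups)
    order = np.argsort(uncertainty, kind="stable")  # most confident first
    sq_err = (y_true - y_pred) ** 2
    curves = {g: [] for g in np.unique(groups).tolist()}
    for c in coverages:
        accepted = np.zeros(len(y_true), dtype=bool)
        accepted[order[: int(round(c * len(y_true)))]] = True
        for g in curves:
            in_group = accepted & (groups == g)
            risk = float(sq_err[in_group].mean()) if in_group.any() else float("nan")
            curves[g].append(risk)
    return curves

def violates_monotonic_risk(curve, tol=1e-12):
    """True if the risk ever rises when moving to a lower coverage level."""
    return any(later > earlier + tol for earlier, later in zip(curve, curve[1:]))

# Made-up data: group "A" is the majority, group "B" the minority.
y_true = np.zeros(8)
y_pred = np.array([0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 3.0, 3.0])
uncertainty = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
groups = np.array(["A", "B", "A", "A", "B", "B", "A", "A"])

curves = subgroup_risk_curve(y_true, y_pred, uncertainty, groups, [1.0, 0.5])
for g, curve in curves.items():
    print(g, curve, "violates:", violates_monotonic_risk(curve))
```

    In this toy data the model is confident but wrong on one sample from group "B", so cutting coverage in half lowers the error for group "A" while raising it for group "B", which is the kind of disparity the criterion is meant to rule out.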

    Focus on fairness

    The team developed two neural network algorithms that impose this fairness criterion to solve the problem.

    One algorithm guarantees that the features the model uses to make predictions contain all information about the sensitive attributes in the dataset, such as race and sex, that is relevant to the target variable of interest. Sensitive attributes are features that are not permitted to be used for decisions, often due to laws or organizational policies. The second algorithm employs a calibration technique to ensure the model makes the same prediction for an input, regardless of whether any sensitive attributes are added to that input.

    The researchers tested these algorithms by applying them to real-world datasets that could be used in high-stakes decision making. One, an insurance dataset, is used to predict total annual medical expenses charged to patients using demographic statistics; another, a crime dataset, is used to predict the number of violent crimes in communities using socioeconomic information. Both datasets contain sensitive attributes for individuals.

    When they implemented their algorithms on top of a standard machine-learning method for selective regression, they were able to reduce disparities by achieving lower error rates for the minority subgroups in each dataset. Moreover, this was accomplished without significantly impacting the overall error rate.

    “We see that if we don’t impose certain constraints, in cases where the model is really confident, it could actually be making more errors, which could be very costly in some applications, like health care. So if we reverse the trend and make it more intuitive, we will catch a lot of these errors. A major goal of this work is to avoid errors going silently undetected,” Sattigeri says.

    The researchers plan to apply their solutions to other applications, such as predicting house prices, student GPA, or loan interest rate, to see if the algorithms need to be calibrated for those tasks, says Shah. They also want to explore techniques that use less sensitive information during the model training process to avoid privacy issues.

    And they hope to improve the confidence estimates in selective regression to prevent situations where the model’s confidence is low, but its prediction is correct. This could reduce the workload on humans and further streamline the decision-making process, Sattigeri says.

    This research was funded, in part, by the MIT-IBM Watson AI Lab and its member companies Boston Scientific, Samsung, and Wells Fargo, and by the National Science Foundation.

  • in

    Costis Daskalakis appointed inaugural Avanessians Professor in the MIT Schwarzman College of Computing

    The MIT Stephen A. Schwarzman College of Computing has named Costis Daskalakis as the inaugural holder of the Avanessians Professorship. His chair began on July 1.

    Daskalakis is the first person appointed to this position generously endowed by Armen Avanessians ’81. Established in the MIT Schwarzman College of Computing, the new chair provides Daskalakis with additional support to pursue his research and develop his career.

    “I’m delighted to recognize Costis for his scholarship and extraordinary achievements with this distinguished professorship,” says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing and the Henry Ellis Warren Professor of Electrical Engineering and Computer Science.

    A professor in the MIT Department of Electrical Engineering and Computer Science, Daskalakis is a theoretical computer scientist who works at the interface of game theory, economics, probability theory, statistics, and machine learning. He has resolved long-standing open problems about the computational complexity of the Nash equilibrium, the mathematical structure and computational complexity of multi-item auctions, and the behavior of machine-learning methods such as the expectation-maximization algorithm. He has obtained computationally and statistically efficient methods for statistical hypothesis testing and learning in high-dimensional settings, as well as results characterizing the structure and concentration properties of high-dimensional distributions. His current work focuses on multi-agent learning, learning from biased and dependent data, causal inference, and econometrics.

    A native of Greece, Daskalakis joined the MIT faculty in 2009. He is a member of the Computer Science and Artificial Intelligence Laboratory and is affiliated with the Laboratory for Information and Decision Systems and the Operations Research Center. He is also an investigator in the Foundations of Data Science Institute.

    He has previously received such honors as the 2018 Nevanlinna Prize from the International Mathematical Union, the 2018 ACM Grace Murray Hopper Award, the Kalai Game Theory and Computer Science Prize from the Game Theory Society, and the 2008 ACM Doctoral Dissertation Award.