More stories

  • in

    Celebrating open data

    The inaugural MIT Prize for Open Data, which included a $2,500 cash prize, was recently awarded to 10 individual and group research projects. Presented jointly by the School of Science and the MIT Libraries, the prize recognizes MIT-affiliated researchers who make their data openly accessible and reusable by others. The prize winners and 16 honorable mention recipients were honored at the Open Data @ MIT event held Oct. 28 at Hayden Library. 

    “By making data open, researchers create opportunities for novel uses of their data and for new insights to be gleaned,” says Chris Bourg, director of MIT Libraries. “Open data accelerates scholarly progress and discovery, advances equity in scholarly participation, and increases transparency, replicability, and trust in science.” 

    Recognizing shared values

    Spearheaded by Bourg and Rebecca Saxe, associate dean of the School of Science and John W. Jarve (1978) Professor of Brain and Cognitive Sciences, the MIT Prize for Open Data was launched to highlight the value of open data at MIT and to encourage the next generation of researchers. Nominations were solicited from across the Institute, with a focus on trainees: research technicians, undergraduate or graduate students, or postdocs.

    “By launching an MIT-wide prize and event, we aimed to create visibility for the scholars who create, use, and advocate for open data,” says Saxe. “Highlighting this research and creating opportunities for networking would also help open-data advocates across campus find each other.” 

    Recognizing researchers who share data was also one of the recommendations of the Ad Hoc Task Force on Open Access to MIT’s Research, which Bourg co-chaired with Hal Abelson, Class of 1922 Professor, Department of Electrical Engineering and Computer Science. An annual award was one of the strategies put forth by the task force to further the Institute’s mission to disseminate the fruits of its research and scholarship as widely as possible.

    Strong competition

    Winners and honorable mentions were chosen from more than 70 nominees, representing all five schools, the MIT Schwarzman College of Computing, and several research centers across MIT. A committee composed of faculty, staff, and a graduate student made the selections:

    Yunsie Chung, graduate student in the Department of Chemical Engineering, won for SolProp, the largest open-source dataset with temperature-dependent solubility values of organic compounds. 
    Matthew Groh, graduate student, MIT Media Lab, accepted on behalf of the team behind the Fitzpatrick 17k dataset, an open dataset consisting of nearly 17,000 images of skin disease alongside skin disease and skin tone annotations. 
    Tom Pollard, research scientist at the Institute for Medical Engineering and Science, accepted on behalf of the PhysioNet team. This data-sharing platform enables thousands of clinical and machine-learning research studies each year and allows researchers to share sensitive resources that could not be shared through typical data-sharing platforms.
    Joseph Replogle, graduate student with the Whitehead Institute for Biomedical Research, was recognized for the Genome-wide Perturb-seq dataset, the largest publicly available, single-cell transcriptional dataset collected to date. 
    Pedro Reynolds-Cuéllar, graduate student with the MIT Media Lab/Art, Culture, and Technology, and Diana Duarte, co-founder at Diversa, won for Retos, an open-data platform for detailed documentation and sharing of local innovations from under-resourced settings. 
    Maanas Sharma, an undergraduate student, led States of Emergency, a nationwide project analyzing and grading the responses of prison systems to Covid-19 using data scraped from public databases and manually collected data. 
    Djuna von Maydell, graduate student in the Department of Brain and Cognitive Sciences, created the first publicly available dataset of single-cell gene expression from postmortem human brain tissue of patients who are carriers of APOE4, the major Alzheimer’s disease risk gene. 
    Raechel Walker, graduate researcher in the MIT Media Lab, and her collaborators created a Data Activism Curriculum for high school students through the Mayor’s Summer Youth Employment Program in Cambridge, Massachusetts. Students learned how to use data science to recognize, mitigate, and advocate for people who are disproportionately impacted by systemic inequality. 
    Suyeol Yun, graduate student in the Department of Political Science, was recognized for DeepWTO, a project creating open data for use in legal natural language processing research using cases from the World Trade Organization. 
    Jonathan Zheng, graduate student in the Department of Chemical Engineering, won for an open IUPAC dataset for acid dissociation constants, or “pKas,” physicochemical properties that govern how acidic a chemical is in a solution.
    A full list of winners and honorable mentions is available on the Open Data @ MIT website.

    A campus-wide celebration

    Awards were presented at a celebratory event held in the Nexus in Hayden Library during International Open Access Week. School of Science Dean Nergis Mavalvala kicked off the program by describing the long and proud history of open scholarship at MIT, citing the Institute-wide faculty open access policy and the launch of the open-source digital repository DSpace. “When I was a graduate student, we were trying to figure out how to share our theses during the days of the nascent internet,” she said. “With DSpace, MIT was figuring it out for us.”

    The centerpiece of the program was a series of five-minute presentations from the prize winners on their research. Presenters detailed the ways they created, used, or advocated for open data, and the value that openness brings to their respective fields. Winner Djuna von Maydell, a graduate student in Professor Li-Huei Tsai’s lab who studies the genetic causes of neurodegeneration, underscored why it is important to share data, particularly data obtained from postmortem human brains. 

    “This is data generated from human brains, so every data point stems from a living, breathing human being, who presumably made this donation in the hope that we would use it to advance knowledge and uncover truth,” von Maydell said. “To maximize the probability of that happening, we have to make it available to the scientific community.” 

    MIT community members who would like to learn more about making their research data open can consult MIT Libraries’ Data Services team.

  • in

    Urbanization: No fast lane to transformation

    Accra, Ghana, “is a city I’ve come to know as well as any place in the U.S.,” says Associate Professor Noah Nathan, who has conducted research there over the past 15 years. The booming capital of 4 million is an ideal laboratory for investigating the rapid urbanization of nations in Africa and beyond, believes Nathan, who joined the MIT Department of Political Science in July.

    “Accra is vibrant and exciting, with gleaming glass office buildings, shopping centers, and an emerging middle class,” he says. “But at the same time there is enormous poverty, with slums and a mixing pot of ethnic groups.” Cities like Accra that have emerged in developing countries around the world are “hybrid spaces” that provoke a multitude of questions for Nathan.

    “Rich and poor are in incredibly close proximity and I want to know how this dramatic inequality can be sustainable, and what politics looks like with such ethnic and class diversity living side-by-side,” he says.

    With his singular approach to data collection and deep understanding of Accra, its neighborhoods, and increasingly, its built environment, Nathan is generating a body of scholarship on the political impacts of urbanization throughout the global South.

    A trap in the urban transition

    Nathan’s early studies of Accra challenged common expectations about how urbanization shifts political behavior.

    “Modernization theory states that as people become more ‘modern’ and move to cities, ethnicity fades and class becomes the dominant dynamic in political behavior,” explains Nathan. “It predicts that the process of urbanization transforms the relationship between politicians and voters, and elections become more ideologically and policy oriented.”

    But in Accra, the heart of one of the fastest-growing economies in the developing world, Nathan found “a type of politics stuck in an old equilibrium, hard to dislodge, and not updated by newly wealthy voters,” he says. Using census data revealing the demographic composition of every neighborhood in Accra, Nathan determined that there were many enclaves in which forms of patronage politics and ethnic competition persist. He conducted sample surveys and collected polling-station level results on residents’ voting across the city. “I was able to merge spatial data on where people lived and their answers to survey questions, and determine how different neighborhoods voted,” says Nathan.

    Among his findings: Ethnic politics were thriving in many parts of Accra, and many middle-class voters were withdrawing from politics entirely in reaction to the well-established practice of patronage rather than pressuring politicians to change their approach. “They decided it was better to look out for themselves,” he explains.

    In Nathan’s 2019 book, “Electoral Politics and Africa’s Urban Transition: Class and Ethnicity in Ghana,” he described this situation as a trap. “As the wealthy exit from the state, politicians double down on patronage politics with poor voters, which the middle class views as further evidence of corruption,” he explains. The wealthier citizens “want more public goods, and big policy reforms, such as changes in the health-care and tax systems, while poor voters focus on immediate needs such as jobs, homes, better schools in their communities.”

    In Ghana and other developing countries where the state’s capacity is limited, politicians can’t deliver on the broad-scale changes desired by the middle class. Motivated by their own political survival, they continue dealing with poor voters as clients, trading services for votes. “I connect urban politics in Ghana to the early 20th-century urban machines in the United States, run by party bosses,” says Nathan.

    This may prove sobering news for many engaged with the developing world. “There’s enormous enthusiasm among foreign aid organizations, in the popular press and policy circles, for the idea that urbanization will usher in big, radical political change,” notes Nathan. “But these kinds of transformations will only come about with structural change such as civil service reforms and nonpartisan welfare programs that can push politicians beyond just delivering targeted services to poor voters.”

    Falling in love with Ghana

    For most of his youth, Nathan was a committed jazz saxophonist, toying with going professional. But he had long cultivated another fascination as well. “I was a huge fan of ‘The West Wing’ in middle school and got into American politics through that,” he says. He volunteered in Hillary Clinton’s 2008 primary campaign during college, but soon realized work in politics was “both more boring and not as idealistic” as he’d hoped.

    As an undergraduate at Harvard University, where he concentrated in government, he “signed up for African history on a lark — because American high schools didn’t teach anything on the subject — and I loved it,” Nathan says. He took another African history course, and then found his way to classes taught by Harvard political scientist Robert H. Bates PhD ’69 that focused on the political economy of development, ethnic conflict, and state failure in Africa. In the summer before his senior year, he served as a research assistant for one of his professors in Ghana, and then stayed longer, hoping to map out a senior thesis on ethnic conflict.

    “Once I got to Ghana, I was fascinated by the place — the dynamism of this rapidly transforming society,” he recalls. “Growing up in the U.S., there are a lot of stereotypes about the developing world, and I quickly realized how much more complicated everything is.”

    These initial experiences living in Ghana shaped Nathan’s ideas for what became his doctoral dissertation at Harvard and first book on the ethnic and class dynamics driving the nation’s politics. His frequent return visits to that country sparked a wealth of research that built on and branched out from this work.

    One set of studies examines the historical development, across the colonial and post-colonial periods, of Ghana’s rural north, the center of ethnic conflict in the 1990s. These are communities “where the state delivers few resources, doesn’t seem to do much, yet figures as a central actor in people’s lives,” he says.

    Part of this region had been a German colony, and the other part was originally under British rule, and Nathan compared the political trajectories of these two areas, focusing on differences in early state efforts to impose new forms of local political leadership and gradually build a formal education system.

    “The colonial legacy in the British areas was elite families who came to dominate, entrenching themselves and creating political dynasties and economic inequality,” says Nathan. But similar ethnic groups exposed to different state policies in the original German colony were not riven with the same class inequalities, and enjoy better access to government services today. “This research is changing how we think about state weakness in the developing world, how we tend to see the emergence of inequality where societal elites come into power,” he says. The results of Nathan’s research will be published in a forthcoming book, “The Scarce State: Inequality and Political Power in the Hinterland.”

    Politics of built spaces

    At MIT, Nathan is pivoting to a fresh new framing for questions on urbanization. Wielding a public source map of cities around the world, he is scrutinizing the geometry of street grids in 1,000 of sub-Saharan Africa’s largest cities “to think about urban order,” he says. Digitizing historical street maps of African cities from the Library of Congress’s map collection, he can look at how these cities were built and evolved physically. “When cities emerge based on grids, rather than tangles, they are more legible to governments,” he says. “This means that it’s easier to find people, easier to govern, tax, repress, and politically mobilize them.”  

    Nathan has begun to demonstrate that in the post-colonial period, “cities that were built under authoritarian regimes tend to be most legible, with even low-capacity regimes trying to impose control and make them gridded.” Democratic governments, he says, “lead to more tangled and chaotic built environments, with people doing what they want.” He also draws comparisons to how state policies shaped urban growth in the United States, with local and federal governments exerting control over neighborhood development, leading to redlining and segregation in many cities.

    Nathan’s interests naturally pull him toward the MIT Governance Lab and Global Diversity Lab. “I’m hoping to dive into both,” he says. “One big attraction of the department is the really interesting research that’s being done on developing countries.” He also plans to use the stature he has built over many years of research in Africa to help “open doors” to African researchers and students, who may not always get the same kind of access to institutions and data that he has had. “I’m hoping to build connections to researchers in the global South,” he says.

  • in

    Ad hoc committee releases report on remote teaching best practices for on-campus education

    The Ad Hoc Committee on Leveraging Best Practices from Remote Teaching for On-Campus Education has released a report that captures how instructors are weaving lessons learned from remote teaching into in-person classes. Despite the challenges imposed by teaching and learning remotely during the Covid-19 pandemic, the report says, “there were seeds planted then that, we hope, will bear fruit in the coming years.”

    “In the long run, one of the best things about having lived through our remote learning experience may be the intense and broad focus on pedagogy that it necessitated,” the report continues. “In a moment when nobody could just teach the way they had always done before, all of us had to go back to first principles and ask ourselves: What are our learning goals for our students? How can we best help them to achieve these goals?”

    The committee’s work is a direct response to one of the Refinement and Implementation Committees (RIC) formed as part of Task Force 2021 and Beyond. Led by co-chairs Krishna Rajagopal, the William A. M. Burden Professor of Physics, and Janet Rankin, director of the MIT Teaching + Learning Lab, the committee engaged with faculty and instructional staff, associate department heads, and undergraduate and graduate officers across MIT.

    The findings are distilled into four broad themes:

    Community, Well-being, and Belonging. Conversations revealed new ways that instructors cultivated these key interrelated concepts, all of which are fundamental to student learning and success. Many instructors focused more on supporting well-being and building community and belonging during the height of the pandemic precisely because the MIT community, and everyone in it, was under such great stress. Some of the resulting practices are continuing, the committee found. Examples include introducing simple gestures, such as start-of-class welcoming practices, and providing extensions and greater flexibility on student assignments. Also, many across MIT felt that the week-long Thanksgiving break offered in 2020 should become a permanent fixture in the academic calendar, because it enhances the well-being of both students and instructors at a time in the fall semester when everyone’s batteries need recharging. 
    Enhancing Engagement. The committee found a variety of practices that have enhanced engagement between students and instructors; among students; and among instructors. For example, many instructors have continued to offer some office hours on Zoom, which seems to reduce barriers to participation for many students, while offering in-person office hours for those who want to take advantage of opportunities for more open-ended conversations. Several departments increased their usage of undergraduate teaching assistants (UTAs) in ways that make students’ learning experience more engaging and give the UTAs a real teaching experience. In addition, many instructors are leveraging out-of-class communication spaces like Slack, Perusall, and Piazza so students can work together, ask questions, and share ideas. 
    Enriching and Augmenting the Learning Environment. The report presents two ways in which instructors have enhanced learning within the classroom: through blended learning and by incorporating authentic experiences. Although blended learning techniques are not new at MIT, after having made it through remote teaching many faculty have found new ways to combine synchronous in-person teaching with asynchronous activities for on-campus students, such as pre-class or pre-lab sequences of videos with exercises interspersed, take-home lab kits, auto-graded online problems that give students immediate feedback, and recorded lab experiences for subsequent review. In addition, instructors found many creative ways to make students’ learning more authentic by going on virtual field trips, using Zoom to bring experts from around the world into MIT classrooms or to enable interactions with students at other universities, and live-streaming experiments that students could not otherwise experience since they cannot be performed in a teaching lab.   
    Assessing Learning. For all its challenges, the report notes, remote teaching prompted instructors to take a step back and think about what they wanted students to learn, how to support it, and how to measure it. The committee found a variety of examples of alternatives to traditional assessments, such as papers or timed, written exams, that instructors tried during the pandemic and are continuing to use. These alternatives include shorter, more frequent, lower-stakes assessments; oral exams or debates; asynchronous, open-book/notes exams; virtual poster sessions; alternate grading schemes; and uploading paper psets and exams into Gradescope to use its logistics and rubrics to improve grading effectiveness and efficiency.
    A large portion of the report is devoted to an extensive, annotated list of best practices from remote instruction that are being used in the classroom. Interestingly, Rankin says, “so many of the strategies and practices developed and used during the pandemic are based on, and supported by, solid educational research.”

    The report concludes with one broad recommendation: that all faculty and instructors read the findings and experiment with some of the best practices in their own instruction. “Our hope is that the practices shared in the report will continue to be adopted, adapted, and expanded by members of the teaching community at MIT, and that instructors’ openness in sharing and learning from each other will continue,” Rankin says.

    Two additional, specific recommendations are included in the report. First, the committee endorses the RIC 16 recommendation that a Classroom Advisory Board be created to provide strategic input grounded in evolving pedagogy about future classroom use and technology needs. In its conversations, the committee found a number of ways that remote teaching and learning have impacted students’ and instructors’ perceptions as they have returned to the classroom. For example, during the pandemic students benefited from being able to see everyone else’s faces on Zoom. As a result, some instructors would prefer classrooms that enable students to face each other, such as semi-circular classrooms instead of rectangular ones.

    More generally, the committee concluded, MIT needs classrooms with seats and tables that can be quickly and flexibly reconfigured to facilitate varying pedagogical objectives. The Classroom Advisory Board could also examine classroom technology; this includes the role of videoconferencing to create authentic engagement between MIT students and people far from campus, and blended learning that allows students to experience more of the in-classroom engagement with their peers and instructors from which the “magic of MIT” originates.

    Second, the committee recommends that an implementation group be formed to investigate the possibility of changing the MIT academic calendar to create a one-week break over Thanksgiving. “Finalizing an implementation plan will require careful consideration of various significant logistical challenges,” the report says. “However, the resulting gains to both well-being and learning from this change to the fall calendar make doing so worthwhile.”

    Rankin notes that the report findings dovetail with the recently released MIT Strategic Action Plan for Belonging, Achievement and Composition. “I believe that one of the most important things that became really apparent during remote teaching was that community, inclusion, and belonging really matter and are necessary for both learning and teaching, and that instructors can and should play a central role in creating structures and processes to support them in their classrooms and other learning environments,” she says.

    Rajagopal finds it inspiring that “during a time of intense stress — that nobody ever wants to relive — there was such an intense focus on how we teach and how our students learn that, today, in essentially every direction we look we see colleagues improving on-campus education for tomorrow. I hope that the report will help instructors across the Institute, and perhaps elsewhere, learn from each other. Its readers will see, as our committee did, new ways in which students and instructors are finding those moments, those interactions, where the magic of MIT is created.”

    In addition to the report, the co-chairs recommend two other valuable remote teaching resources: a video interview series, TLL’s Fresh Perspectives, and Open Learning’s collection of examples of how MIT faculty and instructors leveraged digital technology to support and transform teaching and learning during the heart of the pandemic.

  • in

    Making each vote count

    Graduate student Jacob Jaffe wants to improve the administration of American elections. To do that, he is posing “questions in political science that we haven’t been asking enough,” he says, “and solving them with methods we haven’t been using enough.”

    Considerable research has been devoted to understanding “who votes, and what makes people vote or not vote,” says Jaffe. He is training his attention on questions of a different nature: Does providing practical information to voters about how to cast their ballots change how they will vote? Is it possible to increase the accuracy of vote-counting, on a state-by-state and even precinct-by-precinct basis? How do voters experience polling places? These problems form the core of his dissertation.

    Taking advantage of the resources at the MIT Election Data and Science Lab, where he serves as a researcher, Jaffe conducts novel field experiments to gather highly detailed information on local, state, and federal elections, and analyzes this trove with advanced statistical techniques. Whether investigating the probability of miscounts in voting, or the possibility of changing a voter’s mode of voting, Jaffe intends to strengthen the scaffolding that supports representative government. “Elections are both theoretically and normatively important; they’re the basis of our belief in the moral rightness of the state to do the things the state does,” he says.

    Click this link

    For one of his keystone projects, Jaffe seized a unique opportunity to run a big field experiment. In summer 2020, at the height of the Covid-19 pandemic, he emailed 80,000 Floridians instructions on how to vote in an upcoming primary by mail. His email contained a link enabling recipients to fill out two simple questions to receive a ballot. “I wanted to learn if this was an effective method for getting people to vote by mail, and I proved it is, statistically,” he says. “This is important to know because if elections are held in times when we might need people to vote nonlocally or vote using one method over another — if they’re displaced by a hurricane or another emergency, for instance — I learned that we can effect a new vote mode practically and quickly.”

    One of Jaffe’s insights from this experiment is that “people do read their voting-related emails, but the content of the email has to be something they can act on proximately,” he says. “A message reminding them to vote two weeks from now is not so helpful.” The lower the burden on an individual to participate in voting, whether due to proximity to a polling site or instructions on how to receive and cast a ballot, the greater the likelihood of that person engaging in the election.

    “If we want people to vote by mail, we need to reduce the informational cost so it’s easier for voters to understand how the system works,” he says.

    Another significant research thrust for Jaffe involves scrutinizing accuracy in vote counting, using instances of recounts in presidential elections. Ensuring each vote counts, he says, “is one of the most fundamental questions in democracy.”

    With access to 20 elections in 2020, Jaffe is comparing original vote totals for each candidate to the recounted, correct tally, on a precinct-level basis. “Using original combinatorial techniques, I can estimate the probability of miscounting ballots,” he says. The ultimate goal is to generate a granular picture of the efficacy of election administration across the country.
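
    To make the comparison of original and recounted tallies concrete, here is a minimal sketch of a precinct-level discrepancy measure. It is not Jaffe’s combinatorial method, which the article does not detail; the data layout, the function names, and the numbers are hypothetical, and the measure only reports how many votes changed between counts relative to the recounted total.

```python
from collections import defaultdict

# Hypothetical tallies keyed by (precinct, candidate); not real election data.
original = {("P1", "A"): 510, ("P1", "B"): 490, ("P2", "A"): 300, ("P2", "B"): 705}
recount = {("P1", "A"): 512, ("P1", "B"): 489, ("P2", "A"): 300, ("P2", "B"): 700}


def discrepancy_rate(original, recount):
    """Total absolute change in votes between counts, per recounted vote."""
    keys = set(original) | set(recount)
    changed = sum(abs(original.get(k, 0) - recount.get(k, 0)) for k in keys)
    total = sum(recount.values())
    return changed / total if total else 0.0


def discrepancy_by_precinct(original, recount):
    """Same measure, computed separately for each precinct."""
    changed = defaultdict(int)
    totals = defaultdict(int)
    for key in set(original) | set(recount):
        precinct = key[0]
        changed[precinct] += abs(original.get(key, 0) - recount.get(key, 0))
        totals[precinct] += recount.get(key, 0)
    return {p: (changed[p] / totals[p] if totals[p] else 0.0) for p in totals}


if __name__ == "__main__":
    print(discrepancy_rate(original, recount))
    print(discrepancy_by_precinct(original, recount))
```

    A measure like this treats the recount as the correct tally, as the article does, and says nothing about why a count changed.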

    “It varies a lot by state, and most states do a good job,” he says. States that take their time in counting perform better. “There’s a phenomenon where some towns race to get results in as quickly as possible, and this affects their accuracy.”

    In spite of the bright spots, Jaffe sees chronic underfunding of American elections. “We need to give local administrators the resources, the time and money to fund employees to do their jobs,” he says. The worse the situation is, “the more likely that elections will be called wrong, with no one knowing.” Jaffe believes that his analysis can offer states useful information for improving election administration. “Determining how good a place is historically at counting ballots can help determine the likelihood of needing costly recounts in future elections,” he says.

    The ballot box and beyond

    It didn’t take Jaffe long to decide on a life dedicated to studying politics. Part of a Boston-area family who, he says, “liked discussing what was going on in the world,” he had his own subscriptions to Time magazine at age 9, and to The Economist in middle school. During high school, he volunteered for then-Massachusetts Representative Barney Frank and Senator John Kerry, working on constituent services. At Rice University, he interned all four years with political scientist Robert M. Stein, an expert on voting and elections. With Stein’s help, Jaffe landed a position the summer before his senior year with the Department of Justice (DOJ), researching voting rights cases.

    “The experience was fascinating, and the work felt super important,” says Jaffe. His portfolio involved determining whether legal challenges to particular elections met the statistical standard for racial gerrymandering. “I had to answer hard quantitative questions about the relationship between race and voting in an area, and whether minority candidates were systematically prevented from winning,” he says.

    But while Jaffe cared a lot about this work, he didn’t feel adequately challenged. “As a 21-year-old at DOJ, I learned that I could address problems in the world using statistics,” he says. “But I felt I could have a greater impact addressing tougher questions outside of voting rights.”

    Jaffe was drawn to political science at MIT, and specifically to the research of Charles Stewart III, the Kenan Sahin Distinguished Professor of Political Science, director of the MIT Election Lab, and head of Jaffe’s thesis committee. It wasn’t just the opportunity to plumb the lab’s singular repository of voting data that attracted Jaffe, but its commitment to making every vote count. For Jaffe, this was a call to arms to investigate the many, and sometimes quotidian, obstacles between citizens and ballot boxes.

    To this end, he has been analyzing, with the help of mathematical methods from queuing theory, why some elections involve wait lines of six hours and longer at polling sites. “We know that simpler ballots mean people don’t get stuck in these lines, where they might potentially give up before voting,” he says. “Looking at the content of ballots and the interval between voter check-in and check-out, I learned that adding races, rather than candidates, to a ballot, means that people take more time completing ballots, leading to interminable lines.”
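
    The link between ballot length and line length can be illustrated with a textbook queuing formula. The sketch below is not Jaffe’s model: it assumes a standard M/M/c queue (random Poisson arrivals, exponentially distributed time per ballot, a fixed number of voting booths), and the function name and all inputs are hypothetical.

```python
from math import factorial


def mean_wait_minutes(arrivals_per_min, minutes_per_ballot, booths):
    """Mean time spent waiting in line for an M/M/c queue (Erlang C formula)."""
    service_rate = 1.0 / minutes_per_ballot      # ballots completed per booth per minute
    load = arrivals_per_min / service_rate       # average number of busy booths
    utilization = load / booths
    if utilization >= 1:
        return float("inf")                      # voters arrive faster than booths free up
    top = load**booths / factorial(booths)
    bottom = (1 - utilization) * sum(load**k / factorial(k) for k in range(booths)) + top
    p_wait = top / bottom                        # probability an arriving voter must wait
    return p_wait / (booths * service_rate - arrivals_per_min)


if __name__ == "__main__":
    # Hypothetical polling place: 10 booths, 1.5 voters arriving per minute.
    for minutes in (5, 6, 7):
        print(minutes, mean_wait_minutes(1.5, minutes, 10))
```

    Under these assumptions the wait grows sharply as the time per ballot approaches the booths’ capacity, and the line grows without bound once capacity is exceeded.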

    A key takeaway from his ensemble of studies is that “while it’s relatively rare that elections are bad, we shouldn’t think that we’re good to go,” he says. “Instead, we need to be asking under what conditions do things get bad, and how can we make them better.”

  • in

    AI that can learn the patterns of human language

    Human languages are notoriously complex, and linguists have long thought it would be impossible to teach a machine how to analyze speech sounds and word structures in the way human investigators do.

    But researchers at MIT, Cornell University, and McGill University have taken a step in this direction. They have demonstrated an artificial intelligence system that can learn the rules and patterns of human languages on its own.

    When given words and examples of how those words change to express different grammatical functions (like tense, case, or gender) in one language, this machine-learning model comes up with rules that explain why the forms of those words change. For instance, it might learn that the letter “a” must be added to the end of a word to make the masculine form feminine in Serbo-Croatian.
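
    A toy version of this kind of rule discovery is sketched below. It is far simpler than the researchers' model: it only looks for a single suffix shared by every example pair, and the function names and word list are illustrative choices rather than anything taken from the paper.

```python
def infer_suffix_rule(pairs):
    """Return the one suffix that turns every base form into its inflected form, or None."""
    suffixes = set()
    for base, inflected in pairs:
        if not inflected.startswith(base):
            return None  # the change is not a pure suffix, so this toy learner gives up
        suffixes.add(inflected[len(base):])
    return suffixes.pop() if len(suffixes) == 1 else None


def apply_rule(base, suffix):
    """Apply a learned suffix rule to a new base form."""
    return base + suffix


# Masculine and feminine forms of a few Serbo-Croatian adjectives.
examples = [("mlad", "mlada"), ("nov", "nova"), ("star", "stara")]
rule = infer_suffix_rule(examples)
print(rule)                       # a
print(apply_rule("zelen", rule))  # zelena
```

    Real textbook problems are harder because sound changes interact with word structure, so a single shared suffix is rarely the whole story.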

    This model can also automatically learn higher-level language patterns that can apply to many languages, enabling it to achieve better results.

    The researchers trained and tested the model using problems from linguistics textbooks that featured 58 different languages. Each problem had a set of words and corresponding word-form changes. The model was able to come up with a correct set of rules to describe those word-form changes for 60 percent of the problems.

    This system could be used to study language hypotheses and investigate subtle similarities in the way diverse languages transform words. It is also unique because the system discovers models that can be readily understood by humans, and it acquires these models from small amounts of data, such as a few dozen words. And instead of using one massive dataset for a single task, the system utilizes many small datasets, which is closer to how scientists propose hypotheses: they look at multiple related datasets and come up with models to explain phenomena across those datasets.

    “One of the motivations of this work was our desire to study systems that learn models of datasets that are represented in a way that humans can understand. Instead of learning weights, can the model learn expressions or rules? And we wanted to see if we could build this system so it would learn on a whole battery of interrelated datasets, to make the system learn a little bit about how to better model each one,” says Kevin Ellis ’14, PhD ’20, an assistant professor of computer science at Cornell University and lead author of the paper.

    Joining Ellis on the paper are MIT faculty members Adam Albright, a professor of linguistics; Armando Solar-Lezama, a professor and associate director of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Joshua B. Tenenbaum, the Paul E. Newton Career Development Professor of Cognitive Science and Computation in the Department of Brain and Cognitive Sciences and a member of CSAIL; as well as senior author Timothy J. O’Donnell, assistant professor in the Department of Linguistics at McGill University, and Canada CIFAR AI Chair at the Mila – Quebec Artificial Intelligence Institute.

    The research is published today in Nature Communications.

    Looking at language 

    In their quest to develop an AI system that could automatically learn a model from multiple related datasets, the researchers chose to explore the interaction of phonology (the study of sound patterns) and morphology (the study of word structure).

    Data from linguistics textbooks offered an ideal testbed because many languages share core features, and textbook problems showcase specific linguistic phenomena. Textbook problems can also be solved by college students in a fairly straightforward way, but those students typically have prior knowledge about phonology from past lessons that they use to reason about new problems.

    Ellis, who earned his PhD at MIT and was jointly advised by Tenenbaum and Solar-Lezama, first learned about morphology and phonology in an MIT class co-taught by O’Donnell, who was a postdoc at the time, and Albright.

    “Linguists have thought that in order to really understand the rules of a human language, to empathize with what it is that makes the system tick, you have to be human. We wanted to see if we can emulate the kinds of knowledge and reasoning that humans (linguists) bring to the task,” says Albright.

    To build a model that could learn a set of rules for assembling words, which is called a grammar, the researchers used a machine-learning technique known as Bayesian Program Learning. With this technique, the model solves a problem by writing a computer program.

    In this case, the program is the grammar the model thinks is the most likely explanation of the words and meanings in a linguistics problem. They built the model using Sketch, a popular program synthesizer which was developed at MIT by Solar-Lezama.
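
    The idea of choosing the most likely program can be sketched in a few lines. This is a toy illustration, not the researchers' system and not Sketch: the `SuffixRule` class, the candidate list, and the prior that halves a rule's probability for every extra symbol are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SuffixRule:
    """A tiny 'program': add `suffix`, optionally only after certain final letters."""
    suffix: str
    context: str = ""  # if non-empty, the rule fires only when the stem ends in one of these letters

    def apply(self, stem: str) -> str:
        if self.context and not stem.endswith(tuple(self.context)):
            return stem
        return stem + self.suffix

    def size(self) -> int:
        return len(self.suffix) + len(self.context)


def log_posterior(rule: SuffixRule, pairs) -> float:
    """Unnormalized log posterior: simplicity prior plus an all-or-nothing likelihood."""
    log_prior = -rule.size() * math.log(2)  # assumed prior: shorter rules are more probable
    fits = all(rule.apply(stem) == form for stem, form in pairs)
    return log_prior + (0.0 if fits else -math.inf)


def most_likely_rule(candidates, pairs) -> SuffixRule:
    """Pick the candidate rule with the highest posterior score."""
    return max(candidates, key=lambda rule: log_posterior(rule, pairs))


pairs = [("mlad", "mlada"), ("nov", "nova"), ("star", "stara")]
candidates = [SuffixRule("e"), SuffixRule("a", context="dvr"), SuffixRule("a")]
print(most_likely_rule(candidates, pairs))
```

    Both rules that add “a” explain the data, but the unconditional one is shorter, so the assumed prior ranks it higher. A rule that fails to reproduce any word form is ruled out entirely.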

    But Sketch can take a lot of time to reason about the most likely program. To get around this, the researchers had the model work one piece at a time, writing a small program to explain some data, then writing a larger program that modifies that small program to cover more data, and so on.

    They also designed the model so it learns what “good” programs tend to look like. For instance, it might learn some general rules on simple Russian problems that it would apply to a more complex problem in Polish because the languages are similar. This makes it easier for the model to solve the Polish problem.

    Tackling textbook problems

    When they tested the model using 70 textbook problems, it was able to find a grammar that matched the entire set of words in the problem in 60 percent of cases, and correctly matched most of the word-form changes in 79 percent of problems.

    The researchers also tried pre-programming the model with some knowledge it “should” have learned if it was taking a linguistics course, and showed that it could solve all problems better.

    “One challenge of this work was figuring out whether what the model was doing was reasonable. This isn’t a situation where there is one number that is the single right answer. There is a range of possible solutions which you might accept as right, close to right, etc.,” Albright says.

    The model often came up with unexpected solutions. In one instance, it discovered the expected answer to a Polish language problem, but also another correct answer that exploited a mistake in the textbook. This shows that the model could “debug” linguistics analyses, Ellis says.

    The researchers also conducted tests that showed the model was able to learn some general templates of phonological rules that could be applied across all problems.

    “One of the things that was most surprising is that we could learn across languages, but it didn’t seem to make a huge difference,” says Ellis. “That suggests two things. Maybe we need better methods for learning across problems. And maybe, if we can’t come up with those methods, this work can help us probe different ideas we have about what knowledge to share across problems.”

    In the future, the researchers want to use their model to find unexpected solutions to problems in other domains. They could also apply the technique to more situations where higher-level knowledge can be applied across interrelated datasets. For instance, perhaps they could develop a system to infer differential equations from datasets on the motion of different objects, says Ellis.

    “This work shows that we have some methods which can, to some extent, learn inductive biases. But I don’t think we’ve quite figured out, even for these textbook problems, the inductive bias that lets a linguist accept the plausible grammars and reject the ridiculous ones,” he adds.

    “This work opens up many exciting venues for future research. I am particularly intrigued by the possibility that the approach explored by Ellis and colleagues (Bayesian Program Learning, BPL) might speak to how infants acquire language,” says T. Florian Jaeger, a professor of brain and cognitive sciences and computer science at the University of Rochester, who was not an author of this paper. “Future work might ask, for example, under what additional induction biases (assumptions about universal grammar) the BPL approach can successfully achieve human-like learning behavior on the type of data infants observe during language acquisition. I think it would be fascinating to see whether inductive biases that are even more abstract than those considered by Ellis and his team — such as biases originating in the limits of human information processing (e.g., memory constraints on dependency length or capacity limits in the amount of information that can be processed per time) — would be sufficient to induce some of the patterns observed in human languages.”

    This work was funded, in part, by the Air Force Office of Scientific Research, the Center for Brains, Minds, and Machines, the MIT-IBM Watson AI Lab, the Natural Sciences and Engineering Research Council of Canada, the Fonds de Recherche du Québec – Société et Culture, the Canada CIFAR AI Chairs Program, the National Science Foundation (NSF), and an NSF graduate fellowship.

  • in

    Caspar Hare, Georgia Perakis named associate deans of Social and Ethical Responsibilities of Computing

    Caspar Hare and Georgia Perakis have been appointed the new associate deans of the Social and Ethical Responsibilities of Computing (SERC), a cross-cutting initiative in the MIT Stephen A. Schwarzman College of Computing. Their new roles will take effect on Sept. 1.

    “Infusing social and ethical aspects of computing in academic research and education is a critical component of the college mission,” says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing and the Henry Ellis Warren Professor of Electrical Engineering and Computer Science. “I look forward to working with Caspar and Georgia on continuing to develop and advance SERC and its reach across MIT. Their complementary backgrounds and their broad connections across MIT will be invaluable to this next chapter of SERC.”

    Caspar Hare

    Hare is a professor of philosophy in the Department of Linguistics and Philosophy. A member of the MIT faculty since 2003, his main interests are in ethics, metaphysics, and epistemology. The general theme of his recent work has been to bring ideas about practical rationality and metaphysics to bear on issues in normative ethics and epistemology. He is the author of two books: “On Myself, and Other, Less Important Subjects” (Princeton University Press 2009), about the metaphysics of perspective, and “The Limits of Kindness” (Oxford University Press 2013), about normative ethics.

    Georgia Perakis

    Perakis is the William F. Pounds Professor of Management and professor of operations research, statistics, and operations management at the MIT Sloan School of Management, where she has been a faculty member since 1998. She investigates the theory and practice of analytics and its role in operations problems and is particularly interested in how to solve complex and practical problems in pricing, revenue management, supply chains, health care, transportation, and energy applications, among other areas. Since 2019, she has been the co-director of the Operations Research Center, an interdepartmental PhD program that jointly reports to MIT Sloan and the MIT Schwarzman College of Computing, a role in which she will remain. Perakis will also assume an associate dean role at MIT Sloan in recognition of her leadership.

    Hare and Perakis succeed David Kaiser, the Germeshausen Professor of the History of Science and professor of physics, and Julie Shah, the H.N. Slater Professor of Aeronautics and Astronautics, who will be stepping down from their roles at the conclusion of their three-year term on Aug. 31.

    “My deepest thanks to Dave and Julie for their tremendous leadership of SERC and contributions to the college as associate deans,” says Huttenlocher.

    SERC impact

    As the inaugural associate deans of SERC, Kaiser and Shah have been responsible for advancing a mission to incorporate humanist, social science, social responsibility, and civic perspectives into MIT’s teaching, research, and implementation of computing. In doing so, they have engaged dozens of faculty members and thousands of students from across MIT during these first three years of the initiative.

    They have brought together people from a broad array of disciplines to collaborate on crafting original materials such as active learning projects, homework assignments, and in-class demonstrations. A collection of these materials was recently published and is now freely available to the world via MIT OpenCourseWare.

    In February 2021, they launched the MIT Case Studies in Social and Ethical Responsibilities of Computing for undergraduate instruction across a range of classes and fields of study. The specially commissioned and peer-reviewed cases are based on original research and are brief by design. Three issues have been published to date and a fourth will be released later this summer. Kaiser will continue to oversee the successful new series as editor.

    Last year, 60 undergraduates, graduate students, and postdocs joined a community of SERC Scholars to help advance SERC efforts in the college. The scholars participate in unique opportunities throughout the year, such as the summer Experiential Ethics program. Last winter, a multidisciplinary team of graduate students worked through SERC with the instructors and teaching assistants of class 6.036 (Introduction to Machine Learning), MIT’s largest machine learning course, to infuse weekly labs with material covering ethical computing, data and model bias, and fairness in machine learning.

    Through efforts such as these, SERC has had a substantial impact at MIT and beyond. Over the course of their tenure, Kaiser and Shah have engaged about 80 faculty members, and more than 2,100 students took courses that included new SERC content in the last year alone. SERC’s reach extended well beyond engineering students, with about 500 students exposed to SERC content through courses offered in the School of Humanities, Arts, and Social Sciences, the MIT Sloan School of Management, and the School of Architecture and Planning.

  • in

    MIT welcomes eight MLK Visiting Professors and Scholars for 2022-23

    From space traffic to virus evolution, community journalism to hip-hop, this year’s cohort in the Martin Luther King Jr. (MLK) Visiting Professors and Scholars Program will power an unprecedented range of intellectual pursuits during their time on the MIT campus. 

    “MIT is so fortunate to have this group of remarkable individuals join us,” says Institute Community and Equity Officer John Dozier. “They bring a range and depth of knowledge to share with our students and faculty, and we look forward to working with them to build a stronger sense of community across the Institute.”

    Since its inception in 1990, the MLK Scholars Program has hosted more than 135 visiting professors, practitioners, and intellectuals who enhance and enrich the MIT community through their engagement with students and faculty. The program, which honors the life and legacy of MLK by increasing the presence and recognizing the contributions of underrepresented scholars, is supported by the Office of the Provost with oversight from the Institute Community and Equity Office. 

    In spring 2022, MIT President Rafael Reif committed MIT to adding two new positions in the MLK Visiting Scholars Program, including an expert in Native American studies. Those additional positions will be filled in the coming year.

    The 2022-23 MLK Scholars:

    Daniel Auguste is an assistant professor in the Department of Sociology at Florida Atlantic University and is hosted by Roberto Fernandez in the MIT Sloan School of Management. Auguste’s research interests include social inequalities in entrepreneurship development. During his visit, Auguste will study the impact of education debt burden and wealth inequality on business ownership and success, and how these consequences differ by race and ethnicity.

    Tawanna Dillahunt is an associate professor in the School of Information at the University of Michigan, where she also holds an appointment with the electrical engineering and computer science department. Catherine D’Ignazio in the Department of Urban Studies and Planning and Fotini Christia in the Institute for Data, Systems, and Society are her faculty hosts. Dillahunt’s scholarship focuses on equitable and inclusive computing. She identifies technological opportunities and implements tools to address and alleviate employment challenges faced by marginalized people. Dillahunt’s visiting appointment begins in September 2023.

    Javit Drake ’94 is a principal scientist in modeling and simulation and measurement sciences at Procter & Gamble. His faculty host is Fikile Brushett in the Department of Chemical Engineering. An industry researcher with electrochemical energy expertise, Drake is a Course 10 (chemical engineering) alumnus, repeat lecturer, and research affiliate in the department. During his visit, he will continue to work with the Brushett Research Group to deepen his research and understanding of battery technologies while he innovates from those discoveries.

    Eunice Ferreira is an associate professor in the Department of Theater at Skidmore College and is hosted by Claire Conceison in Music and Theater Arts. This fall, Ferreira will teach “Black Theater Matters,” a course where students will explore performance and the cultural production of Black intellectuals and artists on Broadway and in local communities. Her upcoming book projects include “Applied Theatre and Racial Justice: Radical Imaginings for Just Communities” (forthcoming from Routledge) and “Crioulo Performance: Remapping Creole and Mixed Race Theatre” (forthcoming from Vanderbilt University Press). 

    Wasalu Jaco, widely known as Lupe Fiasco, is a rapper, record producer, and entrepreneur. He will be co-hosted by Nick Montfort of Comparative Media Studies/Writing and Mary Fuller of Literature. Jaco’s interests lie in the nexus of rap, computing, and activism. As a former visiting artist in MIT’s Center for Art, Science and Technology (CAST), he will leverage existing collaborations and participate in digital media and art research projects that use computing to explore novel questions related to hip-hop and rap. In addition to his engagement in cross-departmental projects, Jaco will teach a spring course on rap in the media and social contexts.

    Moriba Jah is an associate professor in the Aerospace Engineering and Engineering Mechanics Department at the University of Texas at Austin. He is hosted by Danielle Wood in Media Arts and Sciences and the Department of Aeronautics and Astronautics, and Richard Linares in the Department of Aeronautics and Astronautics. Jah’s research interests include space sustainability and space traffic management; as a visiting scholar, he will develop and strengthen a joint MIT/UT-Austin research program to increase resources and visibility of space sustainability. Jah will also help host the AeroAstro Rising Stars symposium, which highlights graduate students, postdocs, and early-career faculty from backgrounds underrepresented in aerospace engineering.

    Louis Massiah SM ’82 is a documentary filmmaker and the founder and director of community media of Scribe Video Center, a nonprofit organization that uses media as a tool for social change. His work focuses on empowering Black, Indigenous, and People of Color (BIPOC) filmmakers to tell the stories of/by BIPOC communities. Massiah is hosted by Vivek Bald in Comparative Media Studies/Writing. Massiah’s first project will be the launch of a National Community Media Journalism Consortium, a platform to share local news on a broader scale across communities.

    Brian Nord, a scientist at Fermi National Accelerator Laboratory, will join the Laboratory for Nuclear Science, hosted by Jesse Thaler in the Department of Physics. Nord’s research interests include the connection between ethics, justice, and scientific discovery. His efforts will be aimed at introducing new insights into how we model physical systems, design scientific experiments, and approach the ethics of artificial intelligence. As a lead organizer of the Strike for Black Lives in 2020, Nord will engage with justice-oriented members of the MIT physics community to strategize actions for advocacy and activism.

    Brandon Ogbunu, an assistant professor in the Department of Ecology and Evolutionary Biology at Yale University, will be hosted by Matthew Shoulders in the Department of Chemistry. Ogbunu’s research focus is on implementing chemistry and materials science perspectives into his work on virus evolution. In addition to serving as a guest lecturer in graduate courses, he will be collaborating with the Office of Engineering Outreach Programs on their K-12 outreach and recruitment efforts.

    For more information about these scholars and the program, visit mlkscholars.mit.edu.

  • in

    Exploring emerging topics in artificial intelligence policy

    Members of the public sector, private sector, and academia convened for the second AI Policy Forum Symposium last month to explore critical directions and questions posed by artificial intelligence in our economies and societies.

    The virtual event, hosted by the AI Policy Forum (AIPF) — an undertaking by the MIT Schwarzman College of Computing to bridge high-level principles of AI policy with the practices and trade-offs of governing — brought together an array of distinguished panelists to delve into four cross-cutting topics: law, auditing, health care, and mobility.

    In the last year there have been substantial changes in the regulatory and policy landscape around AI in several countries — most notably in Europe with the development of the European Union Artificial Intelligence Act, the first attempt by a major regulator to propose a law on artificial intelligence. In the United States, the National AI Initiative Act of 2020, which became law in January 2021, is providing a coordinated program across federal government to accelerate AI research and application for economic prosperity and security gains. Finally, China recently advanced several new regulations of its own.

    Each of these developments represents a different approach to legislating AI, but what makes a good AI law? And when should AI legislation be based on binding rules with penalties versus establishing voluntary guidelines?

    Jonathan Zittrain, professor of international law at Harvard Law School and director of the Berkman Klein Center for Internet and Society, says the self-regulatory approach taken during the expansion of the internet had its limitations with companies struggling to balance their interests with those of their industry and the public.

    “One lesson might be that actually having representative government take an active role early on is a good idea,” he says. “It’s just that they’re challenged by the fact that there appears to be two phases in this environment of regulation. One, too early to tell, and two, too late to do anything about it. In AI I think a lot of people would say we’re still in the ‘too early to tell’ stage but given that there’s no middle zone before it’s too late, it might still call for some regulation.”

    A theme that came up repeatedly throughout the first panel on AI laws — a conversation moderated by Dan Huttenlocher, dean of the MIT Schwarzman College of Computing and chair of the AI Policy Forum — was the notion of trust. “If you told me the truth consistently, I would say you are an honest person. If AI could provide something similar, something that I can say is consistent and is the same, then I would say it’s trusted AI,” says Bitange Ndemo, professor of entrepreneurship at the University of Nairobi and the former permanent secretary of Kenya’s Ministry of Information and Communication.

    Eva Kaili, vice president of the European Parliament, adds that “In Europe, whenever you use something, like any medication, you know that it has been checked. You know you can trust it. You know the controls are there. We have to achieve the same with AI.” Kaili further stresses that building trust in AI systems will not only lead to people using more applications in a safe manner, but that AI itself will reap benefits as greater amounts of data will be generated as a result.

    The rapidly increasing applicability of AI across fields has prompted the need to address both the opportunities and challenges of emerging technologies and the impact they have on social and ethical issues such as privacy, fairness, bias, transparency, and accountability. In health care, for example, new techniques in machine learning have shown enormous promise for improving quality and efficiency, but questions of equity, data access and privacy, safety and reliability, and immunology and global health surveillance remain at large.

    MIT’s Marzyeh Ghassemi, an assistant professor in the Department of Electrical Engineering and Computer Science and the Institute for Medical Engineering and Science, and David Sontag, an associate professor of electrical engineering and computer science, collaborated with Ziad Obermeyer, an associate professor of health policy and management at the University of California Berkeley School of Public Health, to organize AIPF Health Wide Reach, a series of sessions to discuss issues of data sharing and privacy in clinical AI. The organizers assembled experts devoted to AI, policy, and health from around the world with the goal of understanding what can be done to decrease barriers to access to high-quality health data to advance more innovative, robust, and inclusive research results while being respectful of patient privacy.

    Over the course of the series, members of the group presented on a topic of expertise and were tasked with proposing concrete policy approaches to the challenge discussed. Drawing on these wide-ranging conversations, participants unveiled their findings during the symposium, covering nonprofit and government success stories and limited access models; upside demonstrations; legal frameworks, regulation, and funding; technical approaches to privacy; and infrastructure and data sharing. The group then discussed some of their recommendations that are summarized in a report that will be released soon.

    One of the findings calls for the need to make more data available for research use. Recommendations that stem from this finding include updating regulations to promote data sharing and enable easier access to safe harbors, such as the one the Health Insurance Portability and Accountability Act (HIPAA) provides for de-identification, as well as expanding funding for private health institutions to curate datasets, among others. Another finding, to remove barriers to data for researchers, supports a recommendation to decrease obstacles to research and development on federally created health data. “If this is data that should be accessible because it’s funded by some federal entity, we should easily establish the steps that are going to be part of gaining access to that so that it’s a more inclusive and equitable set of research opportunities for all,” says Ghassemi. The group also recommends taking a careful look at the ethical principles that govern data sharing. While there are already many principles proposed around this, Ghassemi says that “obviously you can’t satisfy all levers or buttons at once, but we think that this is a trade-off that’s very important to think through intelligently.”

    In addition to law and health care, other facets of AI policy explored during the event included auditing and monitoring AI systems at scale, and the role AI plays in mobility and the range of technical, business, and policy challenges for autonomous vehicles in particular.

    The AI Policy Forum Symposium was an effort to bring together communities of practice with the shared aim of designing the next chapter of AI. In his closing remarks, Aleksander Madry, the Cadence Design Systems Professor of Computing at MIT and faculty co-lead of the AI Policy Forum, emphasized the importance of collaboration and the need for different communities to communicate with each other in order to truly make an impact in the AI policy space.

    “The dream here is that we all can meet together (researchers, industry, policymakers, and other stakeholders) and really talk to each other, understand each other’s concerns, and think together about solutions,” Madry said. “This is the mission of the AI Policy Forum and this is what we want to enable.”