More stories

  •

    Study: AI models fail to reproduce human judgements about rule violations

    In an effort to improve fairness or reduce backlogs, machine-learning models are sometimes designed to mimic human decision making, such as deciding whether social media posts violate toxic content policies.

    But researchers from MIT and elsewhere have found that these models often do not replicate human decisions about rule violations. If models are not trained with the right data, they are likely to make different, often harsher judgements than humans would.

    In this case, the “right” data are those that have been labeled by humans who were explicitly asked whether items defy a certain rule. Training involves showing a machine-learning model millions of examples of this “normative data” so it can learn a task.

    But data used to train machine-learning models are typically labeled descriptively — meaning humans are asked to identify factual features, such as, say, the presence of fried food in a photo. If “descriptive data” are used to train models that judge rule violations, such as whether a meal violates a school policy that prohibits fried food, the models tend to over-predict rule violations.

    This drop in accuracy could have serious implications in the real world. For instance, if a descriptive model is used to make decisions about whether an individual is likely to reoffend, the researchers’ findings suggest it may cast stricter judgements than a human would, which could lead to higher bail amounts or longer criminal sentences.

    “I think most artificial intelligence/machine-learning researchers assume that the human judgements in data and labels are biased, but this result is saying something worse. These models are not even reproducing already-biased human judgments because the data they’re being trained on has a flaw: Humans would label the features of images and text differently if they knew those features would be used for a judgment. This has huge ramifications for machine learning systems in human processes,” says Marzyeh Ghassemi, an assistant professor and head of the Healthy ML Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL).

    Ghassemi is senior author of a new paper detailing these findings, which was published today in Science Advances. Joining her on the paper are lead author Aparna Balagopalan, an electrical engineering and computer science graduate student; David Madras, a graduate student at the University of Toronto; David H. Yang, a former graduate student who is now co-founder of ML Estimation; Dylan Hadfield-Menell, an MIT assistant professor; and Gillian K. Hadfield, Schwartz Reisman Chair in Technology and Society and professor of law at the University of Toronto.

    Labeling discrepancy

    This study grew out of a different project that explored how a machine-learning model can justify its predictions. As they gathered data for that study, the researchers noticed that humans sometimes give different answers if they are asked to provide descriptive or normative labels about the same data.

    To gather descriptive labels, researchers ask labelers to identify factual features — does this text contain obscene language? To gather normative labels, researchers give labelers a rule and ask if the data violates that rule — does this text violate the platform’s explicit language policy?

    Surprised by this finding, the researchers launched a user study to dig deeper. They gathered four datasets to mimic different policies, such as a dataset of dog images that could be in violation of an apartment’s rule against aggressive breeds. Then they asked groups of participants to provide descriptive or normative labels.

    In each case, the descriptive labelers were asked to indicate whether three factual features were present in the image or text, such as whether the dog appears aggressive. Their responses were then used to craft judgements. (If a user said a photo contained an aggressive dog, then the policy was violated.) The labelers did not know the pet policy. On the other hand, normative labelers were given the policy prohibiting aggressive dogs, and then asked whether it had been violated by each image, and why.

    The researchers found that humans were significantly more likely to label an object as a violation in the descriptive setting. The disparity, which they computed using the absolute difference in labels on average, ranged from 8 percent on a dataset of images used to judge dress code violations to 20 percent for the dog images.
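
    As a rough illustration of how such a disparity could be computed, the sketch below derives a violation judgment from each descriptive labeler's feature answers and compares per-item violation rates across the two settings. The aggregation rule (any flagged feature counts as a violation), the function names, and the toy data are assumptions made for illustration; they are not the study's code or results.

```python
from statistics import mean


def descriptive_judgment(features):
    """Turn one labeler's feature answers (0/1 each) into a violation judgment.

    Assumption for illustration: the rule is violated if any feature is present.
    """
    return int(any(features))


def label_disparity(descriptive, normative):
    """Mean absolute difference in per-item violation rates.

    descriptive: {item_id: [feature tuple per labeler]}
    normative:   {item_id: [0/1 violation judgment per labeler]}
    """
    gaps = []
    for item_id, feature_sets in descriptive.items():
        desc_rate = mean(descriptive_judgment(f) for f in feature_sets)
        norm_rate = mean(normative[item_id])
        gaps.append(abs(desc_rate - norm_rate))
    return mean(gaps)


# Hypothetical toy labels, not data from the study.
descriptive = {
    "dog_1": [(1, 0, 0), (0, 0, 0), (1, 1, 0), (0, 0, 0)],
    "dog_2": [(0, 0, 0), (0, 0, 0), (0, 0, 1), (0, 0, 0)],
}
normative = {
    "dog_1": [1, 0, 0, 0],
    "dog_2": [0, 0, 0, 0],
}

disparity = label_disparity(descriptive, normative)
print(f"label disparity: {disparity:.2f}")
```

    On this toy data the descriptive setting flags more violations for both items, which is the direction of the effect the researchers report.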

    “While we didn’t explicitly test why this happens, one hypothesis is that maybe how people think about rule violations is different from how they think about descriptive data. Generally, normative decisions are more lenient,” Balagopalan says.

    Yet data are usually gathered with descriptive labels to train a model for a particular machine-learning task. These data are often repurposed later to train different models that perform normative judgements, like rule violations.

    Training troubles

    To study the potential impacts of repurposing descriptive data, the researchers trained two models to judge rule violations using one of their four data settings. They trained one model using descriptive data and the other using normative data, and then compared their performance.

    They found that if descriptive data are used to train a model, it will underperform a model trained to perform the same judgements using normative data. Specifically, the descriptive model is more likely to misclassify inputs by falsely predicting a rule violation. And the descriptive model’s accuracy was even lower when classifying objects that human labelers disagreed about.
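
    One way to make this comparison concrete is to score both models against normative labels and look at accuracy alongside the false-positive rate, meaning the share of non-violations that a model flags as violations. The sketch below uses hypothetical predictions and a hypothetical helper named evaluate; the numbers are invented for illustration and are not the paper's results.

```python
def evaluate(predictions, normative_labels):
    """Return (accuracy, false_positive_rate), where 1 means 'violation'."""
    pairs = list(zip(predictions, normative_labels))
    correct = sum(p == y for p, y in pairs)
    false_positives = sum(p == 1 and y == 0 for p, y in pairs)
    negatives = sum(y == 0 for _, y in pairs)
    accuracy = correct / len(pairs)
    fpr = false_positives / negatives if negatives else 0.0
    return accuracy, fpr


# Hypothetical test set and model outputs.
normative_labels = [0, 0, 0, 0, 1, 1]
descriptive_model_preds = [1, 1, 0, 0, 1, 1]  # over-predicts violations
normative_model_preds = [0, 1, 0, 0, 1, 1]

desc_acc, desc_fpr = evaluate(descriptive_model_preds, normative_labels)
norm_acc, norm_fpr = evaluate(normative_model_preds, normative_labels)
print(f"descriptive model: accuracy={desc_acc:.2f}, FPR={desc_fpr:.2f}")
print(f"normative model:   accuracy={norm_acc:.2f}, FPR={norm_fpr:.2f}")
```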

    “This shows that the data do really matter. It is important to match the training context to the deployment context if you are training models to detect if a rule has been violated,” Balagopalan says.

    It can be very difficult for users to determine how data have been gathered; this information can be buried in the appendix of a research paper or not revealed by a private company, Ghassemi says.

    Improving dataset transparency is one way this problem could be mitigated. If researchers know how data were gathered, then they know how those data should be used. Another possible strategy is to fine-tune a descriptively trained model on a small amount of normative data. This idea, known as transfer learning, is something the researchers want to explore in future work.

    They also want to conduct a similar study with expert labelers, like doctors or lawyers, to see if it leads to the same label disparity.

    “The way to fix this is to transparently acknowledge that if we want to reproduce human judgment, we must only use data that were collected in that setting. Otherwise, we are going to end up with systems that are going to have extremely harsh moderations, much harsher than what humans would do. Humans would see nuance or make another distinction, whereas these models don’t,” Ghassemi says.

    This research was funded, in part, by the Schwartz Reisman Institute for Technology and Society, Microsoft Research, the Vector Institute, and a Canada Research Council Chair.

  •

    Minimizing electric vehicles’ impact on the grid

    National and global plans to combat climate change include increasing the electrification of vehicles and the percentage of electricity generated from renewable sources. But some projections show that these trends might require costly new power plants to meet peak loads in the evening when cars are plugged in after the workday. What’s more, overproduction of power from solar farms during the daytime can waste valuable electricity-generation capacity.

    In a new study, MIT researchers have found that it’s possible to mitigate or eliminate both these problems without the need for advanced technological systems of connected devices and real-time communications, which could add to costs and energy consumption. Instead, encouraging the placing of charging stations for electric vehicles (EVs) in strategic ways, rather than letting them spring up anywhere, and setting up systems to initiate car charging at delayed times could potentially make all the difference.

    The study, published today in the journal Cell Reports Physical Science, is by Zachary Needell PhD ’22, postdoc Wei Wei, and Professor Jessika Trancik of MIT’s Institute for Data, Systems, and Society.

    In their analysis, the researchers used data collected in two sample cities: New York and Dallas. The data were gathered from, among other sources, anonymized records collected via onboard devices in vehicles, and surveys that carefully sampled populations to cover variable travel behaviors. They showed the times of day cars are used and for how long, and how much time the vehicles spend at different kinds of locations — residential, workplace, shopping, entertainment, and so on.

    The findings, Trancik says, “round out the picture on the question of where to strategically locate chargers to support EV adoption and also support the power grid.”

    Better availability of charging stations at workplaces, for example, could help to soak up peak power being produced at midday from solar power installations, which might otherwise go to waste because it is not economical to build enough battery or other storage capacity to save all of it for later in the day. Thus, workplace chargers can provide a double benefit, helping to reduce the evening peak load from EV charging and also making use of the solar electricity output.

    These effects on the electric power system are considerable, especially if the system must meet charging demands for a fully electrified personal vehicle fleet alongside the peaks in other demand for electricity, for example on the hottest days of the year. If unmitigated, the evening peaks in EV charging demand could require installing upwards of 20 percent more power-generation capacity, the researchers say.

    “Slow workplace charging can be more preferable than faster charging technologies for enabling a higher utilization of midday solar resources,” Wei says.

    Meanwhile, with delayed home charging, each EV charger could be accompanied by a simple app to estimate the time to begin its charging cycle so that it charges just before it is needed the next day. Unlike other proposals that require a centralized control of the charging cycle, such a system needs no interdevice communication of information and can be preprogrammed — and can accomplish a major shift in the demand on the grid caused by increasing EV penetration. The reason it works so well, Trancik says, is because of the natural variability in driving behaviors across individuals in a population.
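
    A minimal sketch of the idea, assuming each charger knows only its own plug-in time, the driver's next departure time, and the hours of charging needed: the charger starts as late as it can while still finishing by departure. The function names, the continuous hour scale (24 is midnight, 30 is 6 a.m. the next day), and the five example vehicles are hypothetical and do not come from the study.

```python
def delayed_start(plug_in, departure, hours_needed):
    """Latest start hour that still completes charging by departure."""
    return max(plug_in, departure - hours_needed)


def peak_concurrent(sessions):
    """Largest number of vehicles charging at once, given (start, end) hours."""
    events = []
    for start, end in sessions:
        events.append((start, 1))
        events.append((end, -1))
    # Sorting puts an end (-1) before a start (+1) at the same hour,
    # so back-to-back sessions are not counted as overlapping.
    events.sort()
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical vehicles: (plug-in hour, departure hour, hours of charging needed).
vehicles = [(18, 30, 3), (18, 31, 2), (19, 32, 3), (18, 33, 2), (19, 29, 3)]

immediate = [(p, p + h) for p, d, h in vehicles]
delayed = [(delayed_start(p, d, h), d) for p, d, h in vehicles]

print("peak with immediate charging:", peak_concurrent(immediate))
print("peak with delayed charging:  ", peak_concurrent(delayed))
```

    Because departure times differ from driver to driver, the delayed sessions spread out on their own, with no communication between chargers; in this toy case the peak falls from five vehicles charging at once to three.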

    By “home charging,” the researchers aren’t only referring to charging equipment in individual garages or parking areas. They say it’s essential to make charging stations available in on-street parking locations and in apartment building parking areas as well.

    Trancik says the findings highlight the value of combining the two measures — workplace charging and delayed home charging — to reduce peak electricity demand, store solar energy, and conveniently meet drivers’ charging needs on all days. As the team showed in earlier research, home charging can be a particularly effective component of a strategic package of charging locations; workplace charging, they have found, is not a good substitute for home charging for meeting drivers’ needs on all days.

    “Given that there’s a lot of public money going into expanding charging infrastructure,” Trancik says, “how do you incentivize the location such that this is going to be efficiently and effectively integrated into the power grid without requiring a lot of additional capacity expansion?” This research offers some guidance to policymakers on where to focus rules and incentives.

    “I think one of the fascinating things about these findings is that by being strategic you can avoid a lot of physical infrastructure that you would otherwise need,” she adds. “Your electric vehicles can displace some of the need for stationary energy storage, and you can also avoid the need to expand the capacity of power plants, by thinking about the location of chargers as a tool for managing demands — where they occur and when they occur.”

    Delayed home charging could make a surprising amount of difference, the team found. “It’s basically incentivizing people to begin charging later. This can be something that is preprogrammed into your chargers. You incentivize people to delay the onset of charging by a bit, so that not everyone is charging at the same time, and that smooths out the peak.”

    Such a program would require some advance commitment on the part of participants. “You would need to have enough people committing to this program in advance to avoid the investment in physical infrastructure,” Trancik says. “So, if you have enough people signing up, then you essentially don’t have to build those extra power plants.”

    It’s not a given that all of this would line up just right, and putting in place the right mix of incentives would be crucial. “If you want electric vehicles to act as an effective storage technology for solar energy, then the [EV] market needs to grow fast enough in order to be able to do that,” Trancik says.

    To best use public funds to help make that happen, she says, “you can incentivize charging installations, which would go through ideally a competitive process — in the private sector, you would have companies bidding for different projects, but you can incentivize installing charging at workplaces, for example, to tap into both of these benefits.” Chargers people can access when they are parked near their residences are also important, Trancik adds, but for other reasons. Home charging is one of the ways to meet charging needs while avoiding inconvenient disruptions to people’s travel activities.

    The study was supported by the European Regional Development Fund Operational Program for Competitiveness and Internationalization, the Lisbon Portugal Regional Operation Program, and the Portuguese Foundation for Science and Technology.

  •

    Report: CHIPS Act just the first step in addressing threats to US leadership in advanced computing

    When Liu He, a Chinese economist, politician, and “chip czar,” was tapped to lead the charge in a chipmaking arms race with the United States, his message lingered in the air, leaving behind a dewy glaze of tension: “For our country, technology is not just for growth… it is a matter of survival.”

    Once upon a time, the United States’ early technological prowess positioned the nation to outpace foreign rivals and cultivate a competitive advantage for domestic businesses. Yet, 30 years later, America’s lead in advanced computing is continuing to wane. What happened?

    A new report from an MIT researcher and two colleagues sheds light on the decline in U.S. leadership. The scientists looked at high-level measures to examine the shrinkage: overall capabilities, supercomputers, applied algorithms, and semiconductor manufacturing. Through their analysis, they found that not only has China closed the computing gap with the U.S., but nearly 80 percent of American leaders in the field believe that their Chinese competitors are improving capabilities faster — which, the team says, suggests a “broad threat to U.S. competitiveness.”

    To delve deeply into the fray, the scientists conducted the Advanced Computing Users Survey, sampling 120 top-tier organizations, including universities, national labs, federal agencies, and industry. The team estimates that this group comprises between one-third and one-half of the most significant computing users in the United States.

    “Advanced computing is crucial to scientific improvement, economic growth and the competitiveness of U.S. companies,” says Neil Thompson, director of the FutureTech Research Project at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), who helped lead the study.

    Thompson, who is also a principal investigator at MIT’s Initiative on the Digital Economy, wrote the paper with Chad Evans, executive vice president and secretary and treasurer to the board at the Council on Competitiveness, and Daniel Armbrust, who is the co-founder, initial CEO, and member of the board of directors at Silicon Catalyst and former president of SEMATECH, the semiconductor consortium that developed industry roadmaps.

    The semiconductor, supercomputer, and algorithm bonanza

    Supercomputers, the room-sized “giant calculators” of the hardware world, are an industry no longer dominated by the United States. Through 2015, about half of the most powerful computers were sitting firmly in the U.S., and China was growing slowly from a very low base. But in the past six years, China has swiftly caught up, reaching near parity with America.

    This disappearing lead matters. Eighty-four percent of U.S. survey respondents said they’re computationally constrained in running essential programs. “This result was telling, given who our respondents are: the vanguard of American research enterprises and academic institutions with privileged access to advanced national supercomputing resources,” says Thompson. 

    With regard to advanced algorithms, the U.S. has historically led the charge, with two-thirds of all significant improvements coming from U.S.-born inventors. But in recent decades, U.S. dominance in algorithms has relied on bringing in foreign talent to work in the U.S., which the researchers say is now in jeopardy. China has outpaced the U.S. and many other countries in churning out PhDs in STEM fields since 2007, with one report projecting that by 2025 China will be home to nearly twice as many PhDs as the U.S. China’s rise in algorithms can also be seen in the Gordon Bell Prize, an award for outstanding work in harnessing the power of supercomputers in varied applications. U.S. winners historically dominated the prize, but China has equaled or surpassed Americans’ performance in the past five years.

    While the researchers note the CHIPS and Science Act of 2022 is a critical step in re-establishing the foundation of success for advanced computing, they propose recommendations to the U.S. Office of Science and Technology Policy. 

    First, they suggest democratizing access to U.S. supercomputing by building more mid-tier systems that push boundaries for many users, as well as building tools so that users scaling up computations need less up-front resource investment. They also recommend increasing the pool of innovators by funding the training of many more electrical engineers and computer scientists, backed by longer-term U.S. residency incentives and scholarships. Finally, in addition to this new framework, the scientists urge taking advantage of what already exists by giving the private sector access to experimentation with high-performance computing through supercomputing sites in academia and national labs.

    All that and a bag of chips

    Computing improvements depend on continuous advances in transistor density and performance, but creating robust, new chips necessitates a harmonious blend of design and manufacturing.

    Until recently, China was not known for designing noteworthy chips; over the past five decades, the U.S. designed most of them. But this changed in the past six years when China created the HiSilicon Kirin 9000, propelling itself to the international frontier. This success was mainly obtained through partnerships with leading global chip designers that began in the 2000s. China now has 14 companies among the world’s top 50 fabless designers. A decade ago, there was only one.

    Progress in competitive semiconductor manufacturing has been more mixed: U.S.-led policies and internal execution issues have slowed China’s rise, but as of July 2022, the Semiconductor Manufacturing International Corporation (SMIC) has shown evidence of 7 nanometer logic, which was not expected until much later. However, with export restrictions on extreme ultraviolet lithography equipment, progress below 7 nm would require expensive domestic technology development. Currently, China is only at parity or better in two out of 12 segments of the semiconductor supply chain. Still, with government policy and investments, the team expects a whopping increase to seven segments in 10 years. So, for the moment, the U.S. retains leadership in hardware manufacturing, but with fewer dimensions of advantage.

    The authors recommend that the White House Office of Science and Technology Policy work with key national agencies, such as the U.S. Department of Defense, U.S. Department of Energy, and the National Science Foundation, to define initiatives to build the hardware and software systems needed for important computing paradigms and workloads critical for economic and security goals. “It is crucial that American enterprises can get the benefit of faster computers,” says Thompson. “With Moore’s Law slowing down, the best way to do this is to create a portfolio of specialized chips (or “accelerators”) that are customized to our needs.”

    The scientists further believe that to lead the next generation of computing, four areas must be addressed. First, by issuing grand challenges to the CHIPS Act National Semiconductor Technology Center, researchers and startups would be motivated to invest in research and development and to seek startup capital for new technologies in areas such as spintronics, neuromorphics, optical and quantum computing, and optical interconnect fabrics. By supporting allies in passing similar acts, overall investment in these technologies would increase, and supply chains would become more aligned and secure. Establishing test beds for researchers to test algorithms on new computing architectures and hardware would provide an essential platform for innovation and discovery. Finally, planning for post-exascale systems that achieve higher levels of performance through next-generation advances would ensure that current commercial technologies don’t limit future computing systems.

    “The advanced computing landscape is in rapid flux — technologically, economically, and politically, with both new opportunities for innovation and rising global rivalries,” says Daniel Reed, Presidential Professor and professor of computer science and electrical and computer engineering at the University of Utah. “The transformational insights from both deep learning and computational modeling depend on both continued semiconductor advances and their instantiation in leading edge, large-scale computing systems — hyperscale clouds and high-performance computing systems. Although the U.S. has historically led the world in both advanced semiconductors and high-performance computing, other nations have recognized that these capabilities are integral to 21st century economic competitiveness and national security, and they are investing heavily.”

    The research was funded, in part, through Thompson’s grant from Good Ventures, which supports his FutureTech Research Group. The paper is being published by the Georgetown Public Policy Review.

  •

    3 Questions: Leo Anthony Celi on ChatGPT and medicine

    Launched in November 2022, ChatGPT is a chatbot that can not only engage in human-like conversation, but also provide accurate answers to questions in a wide range of knowledge domains. The chatbot, created by the firm OpenAI, is based on a family of “large language models” — algorithms that can recognize, predict, and generate text based on patterns they identify in datasets containing hundreds of millions of words.

    In a study appearing in PLOS Digital Health this week, researchers report that ChatGPT performed at or near the passing threshold of the U.S. Medical Licensing Exam (USMLE) — a comprehensive, three-part exam that doctors must pass before practicing medicine in the United States. In an editorial accompanying the paper, Leo Anthony Celi, a principal research scientist at MIT’s Institute for Medical Engineering and Science, a practicing physician at Beth Israel Deaconess Medical Center, and an associate professor at Harvard Medical School, and his co-authors argue that ChatGPT’s success on this exam should be a wake-up call for the medical community.

    Q: What do you think the success of ChatGPT on the USMLE reveals about the nature of the medical education and evaluation of students? 

    A: The framing of medical knowledge as something that can be encapsulated into multiple choice questions creates a cognitive framing of false certainty. Medical knowledge is often taught as fixed model representations of health and disease. Treatment effects are presented as stable over time despite constantly changing practice patterns. Mechanistic models are passed on from teachers to students with little emphasis on how robustly those models were derived, the uncertainties that persist around them, and how they must be recalibrated to reflect advances worthy of incorporation into practice. 

    ChatGPT passed an examination that rewards memorizing the components of a system rather than analyzing how it works, how it fails, how it was created, how it is maintained. Its success demonstrates some of the shortcomings in how we train and evaluate medical students. Critical thinking requires appreciation that ground truths in medicine continually shift, and more importantly, an understanding of how and why they shift.

    Q: What steps do you think the medical community should take to modify how students are taught and evaluated?  

    A: Learning is about leveraging the current body of knowledge, understanding its gaps, and seeking to fill those gaps. It requires being comfortable with and being able to probe the uncertainties. We fail as teachers by not teaching students how to understand the gaps in the current body of knowledge. We fail them when we preach certainty over curiosity, and hubris over humility.  

    Medical education also requires being aware of the biases in the way medical knowledge is created and validated. These biases are best addressed by optimizing the cognitive diversity within the community. More than ever, there is a need to inspire cross-disciplinary collaborative learning and problem-solving. Medical students need data science skills that will allow every clinician to contribute to, continually assess, and recalibrate medical knowledge.

    Q: Do you see any upside to ChatGPT’s success in this exam? Are there beneficial ways that ChatGPT and other forms of AI can contribute to the practice of medicine? 

    A: There is no question that large language models (LLMs) such as ChatGPT are very powerful tools in sifting through content beyond the capabilities of experts, or even groups of experts, and extracting knowledge. However, we will need to address the problem of data bias before we can leverage LLMs and other artificial intelligence technologies. The body of knowledge that LLMs train on, both medical and beyond, is dominated by content and research from well-funded institutions in high-income countries. It is not representative of most of the world.

    We have also learned that even mechanistic models of health and disease may be biased. These inputs are fed to encoders and transformers that are oblivious to these biases. Ground truths in medicine are continuously shifting, and currently, there is no way to determine when ground truths have drifted. LLMs do not evaluate the quality and the bias of the content they are being trained on. Neither do they provide the level of uncertainty around their output. But the perfect should not be the enemy of the good. There is tremendous opportunity to improve the way health care providers currently make clinical decisions, which we know are tainted with unconscious bias. I have no doubt AI will deliver its promise once we have optimized the data input.

  •

    Research, education, and connection in the face of war

    When Russian forces invaded Ukraine in February 2022, Tetiana Herasymova had several decisions to make: What should she do, where should she live, and should she take her MITx MicroMasters capstone exams? She had registered for the Statistics and Data Science Program’s final exams just days prior to moving out of her apartment and into a bomb shelter. Although it was difficult to focus on studying and preparations with air horns sounding overhead and uncertainty lingering around her, she was determined to try. “I wouldn’t let the aggressor in the war squash my dreams,” she says.

    A love of research and the desire to improve teaching 

    An early love of solving puzzles and problems for fun piqued Herasymova’s initial interest in mathematics. When she later pursued her PhD in mathematics at Kiev National Taras Shevchenko University, Herasymova’s love of math evolved into a love of research. Throughout Herasymova’s career, she’s worked to close the gap between scientific researchers and educators. Starting as a math tutor at MBA Strategy, a company that prepares Ukrainian leaders for qualifying standardized tests for MBA programs, she was later promoted to head of their test preparation department. Afterward, she moved on to an equivalent position at ZNOUA, a new project that prepared high school students for Ukraine’s standardized test, and she eventually became ZNOUA’s CEO.

    In 2018, she founded Prosteer, a “self-learning community” of educators who share research, pedagogy, and experience to learn from one another. “It’s really interesting to have a community of teachers from different domains,” she says, speaking of educators and researchers whose specialties range across language, mathematics, physics, music, and more.

    Implementing new pedagogical research in the classroom is often up to educators who seek out studies on an individual basis, Herasymova has found. “Lots of scientists are not practitioners,” she says, and the reverse is also true. She only became more determined to build these connections once she was promoted to head of test preparation at MBA Strategy because she wanted to share more effective pedagogy with the tutors she was mentoring.

    First, Herasymova knew she needed a way to measure the teachers’ effectiveness. She was able to determine whether students who received the company’s tutoring services improved their scores. Moreover, Ukraine keeps an open-access database of national standardized test scores, so anyone could analyze the data in hopes of improving the level of education in the country. She says, “I could do some analytics because I am a mathematician, but I knew I could do much more with this data if I knew data science and machine learning knowledge.”

    That’s why Herasymova sought out the MITx MicroMasters Program in Statistics and Data Science offered by the MIT Institute for Data, Systems, and Society (IDSS). “I wanted to learn the fundamentals so I could join the Learning Analytics domain,” she says. She was looking for a comprehensive program that covered the foundations without being overly basic. “I had some knowledge from the ground, so I could see the deepness of that course,” she says. Because of her background as an instructional designer, she thought the MicroMasters curriculum was well-constructed, calling the variety of videos, practice problems, and homework assignments that encouraged learners to approach the course material in different ways, “a perfect experience.”

    Another benefit of the MicroMasters program was its online format. “I had my usual work, so it was impossible to study in a stationary way,” she says. She found the structure to be more flexible than other programs. “It’s really great that you can construct your course schedule your own way, especially with your own adult life,” she says.

    Determination and support in the midst of war

    When the war first forced Herasymova to flee her apartment, she had already registered to take the exams for her four courses. “It was quite hard to prepare for exams when you could hear explosions outside of the bomb shelter,” she says. She and other Ukrainians were invited to postpone their exams until the following session, but the next available testing period wouldn’t be held until October. “It was a hard decision, but I had to allow myself to try,” she says. “For all people in Ukraine, when you don’t know if you’re going to live or die, you try to live in the now. You have to appreciate every moment and what life brings to you. You don’t say, ‘Someday’; you do it today or tomorrow.”

    In addition to emotional support from her boyfriend, Herasymova had a group of friends who had also enrolled in the program, and they supported each other through study sessions and an ongoing chat. Herasymova’s personal support network helped her accomplish what she set out to do with her MicroMasters program, and in turn, she was able to support her professional network. While Prosteer halted its regular work during the early stages of the war, Herasymova was determined to support the community of educators and scientists that she had built. They continued meeting weekly to exchange ideas as usual. “It’s intrinsic motivation,” she says. They managed to restore all of their activities by October.

    Despite the factors stacked against her, Herasymova’s determination paid off — she passed all of her exams in May, the final step to earning her MicroMasters certificate in statistics and data science. “I just couldn’t believe it,” she says. “It was definitely a bifurcation point. The moment when you realize that you have something to rely on, and that life is just beginning to show all its diversity despite the fact that you live in war.” With her newly minted certificate in hand, Herasymova has continued her research on the effectiveness of educational models — analyzing the data herself — with a summer research program at New York University. 

    The student becomes the master

    After moving seven times between February and October, heading west from Kyiv until most recently settling near the border of Poland, Herasymova hopes she’s moved for the last time. Ukrainian Catholic University offered her a position teaching both mathematics and programming. Before enrolling in the MicroMasters Program in Statistics and Data Science, she had some prior knowledge of programming languages and mathematical algorithms, but she didn’t know Python. She took MITx’s Introduction to Computer Science and Programming Using Python to prepare. “It gave me a huge step forward,” she says. “I learned a lot. Now, not only can I work with Python machine learning models in programming language R, I also have knowledge of the big picture of the purpose and the point to do so.”

    In addition to the skills the MicroMasters Program trained her in, she gained firsthand experience in learning new subjects and exploring topics more deeply. She will be sharing that practice with the community of students and teachers she’s built, and she plans on guiding them through this course during the next year. As a continuation of her own educational growth, she says she’s looking forward to her next MITx course this year, Data Analysis.

    Herasymova advises that the best way to keep progressing is investing a lot of time. “Adults don’t want to hear this, but you need one or two years,” she says. “Allow yourself to be stupid. If you’re an expert in one domain and want to switch to another, or if you want to understand something new, a lot of people don’t ask questions or don’t ask for help. But from this point, if I don’t know something, I know I should ask for help because that’s the start of learning. With a fixed mindset, you won’t grow.”

    July 2022 MicroMasters Program Joint Completion Celebration. Ukrainian student Tetiana Herasymova, who completed her program amid war in her home country, speaks at 43:55.

  • in

    Gaining real-world industry experience through Break Through Tech AI at MIT

    Taking what they learned conceptually about artificial intelligence and machine learning (ML) this year, students from across the Greater Boston area had the opportunity to apply their new skills to real-world industry projects as part of an experiential learning opportunity offered through Break Through Tech AI at MIT.

    Hosted by the MIT Schwarzman College of Computing, Break Through Tech AI is a pilot program that aims to bridge the talent gap for women and underrepresented genders in computing fields by providing skills-based training, industry-relevant portfolios, and mentoring to undergraduate students in regional metropolitan areas in order to position them more competitively for careers in data science, machine learning, and artificial intelligence.

    “Programs like Break Through Tech AI gives us opportunities to connect with other students and other institutions, and allows us to bring MIT’s values of diversity, equity, and inclusion to the learning and application in the spaces that we hold,” says Alana Anderson, assistant dean of diversity, equity, and inclusion for the MIT Schwarzman College of Computing.

    The inaugural cohort of 33 undergraduates from 18 Greater Boston-area schools, including Salem State University, Smith College, and Brandeis University, began the free, 18-month program last summer with an eight-week, online skills-based course to learn the basics of AI and machine learning. Students then split into small groups in the fall to collaborate on six machine learning challenge projects presented to them by MathWorks, MIT-IBM Watson AI Lab, and Replicate. The students dedicated five hours or more each week to meet with their teams, teaching assistants, and project advisors, including convening once a month at MIT, while juggling their regular academic course load with other daily activities and responsibilities.

    The challenges gave the undergraduates the chance to contribute to actual projects that industry organizations are working on and to put their machine learning skills to the test. Members from each organization also served as project advisors, providing encouragement and guidance to the teams throughout.

    “Students are gaining industry experience by working closely with their project advisors,” says Aude Oliva, director of strategic industry engagement at the MIT Schwarzman College of Computing and the MIT director of the MIT-IBM Watson AI Lab. “These projects will be an add-on to their machine learning portfolio that they can share as a work example when they’re ready to apply for a job in AI.”

    Over the course of 15 weeks, teams delved into large-scale, real-world datasets to train, test, and evaluate machine learning models in a variety of contexts.

    In December, the students celebrated the fruits of their labor at a showcase event held at MIT in which the six teams gave final presentations on their AI projects. The projects not only allowed the students to build up their AI and machine learning experience, they also helped to “improve their knowledge base and skills in presenting their work to both technical and nontechnical audiences,” Oliva says.

    For a project on traffic data analysis, students were trained on MATLAB, a programming and numeric computing platform developed by MathWorks, to create a model that enables decision-making in autonomous driving by predicting future vehicle trajectories. “It’s important to realize that AI is not that intelligent. It’s only as smart as you make it and that’s exactly what we tried to do,” said Brandeis University student Srishti Nautiyal as she introduced her team’s project to the audience. With companies already making autonomous vehicles from planes to trucks a reality, Nautiyal, a physics and mathematics major, shared that her team was also highly motivated to consider the ethical issues of the technology in their model for the safety of passengers, drivers, and pedestrians.

    Using census data to train a model can be tricky because they are often messy and full of holes. In a project on algorithmic fairness for the MIT-IBM Watson AI Lab, the hardest task for the team was having to clean up mountains of unorganized data in a way that still allowed them to gain insights from them. The project, which aimed to create a demonstration of fairness applied to a real dataset in order to evaluate and compare the effectiveness of different fairness interventions and fair metric learning techniques, could eventually serve as an educational resource for data scientists interested in learning about fairness in AI and using it in their work, as well as promote the practice of evaluating the ethical implications of machine learning models in industry.

    Other challenge projects included an ML-assisted whiteboard for nontechnical people to interact with ready-made machine learning models, and a sign language recognition model to help disabled people communicate with others. A team that worked on a visual language app set out to include over 50 languages in their model to increase access for the millions of people who are visually impaired throughout the world. According to the team, similar apps on the market currently only offer up to 23 languages.

    Throughout the semester, students persisted and demonstrated grit in order to cross the finish line on their projects. With the final presentations marking the conclusion of the fall semester, students will return to MIT in the spring to continue their Break Through Tech AI journey to tackle another round of AI projects. This time, the students will work with Google on new machine learning challenges that will enable them to hone their AI skills even further with an eye toward launching a successful career in AI.

  • in

    Q&A: A fresh look at data science

    As the leaders of a developing field, data scientists must often deal with a frustratingly slippery question: What is data science, precisely, and what is it good for?

    Alfred Spector is a visiting scholar in the MIT Department of Electrical Engineering and Computer Science (EECS), an influential developer of distributed computing systems and applications, and a successful tech executive with companies including IBM and Google. Along with three co-authors — Peter Norvig at Stanford University and Google, Chris Wiggins at Columbia University and The New York Times, and Jeannette M. Wing at Columbia — Spector recently published “Data Science in Context: Foundations, Challenges, Opportunities” (Cambridge University Press), which provides a broad, conversational overview of the wide-ranging field driving change in sectors ranging from health care to transportation to commerce to entertainment. 

    Here, Spector talks about data-driven life, what makes a good data scientist, and how his book came together during the height of the Covid-19 pandemic.

    Q: One of the most common buzzwords Americans hear is “data-driven,” but many might not know what that term is supposed to mean. Can you unpack it for us?

    A: Data-driven broadly refers to techniques or algorithms powered by data — they either provide insight or reach conclusions, say, a recommendation or a prediction. The algorithms power models which are increasingly woven into the fabric of science, commerce, and life, and they often provide excellent results. The list of their successes is really too long to even begin to list. However, one concern is that the proliferation of data makes it easy for us as students, scientists, or just members of the public to jump to erroneous conclusions. As just one example, our own confirmation biases make us prone to believing some data elements or insights “prove” something we already believe to be true. Additionally, we often tend to see causal relationships where the data only shows correlation. It might seem paradoxical, but data science makes critical reading and analysis of data all the more important.

    Q: What, to your mind, makes a good data scientist?

    A: [In talking to students and colleagues] I optimistically emphasize the power of data science and the importance of gaining the computational, statistical, and machine learning skills to apply it. But, I also remind students that we are obligated to solve problems well. In our book, Chris [Wiggins] paraphrases danah boyd, who says that a successful application of data science is not one that merely meets some technical goal, but one that actually improves lives. More specifically, I exhort practitioners to provide a real solution to problems, or else clearly identify what we are not solving so that people see the limitations of our work. We should be extremely clear so that we do not generate harmful results or lead others to erroneous conclusions. I also remind people that all of us, including scientists and engineers, are human and subject to the same human foibles as everyone else, such as various biases. 

    Q: You discuss Covid-19 in your book. While some short-range models for mortality were very accurate during the heart of the pandemic, you note the failure of long-range models to predict any of 2020’s four major geotemporal Covid waves in the United States. Do you feel Covid was a uniquely hard situation to model? 

    A: Covid was particularly difficult to predict over the long term because of many factors — the virus was changing, human behavior was changing, political entities changed their minds. Also, we didn’t have fine-grained mobility data (perhaps, for good reasons), and we lacked sufficient scientific understanding of the virus, particularly in the first year.

    I think there are many other domains which are similarly difficult. Our book teases out many reasons why data-driven models may not be applicable. Perhaps it’s too difficult to get or hold the necessary data. Perhaps the past doesn’t predict the future. If data models are being used in life-and-death situations, we may not be able to make them sufficiently dependable; this is particularly true as we’ve seen all the motivations that bad actors have to find vulnerabilities. So, as we continue to apply data science, we need to think through all the requirements we have, and the capability of the field to meet them. They often align, but not always. And, as data science seeks to solve problems into ever more important areas such as human health, education, transportation safety, etc., there will be many challenges.

    Q: Let’s talk about the power of good visualization. You mention the popular early-2000s Baby Name Voyager website as one that changed your view on the importance of data visualization. Tell us how that happened.

    A: That website, recently reborn as the Name Grapher, had two characteristics that I thought were brilliant. First, it had a really natural interface, where you type the initial characters of a name and it shows a frequency graph of all the names beginning with those letters, and their popularity over time. Second, it’s so much better than a spreadsheet with 140 columns representing years and rows representing names, despite the fact it contains no extra information. It also provided instantaneous feedback with its display graph dynamically changing as you type. To me, this showed the power of a very simple transformation that is done correctly.
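    The “very simple transformation” Spector describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the website’s actual code: the `COUNTS` table holds made-up numbers, and `series_for_prefix` is a hypothetical helper that filters names by the typed prefix and sums their counts per year, which is the series a frequency graph would plot.

```python
from collections import defaultdict

# Made-up counts for illustration only; these are not real name statistics.
COUNTS = {
    "Ada":   {1900: 120, 1950: 40, 2000: 15},
    "Adam":  {1900: 30, 1950: 200, 2000: 310},
    "Alice": {1900: 500, 1950: 260, 2000: 90},
}

def series_for_prefix(prefix, counts=COUNTS):
    """Return (matching names, total count per year) for a typed prefix."""
    prefix = prefix.lower()
    # Keep only the names that start with the characters typed so far.
    matches = {name: by_year for name, by_year in counts.items()
               if name.lower().startswith(prefix)}
    # Sum the per-year counts across every matching name.
    totals = defaultdict(int)
    for by_year in matches.values():
        for year, count in by_year.items():
            totals[year] += count
    return sorted(matches), dict(sorted(totals.items()))

names, totals = series_for_prefix("ad")
print(names, totals)
```

    Re-running the function on each keystroke is what would give the instantaneous feedback described in the answer; the underlying table is the same one a spreadsheet would hold.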

    Q: When you and your co-authors began planning “Data Science in Context,” what did you hope to offer?

    A: We portray present data science as a field that’s already had enormous benefits, that provides even more future opportunities, but one that requires equally enormous care in its use. Referencing the word “context” in the title, we explain that the proper use of data science must consider the specifics of the application, the laws and norms of the society in which the application is used, and even the time period of its deployment. And, importantly for an MIT audience, the practice of data science must go beyond just the data and the model to the careful consideration of an application’s objectives, its security, privacy, abuse, and resilience risks, and even the understandability it conveys to humans. Within this expansive notion of context, we finally explain that data scientists must also carefully consider ethical trade-offs and societal implications.

    Q: How did you keep focus throughout the process?

    A: Much like in open-source projects, I played both the coordinating author role and also the role of overall librarian of all the material, but we all made significant contributions. Chris Wiggins is very knowledgeable on the Belmont principles and applied ethics; he was the major contributor of those sections. Peter Norvig, as the coauthor of a bestselling AI textbook, was particularly involved in the sections on building models and causality. Jeannette Wing worked with me very closely on our seven-element Analysis Rubric and recognized that a checklist for data science practitioners would end up being one of our book’s most important contributions. 

    From a nuts-and-bolts perspective, we wrote the book during Covid, using one large shared Google doc with weekly video conferences. Amazingly enough, Chris, Jeannette, and I didn’t meet in person at all, and Peter and I met only once — sitting outdoors on a wooden bench on the Stanford campus.

    Q: That is an unusual way to write a book! Do you recommend it?

    A: It would be nice to have had more social interaction, but a shared document, at least with a coordinating author, worked pretty well for something up to this size. The benefit is that we always had a single, coherent textual base, not dissimilar to how a programming team works together.

    This is a condensed, edited version of a longer interview that originally appeared on the MIT EECS website.

  • in

    Simulating discrimination in virtual reality

    Have you ever been advised to “walk a mile in someone else’s shoes?” Considering another person’s perspective can be a challenging endeavor — but recognizing our errors and biases is key to building understanding across communities. By challenging our preconceptions, we confront prejudice, such as racism and xenophobia, and potentially develop a more inclusive perspective about others.

    To assist with perspective-taking, MIT researchers have developed “On the Plane,” a virtual reality role-playing game (VR RPG) that simulates discrimination. In this case, the game portrays xenophobia directed against a Malaysian American woman, but the approach can be generalized. Situated on an airplane, players can take on the role of characters from different backgrounds, engaging in dialogue with others while making in-game choices in response to a series of prompts. In turn, players’ decisions control the outcome of a tense conversation between the characters about cultural differences.

    As a VR RPG, “On the Plane” encourages players to take on new roles that may be outside of their personal experiences in the first person, allowing them to confront in-group/out-group bias by incorporating new perspectives into their understanding of different cultures. Players can take on one of three roles: Sarah, a first-generation Muslim American of Malaysian ancestry who wears a hijab; Marianne, a white woman from the Midwest with little exposure to other cultures and customs; or a flight attendant. Sarah represents the out group, Marianne is a member of the in group, and the flight staffer is a bystander witnessing an exchange between the two passengers.

    “This project is part of our efforts to harness the power of virtual reality and artificial intelligence to address social ills, such as discrimination and xenophobia,” says Caglar Yildirim, an MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) research scientist who is a co-author and co-game designer on the project. “Through the exchange between the two passengers, players experience how one passenger’s xenophobia manifests itself and how it affects the other passenger. The simulation engages players in critical reflection and seeks to foster empathy for the passenger who was ‘othered’ due to her outfit being not so ‘prototypical’ of what an American should look like.”

    Yildirim worked alongside the project’s principal investigator, D. Fox Harrell, MIT professor of digital media and AI at CSAIL, the Program in Comparative Media Studies/Writing (CMS), and the Institute for Data, Systems, and Society (IDSS) and founding director of the MIT Center for Advanced Virtuality. “It is not possible for a simulation to give someone the life experiences of another person, but while you cannot ‘walk in someone else’s shoes’ in that sense, a system like this can help people recognize and understand the social patterns at work when it comes to issues like bias,” says Harrell, who is also co-author and designer on this project. “An engaging, immersive, interactive narrative can also impact people emotionally, opening the door for users’ perspectives to be transformed and broadened.”

    This simulation also utilizes an interactive narrative engine that creates several options for responses to in-game interactions based on a model of how people are categorized socially. The tool grants players a chance to alter their standing in the simulation through their reply choices to each prompt, affecting their affinity toward the other two characters. For example, if you play as the flight attendant, you can react to Marianne’s xenophobic expressions and attitudes toward Sarah, changing your affinities. The engine will then provide you with a different set of narrative events based on your changes in standing with others.

    To animate each avatar, “On the Plane” incorporates artificial intelligence knowledge representation techniques controlled by probabilistic finite state machines, a tool commonly used in machine learning systems for pattern recognition. With the help of these machines, characters’ body language and gestures are customizable: if you play as Marianne, the game will customize her mannerisms toward Sarah based on user inputs, impacting how comfortable she appears in front of a member of a perceived out group. Players can do the same from Sarah’s or the flight attendant’s point of view.

    In a 2018 paper based on work done in a collaboration between MIT CSAIL and the Qatar Computing Research Institute, Harrell and co-author Sercan Şengün advocated for virtual system designers to be more inclusive of Middle Eastern identities and customs. They claimed that if designers allowed users to customize virtual avatars more representative of their background, it might empower players to engage in a more supportive experience. Four years later, “On the Plane” accomplishes a similar goal, incorporating a Muslim’s perspective into an immersive environment.
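    For readers unfamiliar with the probabilistic finite state machines mentioned above, the idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the game’s implementation: the state names, the two kinds of player input, and every probability are hypothetical. The machine keeps one current state, and each player input selects a row of probabilities from which the next state is drawn at random.

```python
import random

# Hypothetical body-language states for an avatar; not taken from the game.
STATES = ("relaxed", "guarded", "tense")

# TRANSITIONS[player_input][current_state] maps each next state to its
# probability. All values are invented for illustration.
TRANSITIONS = {
    "friendly": {
        "relaxed": {"relaxed": 0.9, "guarded": 0.1, "tense": 0.0},
        "guarded": {"relaxed": 0.6, "guarded": 0.3, "tense": 0.1},
        "tense":   {"relaxed": 0.2, "guarded": 0.6, "tense": 0.2},
    },
    "hostile": {
        "relaxed": {"relaxed": 0.3, "guarded": 0.5, "tense": 0.2},
        "guarded": {"relaxed": 0.1, "guarded": 0.4, "tense": 0.5},
        "tense":   {"relaxed": 0.0, "guarded": 0.2, "tense": 0.8},
    },
}

def next_state(current, player_input, rng):
    """Draw the next state from the row chosen by the current state and input."""
    row = TRANSITIONS[player_input][current]
    return rng.choices(list(row), weights=list(row.values()), k=1)[0]

def run(inputs, start="relaxed", seed=0):
    """Feed a sequence of player inputs through the machine; return the path."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    state = start
    path = [state]
    for player_input in inputs:
        state = next_state(state, player_input, rng)
        path.append(state)
    return path

print(run(["hostile", "hostile", "friendly"]))
```

    Because transitions are sampled rather than fixed, the same sequence of choices can produce slightly different mannerisms from one session to the next, while the probabilities still steer the character toward behavior consistent with the player’s inputs.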

    “Many virtual identity systems, such as avatars, accounts, profiles, and player characters, are not designed to serve the needs of people across diverse cultures. We have used statistical and AI methods in conjunction with qualitative approaches to learn where the gaps are,” they note. “Our project helps engender perspective transformation so that people will treat each other with respect and enhanced understanding across diverse cultural avatar representations.”

    Harrell and Yildirim’s work is part of the MIT IDSS’s Initiative on Combatting Systemic Racism (ICSR). Harrell is on the initiative’s steering committee and is the leader of the newly forming Antiracism, Games, and Immersive Media vertical, which studies behavior, cognition, social phenomena, and computational systems related to race and racism in video games and immersive experiences.

    The researchers’ latest project is part of the ICSR’s broader goal to launch and coordinate cross-disciplinary research that addresses racially discriminatory processes across American institutions. Using big data, members of the research initiative develop and employ computing tools that drive racial equity. Yildirim and Harrell accomplish this goal by depicting a frequent, problematic scenario that illustrates how bias creeps into our everyday lives.

    “In a post-9/11 world, Muslims often experience ethnic profiling in American airports. ‘On the Plane’ builds off of that type of in-group favoritism, a well-established finding in psychology,” says MIT Professor Fotini Christia, director of the Sociotechnical Systems Research Center (SSRC) and associate director of IDSS. “This game also takes a novel approach to analyzing hardwired bias by utilizing VR instead of field experiments to simulate prejudice. Excitingly, this research demonstrates that VR can be used as a tool to help us better measure bias, combating systemic racism and other forms of discrimination.”

    “On the Plane” was developed on the Unity game engine using the XR Interaction Toolkit and Harrell’s Chimeria platform for authoring interactive narratives that involve social categorization. The game will be deployed for research studies later this year on both desktop computers and the standalone, wireless Meta Quest headsets. A paper on the work was presented in December at the 2022 IEEE International Conference on Artificial Intelligence and Virtual Reality.