More stories

  • Democratizing education: Bringing MIT excellence to the masses

    How do you quantify the value of education or measure success? For the team behind the MIT Institute for Data, Systems, and Society’s (IDSS) MicroMasters Program in Statistics and Data Science (SDS), providing over 1,000 individuals from around the globe with access to MIT-level programming feels like a pretty good place to start. 

    Thanks to the MIT-conceived MicroMasters-style format, SDS faculty director Professor Devavrat Shah and his colleagues have eliminated the physical restrictions of a traditional brick-and-mortar education, giving 1,178 learners (and counting) from 89 countries access to an MIT education.

    “Taking classes from a Nobel Prize winner doesn’t happen every day,” says Oscar Vele, a strategic development worker for the town of Cuenca, Ecuador. “My dream has always been to study at MIT. I knew it was not easy — now, through this program, my dream came true.”

    “With an online forum, in principle, admission is no longer the gate — the merit is a gate,” says Shah. “If you take a class that is MIT-level, and if you perform at MIT-level, then you should get MIT-level credentials.”

    The MM SDS program, delivered in collaboration with MIT Open Learning, plays a key role in the IDSS mission of advancing education in data science, and supports MIT’s overarching belief that everyone should be able to access a quality education no matter what their life circumstances may be.

    “Getting a program like this up and running to the point where it has credentials and credibility across the globe is an important milestone for us,” says Shah. “Basically, for us, it says we are here to stay, and we are just getting started.”

    Since the program launched in 2018, Shah says he and his team have seen learners from all walks of life, from high-schoolers looking for a challenge to late-in-life learners looking to either evolve or refresh their knowledge.

    “Then there are individuals who want to prove to themselves that they can achieve serious knowledge and build a career,” Shah says. “Circumstances throughout their lives, whether it’s the country or socioeconomic conditions they’re born in, they have never had the opportunity to do something like this, and now they have an MIT-level education and credentials, which is a huge deal for them.”

    Many learners overcome challenges to complete the program, from financial hardship to balancing work, home life, and coursework to finding private, internet-enabled space for learning — not to mention the added complications of a global pandemic. One Ukrainian learner even finished the program after fleeing her apartment for a bomb shelter.

    Remapping the way to a graduate degree

    For Diogo da Silva Branco Magalhaes, a 44-year-old lifelong learner, curiosity and the desire to evolve within his current profession brought him to the MicroMasters program. Having spent 15 years working in the public transport sector, da Silva Branco Magalhaes had a very specific challenge at the front of his mind: artificial intelligence.

    “It’s not science fiction; it’s already here,” he says. “Think about autonomous vehicles, on-demand transportation, mobility as a service — AI and data, in particular, are the driving force of a number of disruptions that will affect my industry.”

    When he signed up for the MicroMasters Program in Statistics and Data Science, da Silva Branco Magalhaes said he had no long-term plans, but was taking a first step. “I just wanted to have a first contact with this reality, understand the basics, and then let’s see how it goes,” he says.

    Now, after earning his credentials in 2021, he finds himself a few weeks into an accelerated master’s program at Northwestern University, one of several graduate pathways supported by the MM SDS program.

    “I was really looking to gain some basic background knowledge; I didn’t expect the level of quality and depth they were able to provide in an online lecture format,” he says. “Having access to this kind of content — it’s a privilege, and now that we have it, we have to make the most of it.”

    A refreshing investment

    As an applied mathematician with 15 years of experience in the U.S. defense sector, Celia Wilson says she felt comfortable with her knowledge, though not 100 percent confident that her math skills could stand up against the next generation.

    “I felt I was getting left behind,” she says. “So I decided to take some time out and invest in myself, and this program was a great opportunity to systematize and refresh my knowledge of statistics and data science.”

    Since completing the course, Wilson says she has secured a new job as a director of data and analytics, where she is confident in her ability to manage a team of the “new breed of data scientists.” It turns out, however, that completing the program has given her an even greater gift than self-confidence.

    “Most importantly,” she adds, “it’s inspired my daughters to tell anyone who will listen that math is definitely for girls.”

    Connecting an engaged community

    Each course is connected to an online forum that allows learners to enhance their experience through real-time conversations with others in their cohort.

    “We have worked hard to provide a scalable version of the traditional teaching assistant support system that you would get in a usual on-campus class, with a great online forum for people to connect with each other as learners,” Shah says.

    David Khachatrian, a data scientist working on improving the drug discovery pipeline, says that leveraging the community to hone his ability to “think clearly and communicate effectively with others” mattered more than anything.

    “Take the opportunity to engage with your community of fellow learners and facilitators — answer questions for others to give back to the community, solidify your own understanding, and practice your ability to explain clearly,” Khachatrian says. “These skills and behaviors will help you to succeed not just in SDS, but wherever you go in the future.”

    “There were a lot of active contributions from a lot of learners and I felt it was really a very strong component of the course,” da Silva Branco Magalhaes adds. “I had some offline contact with other students who are connections that I’ve kept up with to this day.”

    A solid path forward

    “We have a dedicated team supporting the MM SDS community on the MIT side,” Shah says, citing the contributions of Karene Chu, MM SDS assistant director of education; Susana Kevorkova, the MM SDS program manager; and Jeremy Rossen, MM program coordinator. “They’ve done so much to ensure the success of the program and our learners, and they are constantly adding value to the program — like identifying real-time supplementary opportunities for learners to participate in, including the IDSS Policy Hackathon.”

    The program now holds online “graduation” ceremonies, where credential holders from all over the world share their experiences. Says Shah, who looks forward to celebrating the next 1,000 learners: “Every time I think about it, I feel emotional. It feels great, and it keeps us going.”

  • MIT community members elected to the National Academy of Engineering for 2023

    Seven MIT researchers are among the 106 new members and 18 international members elected to the National Academy of Engineering (NAE) this week. Fourteen additional MIT alumni, including one member of the MIT Corporation, were also elected as new members.

    One of the highest professional distinctions for engineers, membership in the NAE is given to individuals who have made outstanding contributions to “engineering research, practice, or education, including, where appropriate, significant contributions to the engineering literature” and to “the pioneering of new and developing fields of technology, making major advancements in traditional fields of engineering, or developing/implementing innovative approaches to engineering education.”

    The seven MIT researchers elected this year are:

    Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health in the Department of Electrical Engineering and Computer Science, principal investigator at the Computer Science and Artificial Intelligence Laboratory, and faculty lead for the MIT Abdul Latif Jameel Clinic for Machine Learning in Health, for machine learning models that understand structures in text, molecules, and medical images.

    Markus J. Buehler, the Jerry McAfee (1940) Professor in Engineering from the Department of Civil and Environmental Engineering, for implementing the use of nanomechanics to model and design fracture-resistant bioinspired materials.

    Elfatih A.B. Eltahir SM ’93, ScD ’93, the H.M. King Bhumibol Professor in the Department of Civil and Environmental Engineering, for advancing understanding of how climate and land use impact water availability, environmental and human health, and vector-borne diseases.

    Neil Gershenfeld, director of the Center for Bits and Atoms, for eliminating boundaries between digital and physical worlds, from quantum computing to digital materials to the internet of things.

    Roger D. Kamm SM ’73, PhD ’77, the Cecil and Ida Green Distinguished Professor of Biological and Mechanical Engineering, for contributions to the understanding of mechanics in biology and medicine, and leadership in biomechanics.

    David W. Miller ’82, SM ’85, ScD ’88, the Jerome C. Hunsaker Professor in the Department of Aeronautics and Astronautics, for contributions in control technology for space-based telescope design, and leadership in cross-agency guidance of space technology.

    David Simchi-Levi, professor of civil and environmental engineering, core faculty member in the Institute for Data, Systems, and Society, and principal investigator at the Laboratory for Information and Decision Systems, for contributions using optimization and stochastic modeling to enhance supply chain management and operations.

    Fariborz Maseeh ScD ’90, life member of the MIT Corporation and member of the School of Engineering Dean’s Advisory Council, was also elected as a member for leadership and advances in efficient design, development, and manufacturing of microelectromechanical systems, and for empowering engineering talent through public service.

    Thirteen additional alumni were elected to the National Academy of Engineering this year. They are: Mark George Allen SM ’86, PhD ’89; Shorya Awtar ScD ’04; Inderjit Chopra ScD ’77; David Huang ’85, SM ’89, PhD ’93; Eva Lerner-Lam SM ’78; David F. Merrion SM ’59; Virginia Norwood ’47; Martin Gerard Plys ’80, SM ’81, ScD ’84; Mark Prausnitz PhD ’94; Anil Kumar Sachdev ScD ’77; Christopher Scholz PhD ’67; Melody Ann Swartz PhD ’98; and Elias Towe ’80, SM ’81, PhD ’87.

    “I am delighted that seven members of MIT’s faculty and many members of the wider MIT community were elected to the National Academy of Engineering this year,” says Anantha Chandrakasan, the dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “My warmest congratulations on this recognition of their many contributions to engineering research and education.”

    Including this year’s inductees, 156 members of the National Academy of Engineering are current or retired members of the MIT faculty and staff, or members of the MIT Corporation.

  • Efficient technique improves machine-learning models’ reliability

    Powerful machine-learning models are being used to help people tackle tough problems such as identifying disease in medical images or detecting road obstacles for autonomous vehicles. But machine-learning models can make mistakes, so in high-stakes settings it’s critical that humans know when to trust a model’s predictions.

    Uncertainty quantification is one tool that improves a model’s reliability; the model produces a score along with each prediction that expresses how confident it is that the prediction is correct. While uncertainty quantification can be useful, existing methods typically require retraining the entire model to give it that ability. Training involves showing a model millions of examples so it can learn a task. Retraining then requires millions of new data inputs, which can be expensive and difficult to obtain, and also uses huge amounts of computing resources.

    Researchers at MIT and the MIT-IBM Watson AI Lab have now developed a technique that enables a model to perform more effective uncertainty quantification, while using far fewer computing resources than other methods, and no additional data. Their technique, which does not require a user to retrain or modify a model, is flexible enough for many applications.

    The technique involves creating a simpler companion model that assists the original machine-learning model in estimating uncertainty. This smaller model is designed to identify different types of uncertainty, which can help researchers drill down on the root cause of inaccurate predictions.

    “Uncertainty quantification is essential for both developers and users of machine-learning models. Developers can utilize uncertainty measurements to help develop more robust models, while for users, it can add another layer of trust and reliability when deploying models in the real world. Our work leads to a more flexible and practical solution for uncertainty quantification,” says Maohao Shen, an electrical engineering and computer science graduate student and lead author of a paper on this technique.

    Shen wrote the paper with Yuheng Bu, a former postdoc in the Research Laboratory of Electronics (RLE) who is now an assistant professor at the University of Florida; Prasanna Sattigeri, Soumya Ghosh, and Subhro Das, research staff members at the MIT-IBM Watson AI Lab; and senior author Gregory Wornell, the Sumitomo Professor in Engineering who leads the Signals, Information, and Algorithms Laboratory in RLE and is a member of the MIT-IBM Watson AI Lab. The research will be presented at the AAAI Conference on Artificial Intelligence.

    Quantifying uncertainty

    In uncertainty quantification, a machine-learning model generates a numerical score with each output to reflect its confidence in that prediction’s accuracy. Incorporating uncertainty quantification by building a new model from scratch or retraining an existing model typically requires a large amount of data and expensive computation, which is often impractical. What’s more, existing methods sometimes have the unintended consequence of degrading the quality of the model’s predictions.

    The MIT and MIT-IBM Watson AI Lab researchers have thus zeroed in on the following problem: Given a pretrained model, how can they enable it to perform effective uncertainty quantification?

    They solve this by creating a smaller and simpler model, known as a metamodel, that attaches to the larger, pretrained model and uses the features that the larger model has already learned to help it make uncertainty quantification assessments.

    “The metamodel can be applied to any pretrained model. It is better to have access to the internals of the model, because we can get much more information about the base model, but it will also work if you just have a final output. It can still predict a confidence score,” Sattigeri says.

    They design the metamodel to produce the uncertainty quantification output using a technique that includes both types of uncertainty: data uncertainty and model uncertainty. Data uncertainty is caused by corrupted data or inaccurate labels and can only be reduced by fixing the dataset or gathering new data. In model uncertainty, the model is not sure how to explain the newly observed data and might make incorrect predictions, most likely because it hasn’t seen enough similar training examples. This issue is an especially challenging but common problem when models are deployed. In real-world settings, they often encounter data that are different from the training dataset.

    “Has the reliability of your decisions changed when you use the model in a new setting? You want some way to have confidence in whether it is working in this new regime or whether you need to collect training data for this particular new setting,” Wornell says.
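
    To make the idea above concrete, here is a minimal, hypothetical sketch in PyTorch of a small metamodel that reads the frozen features of a pretrained base model and outputs separate scores for data and model uncertainty. It illustrates the general setup described in the article, not the authors’ actual architecture or training procedure; the network sizes and the sigmoid scoring are illustrative assumptions.

    ```python
    import torch
    import torch.nn as nn

    class MetaModel(nn.Module):
        """Small companion network that maps frozen base-model features to two
        uncertainty scores: data (aleatoric) and model (epistemic). This is an
        illustrative sketch, not the architecture from the paper."""

        def __init__(self, feature_dim: int, hidden_dim: int = 64):
            super().__init__()
            self.trunk = nn.Sequential(nn.Linear(feature_dim, hidden_dim), nn.ReLU())
            self.data_head = nn.Linear(hidden_dim, 1)
            self.model_head = nn.Linear(hidden_dim, 1)

        def forward(self, features: torch.Tensor):
            h = self.trunk(features)
            # Sigmoid keeps each score in [0, 1] for easy interpretation.
            return torch.sigmoid(self.data_head(h)), torch.sigmoid(self.model_head(h))

    # The pretrained base model stays frozen; only the metamodel would be trained.
    base = nn.Sequential(nn.Linear(32, 16), nn.ReLU())  # stand-in for a pretrained feature extractor
    for p in base.parameters():
        p.requires_grad = False

    meta = MetaModel(feature_dim=16)
    x = torch.randn(8, 32)             # a batch of 8 inputs
    with torch.no_grad():
        feats = base(x)
    data_unc, model_unc = meta(feats)  # one score of each type per input
    ```

    Because only the small metamodel has trainable parameters in this sketch, producing uncertainty scores would not require retraining the base model, which is the efficiency the article highlights.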

    Validating the quantification

    Once a model produces an uncertainty quantification score, the user still needs some assurance that the score itself is accurate. Researchers often validate accuracy by creating a smaller dataset, held out from the original training data, and then testing the model on the held-out data. However, this technique does not work well in measuring uncertainty quantification because the model can achieve good prediction accuracy while still being over-confident, Shen says.

    They created a new validation technique by adding noise to the data in the validation set — this noisy data is more like out-of-distribution data that can cause model uncertainty. The researchers use this noisy dataset to evaluate uncertainty quantifications.
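
    As a rough illustration of that validation idea, the sketch below perturbs a held-out validation set with Gaussian noise and compares average uncertainty scores on the clean and noisy copies. The noise model and the `uncertainty_score` callable are assumptions for illustration; the paper’s exact procedure is not reproduced here.

    ```python
    import numpy as np

    def noisy_copy(x_val: np.ndarray, noise_std: float = 0.1, seed: int = 0) -> np.ndarray:
        """Perturb held-out validation inputs with Gaussian noise so they look more
        like out-of-distribution data (an illustrative choice of noise model)."""
        rng = np.random.default_rng(seed)
        return x_val + rng.normal(scale=noise_std, size=x_val.shape)

    def compare_uncertainty(uncertainty_score, x_val: np.ndarray):
        """`uncertainty_score` is a hypothetical callable wrapping the base model plus
        metamodel; a well-behaved score should rise on the noisy copy of the data."""
        clean = float(np.mean(uncertainty_score(x_val)))
        noisy = float(np.mean(uncertainty_score(noisy_copy(x_val))))
        return clean, noisy
    ```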

    They tested their approach by seeing how well a metamodel could capture different types of uncertainty for various downstream tasks, including out-of-distribution detection and misclassification detection. Their method not only outperformed all the baselines in each downstream task but also required less training time to achieve those results.

    This technique could help researchers enable more machine-learning models to effectively perform uncertainty quantification, ultimately aiding users in making better decisions about when to trust predictions.

    Moving forward, the researchers want to adapt their technique for newer classes of models, such as large language models that have a different structure than a traditional neural network, Shen says.

    The work was funded, in part, by the MIT-IBM Watson AI Lab and the U.S. National Science Foundation.

  • When should data scientists try a new technique?

    If a scientist wanted to forecast ocean currents to understand how pollution travels after an oil spill, she could use a common approach that looks at currents traveling between 10 and 200 kilometers. Or, she could choose a newer model that also includes shorter currents. This might be more accurate, but it could also require learning new software or running new computational experiments. How to know if it will be worth the time, cost, and effort to use the new method?

    A new approach developed by MIT researchers could help data scientists answer this question, whether they are looking at statistics on ocean currents, violent crime, children’s reading ability, or any number of other types of datasets.

    The team created a new measure, known as the “c-value,” that helps users choose between techniques based on the chance that a new method is more accurate for a specific dataset. This measure answers the question “is it likely that the new method is more accurate for this data than the common approach?”

    Traditionally, statisticians compare methods by averaging a method’s accuracy across all possible datasets. But just because a new method is better for all datasets on average doesn’t mean it will actually provide a better estimate using one particular dataset. Averages are not application-specific.

    So, researchers from MIT and elsewhere created the c-value, which is a dataset-specific tool. A high c-value means it is unlikely a new method will be less accurate than the original method on a specific data problem.

    In their proof-of-concept paper, the researchers describe and evaluate the c-value using real-world data analysis problems: modeling ocean currents, estimating violent crime in neighborhoods, and approximating student reading ability at schools. They show how the c-value could help statisticians and data analysts achieve more accurate results by indicating when to use alternative estimation methods they otherwise might have ignored.

    “What we are trying to do with this particular work is come up with something that is data specific. The classical notion of risk is really natural for someone developing a new method. That person wants their method to work well for all of their users on average. But a user of a method wants something that will work on their individual problem. We’ve shown that the c-value is a very practical proof-of-concept in that direction,” says senior author Tamara Broderick, an associate professor in the Department of Electrical Engineering and Computer Science (EECS) and a member of the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society.

    She’s joined on the paper by Brian Trippe PhD ’22, a former graduate student in Broderick’s group who is now a postdoc at Columbia University; and Sameer Deshpande ’13, a former postdoc in Broderick’s group who is now an assistant professor at the University of Wisconsin at Madison. An accepted version of the paper is posted online in the Journal of the American Statistical Association.

    Evaluating estimators

    The c-value is designed to help with data problems in which researchers seek to estimate an unknown parameter using a dataset, such as estimating average student reading ability from a dataset of assessment results and student survey responses. A researcher has two estimation methods and must decide which to use for this particular problem.

    The better estimation method is the one that results in less “loss,” which means the estimate will be closer to the ground truth. Consider again the forecasting of ocean currents: Perhaps being off by a few meters per hour isn’t so bad, but being off by many kilometers per hour makes the estimate useless. The ground truth is unknown, though; the scientist is trying to estimate it. Therefore, one can never actually compute the loss of an estimate for their specific data. That’s what makes comparing estimates challenging. The c-value helps a scientist navigate this challenge.

    The c-value calculation uses the specific dataset twice: first to compute the estimate with each method, and then again to compute the c-value comparing the two. If the c-value is large, it is unlikely that the alternative method will be worse and yield less accurate estimates than the original method.

    “In our case, we are assuming that you conservatively want to stay with the default estimator, and you only want to go to the new estimator if you feel very confident about it. With a high c-value, it’s likely that the new estimate is more accurate. If you get a low c-value, you can’t say anything conclusive. You might have actually done better, but you just don’t know,” Broderick explains.
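
    The decision rule Broderick describes can be written down in a few lines. The sketch below captures only the conservative keep-or-switch logic from the article; the c-value computation itself is defined in the paper and is treated here as a given input, and the 0.95 threshold is an illustrative choice, not one taken from the work.

    ```python
    def choose_estimator(default_estimate, alternative_estimate, c_value: float,
                         threshold: float = 0.95):
        """Keep the default estimator unless the c-value gives high confidence that
        the alternative is not worse. The threshold is an illustrative choice."""
        if c_value >= threshold:
            return alternative_estimate  # high c-value: new method unlikely to be less accurate
        return default_estimate          # low c-value: inconclusive, stay with the default
    ```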

    Probing the theory

    The researchers put that theory to the test by evaluating three real-world data analysis problems.

    For one, they used the c-value to help determine which approach is best for modeling ocean currents, a problem Trippe has been tackling. Accurate models are important for predicting the dispersion of contaminants, like pollution from an oil spill. The team found that estimating ocean currents using multiple scales, one larger and one smaller, likely yields higher accuracy than using only larger scale measurements.

    “Oceans researchers are studying this, and the c-value can provide some statistical ‘oomph’ to support modeling the smaller scale,” Broderick says.

    In another example, the researchers sought to predict violent crime in census tracts in Philadelphia, an application Deshpande has been studying. Using the c-value, they found that one could get better estimates about violent crime rates by incorporating information about census-tract-level nonviolent crime into the analysis. They also used the c-value to show that additionally leveraging violent crime data from neighboring census tracts in the analysis isn’t likely to provide further accuracy improvements.

    “That doesn’t mean there isn’t an improvement, that just means that we don’t feel confident saying that you will get it,” she says.

    Now that they have proven the c-value in theory and shown how it could be used to tackle real-world data problems, the researchers want to expand the measure to more types of data and a wider set of model classes.

    The ultimate goal is to create a measure that is general enough for many more data analysis problems, and while there is still a lot of work to do to realize that objective, Broderick says this is an important and exciting first step in the right direction.

    This research was supported, in part, by an Advanced Research Projects Agency-Energy grant, a National Science Foundation CAREER Award, the Office of Naval Research, and the Wisconsin Alumni Research Foundation.

  • Putting clear bounds on uncertainty

    In science and technology, there has been a long and steady drive toward improving the accuracy of measurements of all kinds, along with parallel efforts to enhance the resolution of images. An accompanying goal is to reduce the uncertainty in the estimates that can be made, and the inferences drawn, from the data (visual or otherwise) that have been collected. Yet uncertainty can never be wholly eliminated. And since we have to live with it, at least to some extent, there is much to be gained by quantifying the uncertainty as precisely as possible.

    Expressed in other terms, we’d like to know just how uncertain our uncertainty is.

    That issue was taken up in a new study, led by Swami Sankaranarayanan, a postdoc at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), and his co-authors — Anastasios Angelopoulos and Stephen Bates of the University of California at Berkeley; Yaniv Romano of Technion, the Israel Institute of Technology; and Phillip Isola, an associate professor of electrical engineering and computer science at MIT. These researchers succeeded not only in obtaining accurate measures of uncertainty, they also found a way to display uncertainty in a manner the average person could grasp.

    Their paper, which was presented in December at the Neural Information Processing Systems Conference in New Orleans, relates to computer vision — a field of artificial intelligence that involves training computers to glean information from digital images. The focus of this research is on images that are partially smudged or corrupted (due to missing pixels), as well as on methods — computer algorithms, in particular — that are designed to uncover the part of the signal that is marred or otherwise concealed. An algorithm of this sort, Sankaranarayanan explains, “takes the blurred image as the input and gives you a clean image as the output” — a process that typically occurs in a couple of steps.

    First, there is an encoder, a kind of neural network specifically trained by the researchers for the task of de-blurring fuzzy images. The encoder takes a distorted image and, from that, creates an abstract (or “latent”) representation of a clean image in a form — consisting of a list of numbers — that is intelligible to a computer but would not make sense to most humans. The next step is a decoder, of which there are a couple of types, that are again usually neural networks. Sankaranarayanan and his colleagues worked with a kind of decoder called a “generative” model. In particular, they used an off-the-shelf version called StyleGAN, which takes the numbers from the encoded representation (of a cat, for instance) as its input and then constructs a complete, cleaned-up image (of that particular cat). So the entire process, including the encoding and decoding stages, yields a crisp picture from an originally muddied rendering.
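
    The two-stage pipeline described above can be summarized in a short, hypothetical sketch. The encoder and generator below are tiny placeholder networks standing in for a trained encoder and an off-the-shelf generative model such as StyleGAN; the image size and latent dimension are arbitrary assumptions meant only to show how the pieces fit together.

    ```python
    import torch
    import torch.nn as nn

    LATENT_DIM = 512

    encoder = nn.Sequential(            # corrupted image -> latent code
        nn.Flatten(),
        nn.Linear(3 * 64 * 64, LATENT_DIM),
    )

    generator = nn.Sequential(          # latent code -> restored image
        nn.Linear(LATENT_DIM, 3 * 64 * 64),
        nn.Sigmoid(),                   # pixel values in [0, 1]
        nn.Unflatten(1, (3, 64, 64)),
    )

    def restore(corrupted: torch.Tensor) -> torch.Tensor:
        """Two-step restoration: encode the blurred or corrupted image into a latent
        representation, then decode it into a clean image."""
        latent = encoder(corrupted)
        return generator(latent)

    clean = restore(torch.rand(1, 3, 64, 64))  # one 64x64 RGB image
    ```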

    But how much faith can someone place in the accuracy of the resultant image? And, as addressed in the December 2022 paper, what is the best way to represent the uncertainty in that image? The standard approach is to create a “saliency map,” which ascribes a probability value — somewhere between 0 and 1 — to indicate the confidence the model has in the correctness of every pixel, taken one at a time. This strategy has a drawback, according to Sankaranarayanan, “because the prediction is performed independently for each pixel. But meaningful objects occur within groups of pixels, not within an individual pixel,” he adds, which is why he and his colleagues are proposing an entirely different way of assessing uncertainty.

    Their approach is centered around the “semantic attributes” of an image — groups of pixels that, when taken together, have meaning, making up a human face, for example, or a dog, or some other recognizable thing. The objective, Sankaranarayanan maintains, “is to estimate uncertainty in a way that relates to the groupings of pixels that humans can readily interpret.”

    Whereas the standard method might yield a single image, constituting the “best guess” as to what the true picture should be, the uncertainty in that representation is normally hard to discern. The new paper argues that for use in the real world, uncertainty should be presented in a way that holds meaning for people who are not experts in machine learning. Rather than producing a single image, the authors have devised a procedure for generating a range of images — each of which might be correct. Moreover, they can set precise bounds on the range, or interval, and provide a probabilistic guarantee that the true depiction lies somewhere within that range. A narrower range can be provided if the user is comfortable with, say, 90 percent certitude, and a narrower range still if more risk is acceptable.

    The authors believe their paper puts forth the first algorithm, designed for a generative model, that can establish uncertainty intervals that relate to meaningful (semantically interpretable) features of an image and come with “a formal statistical guarantee.” While that is an important milestone, Sankaranarayanan considers it merely a step toward “the ultimate goal. So far, we have been able to do this for simple things, like restoring images of human faces or animals, but we want to extend this approach into more critical domains, such as medical imaging, where our ‘statistical guarantee’ could be especially important.”

    Suppose that the film, or radiograph, of a chest X-ray is blurred, he adds, “and you want to reconstruct the image. If you are given a range of images, you want to know that the true image is contained within that range, so you are not missing anything critical” — information that might reveal whether or not a patient has lung cancer or pneumonia. In fact, Sankaranarayanan and his colleagues have already begun working with a radiologist to see if their algorithm for predicting pneumonia could be useful in a clinical setting.

    Their work may also have relevance in the law enforcement field, he says. “The picture from a surveillance camera may be blurry, and you want to enhance that. Models for doing that already exist, but it is not easy to gauge the uncertainty. And you don’t want to make a mistake in a life-or-death situation.” The tools that he and his colleagues are developing could help identify a guilty person and help exonerate an innocent one as well.

    Much of what we do and many of the things happening in the world around us are shrouded in uncertainty, Sankaranarayanan notes. Therefore, gaining a firmer grasp of that uncertainty could help us in countless ways. For one thing, it can tell us more about exactly what it is we do not know.

    Angelopoulos was supported by the National Science Foundation. Bates was supported by the Foundations of Data Science Institute and the Simons Institute. Romano was supported by the Israel Science Foundation and by a Career Advancement Fellowship from Technion. Sankaranarayanan’s and Isola’s research for this project was sponsored by the U.S. Air Force Research Laboratory and the U.S. Air Force Artificial Intelligence Accelerator and was accomplished under Cooperative Agreement Number FA8750-19-2-1000. MIT SuperCloud and the Lincoln Laboratory Supercomputing Center also provided computing resources that contributed to the results reported in this work.

  • Research, education, and connection in the face of war

    When Russian forces invaded Ukraine in February 2022, Tetiana Herasymova had several decisions to make: What should she do, where should she live, and should she take her MITx MicroMasters capstone exams? She had registered for the Statistics and Data Science Program’s final exams just days prior to moving out of her apartment and into a bomb shelter. Although it was difficult to focus on studying and preparations with air horns sounding overhead and uncertainty lingering around her, she was determined to try. “I wouldn’t let the aggressor in the war squash my dreams,” she says.

    A love of research and the desire to improve teaching 

    An early love of solving puzzles and problems for fun piqued Herasymova’s initial interest in mathematics. When she later pursued her PhD in mathematics at Kiev National Taras Shevchenko University, Herasymova’s love of math evolved into a love of research. Throughout Herasymova’s career, she’s worked to close the gap between scientific researchers and educators. Starting as a math tutor at MBA Strategy, a company that prepares Ukrainian leaders for qualifying standardized tests for MBA programs, she was later promoted to head of its test preparation department. Afterward, she moved on to an equivalent position at ZNOUA, a new project that prepared high school students for Ukraine’s standardized test, and she eventually became ZNOUA’s CEO.

    In 2018, she founded Prosteer, a “self-learning community” of educators who share research, pedagogy, and experience to learn from one another. “It’s really interesting to have a community of teachers from different domains,” she says, speaking of educators and researchers whose specialties range across language, mathematics, physics, music, and more.

    Implementing new pedagogical research in the classroom is often up to educators who seek out studies on an individual basis, Herasymova has found. “Lots of scientists are not practitioners,” she says, and the reverse is also true. She only became more determined to build these connections once she was promoted to head of test preparation at MBA Strategy because she wanted to share more effective pedagogy with the tutors she was mentoring.

    First, Herasymova knew she needed a way to measure the teachers’ effectiveness. She was able to determine whether students who received the company’s tutoring services improved their scores. Moreover, Ukraine keeps an open-access database of national standardized test scores, so anyone could analyze the data in hopes of improving the level of education in the country. She says, “I could do some analytics because I am a mathematician, but I knew I could do much more with this data if I knew data science and machine learning knowledge.”

    That’s why Herasymova sought out the MITx MicroMasters Program in Statistics and Data Science offered by the MIT Institute for Data, Systems, and Society (IDSS). “I wanted to learn the fundamentals so I could join the Learning Analytics domain,” she says. She was looking for a comprehensive program that covered the foundations without being overly basic. “I had some knowledge from the ground, so I could see the deepness of that course,” she says. Because of her background as an instructional designer, she thought the MicroMasters curriculum was well-constructed, calling the variety of videos, practice problems, and homework assignments that encouraged learners to approach the course material in different ways, “a perfect experience.”

    Another benefit of the MicroMasters program was its online format. “I had my usual work, so it was impossible to study in a stationary way,” she says. She found the structure to be more flexible than other programs. “It’s really great that you can construct your course schedule your own way, especially with your own adult life,” she says.

    Determination and support in the midst of war

    When the war first forced Herasymova to flee her apartment, she had already registered to take the exams for her four courses. “It was quite hard to prepare for exams when you could hear explosions outside of the bomb shelter,” she says. She and other Ukrainians were invited to postpone their exams until the following session, but the next available testing period wouldn’t be held until October. “It was a hard decision, but I had to allow myself to try,” she says. “For all people in Ukraine, when you don’t know if you’re going to live or die, you try to live in the now. You have to appreciate every moment and what life brings to you. You don’t say, ‘Someday’ — you do it today or tomorrow.”

    In addition to emotional support from her boyfriend, Herasymova had a group of friends who had also enrolled in the program, and they supported each other through study sessions and an ongoing chat. Herasymova’s personal support network helped her accomplish what she set out to do with her MicroMasters program, and in turn, she was able to support her professional network. While Prosteer halted its regular work during the early stages of the war, Herasymova was determined to support the community of educators and scientists that she had built. They continued meeting weekly to exchange ideas as usual. “It’s intrinsic motivation,” she says. They managed to restore all of their activities by October.

    Despite the factors stacked against her, Herasymova’s determination paid off — she passed all of her exams in May, the final step to earning her MicroMasters certificate in statistics and data science. “I just couldn’t believe it,” she says. “It was definitely a bifurcation point. The moment when you realize that you have something to rely on, and that life is just beginning to show all its diversity despite the fact that you live in war.” With her newly minted certificate in hand, Herasymova has continued her research on the effectiveness of educational models — analyzing the data herself — with a summer research program at New York University. 

    The student becomes the master

    After moving seven times between February and October, heading west from Kyiv until most recently settling near the border of Poland, Herasymova hopes she’s moved for the last time. Ukrainian Catholic University offered her a position teaching both mathematics and programming. Before enrolling in the MicroMasters Program in Statistics and Data Science, she had some prior knowledge of programming languages and mathematical algorithms, but she didn’t know Python. She took MITx’s Introduction to Computer Science and Programming Using Python to prepare. “It gave me a huge step forward,” she says. “I learned a lot. Now, not only can I work with machine learning models in the Python and R programming languages, I also have knowledge of the big picture of the purpose and the point of doing so.”

    In addition to the skills the MicroMasters Program trained her in, she gained firsthand experience in learning new subjects and exploring topics more deeply. She will be sharing that practice with the community of students and teachers she’s built, and she plans to guide them through the course during the next year. As a continuation of her own educational growth, she says she’s looking forward to her next MITx course this year, Data Analysis.

    Herasymova advises that the best way to keep progressing is investing a lot of time. “Adults don’t want to hear this, but you need one or two years,” she says. “Allow yourself to be stupid. If you’re an expert in one domain and want to switch to another, or if you want to understand something new, a lot of people don’t ask questions or don’t ask for help. But from this point, if I don’t know something, I know I should ask for help because that’s the start of learning. With a fixed mindset, you won’t grow.”

    July 2022 MicroMasters Program Joint Completion Celebration. Ukrainian student Tetiana Herasymova, who completed her program amid war in her home country, speaks at 43:55.

  • Gaining real-world industry experience through Break Through Tech AI at MIT

    Taking what they learned conceptually about artificial intelligence and machine learning (ML) this year, students from across the Greater Boston area had the opportunity to apply their new skills to real-world industry projects as part of an experiential learning opportunity offered through Break Through Tech AI at MIT.

    Hosted by the MIT Schwarzman College of Computing, Break Through Tech AI is a pilot program that aims to bridge the talent gap for women and underrepresented genders in computing fields by providing skills-based training, industry-relevant portfolios, and mentoring to undergraduate students in regional metropolitan areas in order to position them more competitively for careers in data science, machine learning, and artificial intelligence.

    “Programs like Break Through Tech AI give us opportunities to connect with other students and other institutions, and allow us to bring MIT’s values of diversity, equity, and inclusion to the learning and application in the spaces that we hold,” says Alana Anderson, assistant dean of diversity, equity, and inclusion for the MIT Schwarzman College of Computing.

    The inaugural cohort of 33 undergraduates from 18 Greater Boston-area schools, including Salem State University, Smith College, and Brandeis University, began the free, 18-month program last summer with an eight-week, online skills-based course to learn the basics of AI and machine learning. Students then split into small groups in the fall to collaborate on six machine learning challenge projects presented to them by MathWorks, MIT-IBM Watson AI Lab, and Replicate. The students dedicated five hours or more each week to meet with their teams, teaching assistants, and project advisors, including convening once a month at MIT, while juggling their regular academic course load with other daily activities and responsibilities.

    The challenges gave the undergraduates the chance to help contribute to actual projects that industry organizations are working on and to put their machine learning skills to the test. Members from each organization also served as project advisors, providing encouragement and guidance to the teams throughout.

    “Students are gaining industry experience by working closely with their project advisors,” says Aude Oliva, director of strategic industry engagement at the MIT Schwarzman College of Computing and the MIT director of the MIT-IBM Watson AI Lab. “These projects will be an add-on to their machine learning portfolio that they can share as a work example when they’re ready to apply for a job in AI.”

    Over the course of 15 weeks, teams delved into large-scale, real-world datasets to train, test, and evaluate machine learning models in a variety of contexts.

    In December, the students celebrated the fruits of their labor at a showcase event held at MIT in which the six teams gave final presentations on their AI projects. The projects not only allowed the students to build up their AI and machine learning experience, they also helped to “improve their knowledge base and skills in presenting their work to both technical and nontechnical audiences,” Oliva says.

    For a project on traffic data analysis, students got trained on MATLAB, a programming and numeric computing platform developed by MathWorks, to create a model that enables decision-making in autonomous driving by predicting future vehicle trajectories. “It’s important to realize that AI is not that intelligent. It’s only as smart as you make it and that’s exactly what we tried to do,” said Brandeis University student Srishti Nautiyal as she introduced her team’s project to the audience. With companies already making autonomous vehicles from planes to trucks a reality, Nautiyal, a physics and mathematics major, shared that her team was also highly motivated to consider the ethical issues of the technology in their model for the safety of passengers, drivers, and pedestrians.

    Using census data to train a model can be tricky because such data are often messy and full of holes. In a project on algorithmic fairness for the MIT-IBM Watson AI Lab, the hardest task for the team was cleaning up mountains of unorganized data in a way that still let them gain insights from it. The project — which aimed to create a demonstration of fairness applied to a real dataset, evaluating and comparing the effectiveness of different fairness interventions and fair metric learning techniques — could eventually serve as an educational resource for data scientists interested in learning about fairness in AI and using it in their work, as well as a way to promote the practice of evaluating the ethical implications of machine learning models in industry.

    Other challenge projects included an ML-assisted whiteboard that lets nontechnical people interact with ready-made machine learning models, and a sign language recognition model to help disabled people communicate with others. A team that worked on a visual language app set out to include over 50 languages in their model to increase access for the millions of people throughout the world who are visually impaired. According to the team, similar apps on the market currently offer only up to 23 languages.

    Throughout the semester, students persisted and demonstrated grit in order to cross the finish line on their projects. With the final presentations marking the conclusion of the fall semester, students will return to MIT in the spring to continue their Break Through Tech AI journey to tackle another round of AI projects. This time, the students will work with Google on new machine learning challenges that will enable them to hone their AI skills even further with an eye toward launching a successful career in AI.

  • Q&A: A fresh look at data science

    As the leaders of a developing field, data scientists must often deal with a frustratingly slippery question: What is data science, precisely, and what is it good for?

    Alfred Spector is a visiting scholar in the MIT Department of Electrical Engineering and Computer Science (EECS), an influential developer of distributed computing systems and applications, and a successful tech executive with companies including IBM and Google. Along with three co-authors — Peter Norvig at Stanford University and Google, Chris Wiggins at Columbia University and The New York Times, and Jeannette M. Wing at Columbia — Spector recently published “Data Science in Context: Foundations, Challenges, Opportunities” (Cambridge University Press), which provides a broad, conversational overview of the wide-ranging field driving change in sectors ranging from health care to transportation to commerce to entertainment. 

    Here, Spector talks about data-driven life, what makes a good data scientist, and how his book came together during the height of the Covid-19 pandemic.

    Q: One of the most common buzzwords Americans hear is “data-driven,” but many might not know what that term is supposed to mean. Can you unpack it for us?

    A: Data-driven broadly refers to techniques or algorithms powered by data — they either provide insight or reach conclusions, say, a recommendation or a prediction. The algorithms power models which are increasingly woven into the fabric of science, commerce, and life, and they often provide excellent results. The list of their successes is really too long to even begin to list. However, one concern is that the proliferation of data makes it easy for us as students, scientists, or just members of the public to jump to erroneous conclusions. As just one example, our own confirmation biases make us prone to believing some data elements or insights “prove” something we already believe to be true. Additionally, we often tend to see causal relationships where the data only shows correlation. It might seem paradoxical, but data science makes critical reading and analysis of data all the more important.

    Q: What, to your mind, makes a good data scientist?

    A: [In talking to students and colleagues] I optimistically emphasize the power of data science and the importance of gaining the computational, statistical, and machine learning skills to apply it. But, I also remind students that we are obligated to solve problems well. In our book, Chris [Wiggins] paraphrases danah boyd, who says that a successful application of data science is not one that merely meets some technical goal, but one that actually improves lives. More specifically, I exhort practitioners to provide a real solution to problems, or else clearly identify what we are not solving so that people see the limitations of our work. We should be extremely clear so that we do not generate harmful results or lead others to erroneous conclusions. I also remind people that all of us, including scientists and engineers, are human and subject to the same human foibles as everyone else, such as various biases. 

    Q: You discuss Covid-19 in your book. While some short-range models for mortality were very accurate during the heart of the pandemic, you note the failure of long-range models to predict any of 2020’s four major geotemporal Covid waves in the United States. Do you feel Covid was a uniquely hard situation to model? 

    A: Covid was particularly difficult to predict over the long term because of many factors — the virus was changing, human behavior was changing, political entities changed their minds. Also, we didn’t have fine-grained mobility data (perhaps, for good reasons), and we lacked sufficient scientific understanding of the virus, particularly in the first year.

    I think there are many other domains which are similarly difficult. Our book teases out many reasons why data-driven models may not be applicable. Perhaps it’s too difficult to get or hold the necessary data. Perhaps the past doesn’t predict the future. If data models are being used in life-and-death situations, we may not be able to make them sufficiently dependable; this is particularly true as we’ve seen all the motivations that bad actors have to find vulnerabilities. So, as we continue to apply data science, we need to think through all the requirements we have, and the capability of the field to meet them. They often align, but not always. And, as data science seeks to solve problems in ever more important areas such as human health, education, transportation safety, etc., there will be many challenges.

    Q: Let’s talk about the power of good visualization. You mention the popular, early 2000’s Baby Name Voyager website as one that changed your view on the importance of data visualization. Tell us how that happened. 

    A: That website, recently reborn as the Name Grapher, had two characteristics that I thought were brilliant. First, it had a really natural interface, where you type the initial characters of a name and it shows a frequency graph of all the names beginning with those letters, and their popularity over time. Second, it’s so much better than a spreadsheet with 140 columns representing years and rows representing names, despite the fact it contains no extra information. It also provided instantaneous feedback with its display graph dynamically changing as you type. To me, this showed the power of a very simple transformation that is done correctly.
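
    For readers who want to experiment with the same idea, a rough sketch of that interaction in Python might look like the following. It assumes a hypothetical CSV of name counts by year; the file name and column names are placeholders, not the data behind the original site.

    ```python
    import pandas as pd
    import matplotlib.pyplot as plt

    # Hypothetical data file with one row per (name, year) pair and a count column.
    names = pd.read_csv("baby_names.csv")  # columns: name, year, count

    def plot_prefix(prefix: str) -> None:
        """Recreate the core interaction: show frequency over time for every
        name that begins with the typed prefix."""
        subset = names[names["name"].str.startswith(prefix)]
        pivot = subset.pivot_table(index="year", columns="name",
                                   values="count", aggfunc="sum")
        pivot.plot(figsize=(8, 4), title=f"Names beginning with '{prefix}'")
        plt.ylabel("babies per year")
        plt.show()

    plot_prefix("Jo")
    ```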

    Q: When you and your co-authors began planning “Data Science In Context,” what did you hope to offer?

    A: We portray present data science as a field that’s already had enormous benefits, that provides even more future opportunities, but one that requires equally enormous care in its use. Referencing the word “context” in the title, we explain that the proper use of data science must consider the specifics of the application, the laws and norms of the society in which the application is used, and even the time period of its deployment. And, importantly for an MIT audience, the practice of data science must go beyond just the data and the model to the careful consideration of an application’s objectives, its security, privacy, abuse, and resilience risks, and even the understandability it conveys to humans. Within this expansive notion of context, we finally explain that data scientists must also carefully consider ethical trade-offs and societal implications.

    Q: How did you keep focus throughout the process?

    A: Much like in open-source projects, I played both the coordinating author role and also the role of overall librarian of all the material, but we all made significant contributions. Chris Wiggins is very knowledgeable on the Belmont principles and applied ethics; he was the major contributor of those sections. Peter Norvig, as the coauthor of a bestselling AI textbook, was particularly involved in the sections on building models and causality. Jeannette Wing worked with me very closely on our seven-element Analysis Rubric and recognized that a checklist for data science practitioners would end up being one of our book’s most important contributions. 

    From a nuts-and-bolts perspective, we wrote the book during Covid, using one large shared Google doc with weekly video conferences. Amazingly enough, Chris, Jeannette, and I didn’t meet in person at all, and Peter and I met only once — sitting outdoors on a wooden bench on the Stanford campus.

    Q: That is an unusual way to write a book! Do you recommend it?

    A: It would be nice to have had more social interaction, but a shared document, at least with a coordinating author, worked pretty well for something up to this size. The benefit is that we always had a single, coherent textual base, not dissimilar to how a programming team works together.

    This is a condensed, edited version of a longer interview that originally appeared on the MIT EECS website.