More stories

  • in

    Large language models are biased. Can logic help save them?

    Turns out, even language models “think” they’re biased. When prompted in ChatGPT, the response was as follows: “Yes, language models can have biases, because the training data reflects the biases present in society from which that data was collected. For example, gender and racial biases are prevalent in many real-world datasets, and if a language model is trained on that, it can perpetuate and amplify these biases in its predictions.” A well-known but dangerous problem. 

    Humans (typically) can dabble with both logical and stereotypical reasoning when learning. Still, language models mainly mimic the latter, an unfortunate narrative we’ve seen play out ad nauseam when the ability to employ reasoning and critical thinking is absent. So would injecting logic into the fray be enough to mitigate such behavior? 

    Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) had an inkling that it might, so they set off to examine whether logic-aware language models could avoid the more harmful stereotypes. They trained a language model to predict the relationship between two sentences, based on context and semantic meaning, using a dataset with labels for text snippets detailing if a second phrase “entails,” “contradicts,” or is neutral with respect to the first one. Using this natural language inference dataset, they found that the newly trained models were significantly less biased than other baselines, without any extra data, data editing, or additional training algorithms.

    For example, with the premise “the person is a doctor” and the hypothesis “the person is masculine,” using these logic-trained models, the relationship would be classified as “neutral,” since there’s no logic that says the person is a man. With more common language models, two sentences might seem to be correlated due to some bias in training data, like “doctor” might be pinged with “masculine,” even when there’s no evidence that the statement is true. 

    At this point, the omnipresent nature of language models is well-known: Applications in natural language processing, speech recognition, conversational AI, and generative tasks abound. While this is not a nascent field of research, growing pains can take a front seat as the models increase in complexity and capability.

    “Current language models suffer from issues with fairness, computational resources, and privacy,” says MIT CSAIL postdoc Hongyin Luo, the lead author of a new paper about the work. “Many estimates say that the CO2 emission of training a language model can be higher than the lifelong emission of a car. Running these large language models is also very expensive because of the amount of parameters and the computational resources they need. With privacy, state-of-the-art language models developed by places like ChatGPT or GPT-3 have their APIs where you must upload your language, but there’s no place for sensitive information regarding things like health care or finance. To solve these challenges, we proposed a logical language model that we qualitatively measured as fair, is 500 times smaller than the state-of-the-art models, can be deployed locally, and with no human-annotated training samples for downstream tasks. Our model uses 1/400 the parameters compared with the largest language models, has better performance on some tasks, and significantly saves computation resources.” 

    This model, which has 350 million parameters, outperformed some very large-scale language models with 100 billion parameters on logic-language understanding tasks. The team compared, for example, popular BERT pretrained language models with their “textual entailment” ones on stereotype, profession, and emotion bias tests. The latter outperformed other models with significantly lower bias, while preserving the language modeling ability. The “fairness” was evaluated with something called ideal context association (iCAT) tests, where higher iCAT scores mean fewer stereotypes. The model had higher than 90 percent iCAT scores, while other strong language understanding models ranged between 40 and 80.
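
    To make the iCAT score concrete, here is a minimal sketch. It assumes the score follows the definition used by the StereoSet benchmark, which combines a language modeling score with a stereotype score; the article itself does not give the formula, so treat the function and its parameter names as illustrative assumptions.

```python
def icat(lm_score, stereotype_score):
    """Assumed StereoSet-style iCAT score, on a 0-100 scale.

    lm_score: percentage of cases where the model prefers a meaningful
        continuation over a meaningless one (100 is best).
    stereotype_score: percentage of cases where the model prefers the
        stereotypical option over the anti-stereotypical one (50 is unbiased).
    """
    if not (0 <= lm_score <= 100 and 0 <= stereotype_score <= 100):
        raise ValueError("scores must be percentages between 0 and 100")
    # The score is highest when language modeling is strong and the
    # stereotype score sits at the unbiased midpoint of 50.
    return lm_score * min(stereotype_score, 100 - stereotype_score) / 50


ideal_model = icat(100, 50)              # strong and unbiased
fully_stereotyped_model = icat(100, 100)  # strong but always stereotypical
```

    Under this assumed definition, a model can only score high by being both a good language model and close to unbiased, which matches the article's reading that higher iCAT scores mean fewer stereotypes.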

    Luo wrote the paper alongside MIT Senior Research Scientist James Glass. They will present the work at the Conference of the European Chapter of the Association for Computational Linguistics in Croatia. 

    Unsurprisingly, the original pretrained language models the team examined were teeming with bias, confirmed by a slew of reasoning tests demonstrating how professional and emotion terms are significantly biased to the feminine or masculine words in the gender vocabulary. 

    With professions, a language model (which is biased) thinks that “flight attendant,” “secretary,” and “physician’s assistant” are feminine jobs, while “fisherman,” “lawyer,” and “judge” are masculine. Concerning emotions, a language model thinks that “anxious,” “depressed,” and “devastated” are feminine.

    While we may still be far away from a neutral language model utopia, this research is ongoing in that pursuit. Currently, the model is just for language understanding, so it’s based on reasoning among existing sentences. Unfortunately, it can’t generate sentences for now, so the next step for the researchers would be targeting the uber-popular generative models built with logical learning to ensure more fairness with computational efficiency. 

    “Although stereotypical reasoning is a natural part of human recognition, fairness-aware people conduct reasoning with logic rather than stereotypes when necessary,” says Luo. “We show that language models have similar properties. A language model without explicit logic learning makes plenty of biased reasoning, but adding logic learning can significantly mitigate such behavior. Furthermore, with demonstrated robust zero-shot adaptation ability, the model can be directly deployed to different tasks with more fairness, privacy, and better speed.”

  • in

    Report: CHIPS Act just the first step in addressing threats to US leadership in advanced computing

    When Liu He, a Chinese economist, politician, and “chip czar,” was tapped to lead the charge in a chipmaking arms race with the United States, his message lingered in the air, leaving behind a dewy glaze of tension: “For our country, technology is not just for growth… it is a matter of survival.”

    Once upon a time, the United States’ early technological prowess positioned the nation to outpace foreign rivals and cultivate a competitive advantage for domestic businesses. Yet, 30 years later, America’s lead in advanced computing is continuing to wane. What happened?

    A new report from an MIT researcher and two colleagues sheds light on the decline in U.S. leadership. The scientists looked at high-level measures to examine the shrinkage: overall capabilities, supercomputers, applied algorithms, and semiconductor manufacturing. Through their analysis, they found that not only has China closed the computing gap with the U.S., but nearly 80 percent of American leaders in the field believe that their Chinese competitors are improving capabilities faster — which, the team says, suggests a “broad threat to U.S. competitiveness.”

    To delve deeply into the fray, the scientists conducted the Advanced Computing Users Survey, sampling 120 top-tier organizations, including universities, national labs, federal agencies, and industry. The team estimates that this group comprises between one-third and one-half of all the most significant computing users in the United States.

    “Advanced computing is crucial to scientific improvement, economic growth and the competitiveness of U.S. companies,” says Neil Thompson, director of the FutureTech Research Project at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), who helped lead the study.

    Thompson, who is also a principal investigator at MIT’s Initiative on the Digital Economy, wrote the paper with Chad Evans, executive vice president and secretary and treasurer to the board at the Council on Competitiveness, and Daniel Armbrust, who is the co-founder, initial CEO, and member of the board of directors at Silicon Catalyst and former president of SEMATECH, the semiconductor consortium that developed industry roadmaps.

    The semiconductor, supercomputer, and algorithm bonanza

    Supercomputers, the room-sized “giant calculators” of the hardware world, are an industry no longer dominated by the United States. Through 2015, about half of the most powerful computers were sitting firmly in the U.S., and China was growing slowly from a very low base. But in the past six years, China has swiftly caught up, reaching near parity with America.

    This disappearing lead matters. Eighty-four percent of U.S. survey respondents said they’re computationally constrained in running essential programs. “This result was telling, given who our respondents are: the vanguard of American research enterprises and academic institutions with privileged access to advanced national supercomputing resources,” says Thompson. 

    With regard to advanced algorithms, the U.S. has historically led the charge, with two-thirds of all significant improvements coming from U.S.-born inventors. But in recent decades, U.S. dominance in algorithms has relied on bringing in foreign talent to work in the U.S., which the researchers say is now in jeopardy. China has outpaced the U.S. and many other countries in churning out PhDs in STEM fields since 2007, with one report postulating a near future (2025) where China will be home to nearly twice as many PhDs as the U.S. China’s rise in algorithms can also be seen with the “Gordon Bell Prize,” an achievement for outstanding work in harnessing the power of supercomputers in varied applications. U.S. winners historically dominated the prize, but China has equaled or surpassed Americans’ performance in the past five years.

    While the researchers note the CHIPS and Science Act of 2022 is a critical step in re-establishing the foundation of success for advanced computing, they propose recommendations to the U.S. Office of Science and Technology Policy. 

    First, they suggest democratizing access to U.S. supercomputing by building more mid-tier systems that push boundaries for many users, as well as building tools so users scaling up computations can have less up-front resource investment. They also recommend increasing the pool of innovators by funding the training of many more electrical engineers and computer scientists, with longer-term U.S. residency incentives and scholarships. Finally, in addition to this new framework, the scientists urge taking advantage of what already exists by giving the private sector access to experimentation with high-performance computing through supercomputing sites in academia and national labs.

    All that and a bag of chips

    Computing improvements depend on continuous advances in transistor density and performance, but creating robust, new chips necessitates a harmonious blend of design and manufacturing.

    Until the past six years, China was not known for designing noteworthy chips. In fact, over the past five decades, the U.S. designed most of them. But this changed when China created the HiSilicon Kirin 9000, propelling itself to the international frontier. This success was mainly obtained through partnerships with leading global chip designers that began in the 2000s. China now has 14 companies among the world’s top 50 fabless designers. A decade ago, there was only one.

    Competitive semiconductor manufacturing has been more mixed: U.S.-led policies and internal execution issues have slowed China’s rise, but as of July 2022, the Semiconductor Manufacturing International Corporation (SMIC) has shown evidence of 7 nanometer logic, which was not expected until much later. However, with export restrictions on extreme ultraviolet lithography equipment, progress below 7 nm would require expensive domestic technology development. Currently, China is only at parity or better in two out of 12 segments of the semiconductor supply chain. Still, with government policy and investments, the team expects a whopping increase to seven segments in 10 years. So, for the moment, the U.S. retains leadership in hardware manufacturing, but with fewer dimensions of advantage.

    The authors recommend that the White House Office of Science and Technology Policy work with key national agencies, such as the U.S. Department of Defense, U.S. Department of Energy, and the National Science Foundation, to define initiatives to build the hardware and software systems needed for important computing paradigms and workloads critical for economic and security goals. “It is crucial that American enterprises can get the benefit of faster computers,” says Thompson. “With Moore’s Law slowing down, the best way to do this is to create a portfolio of specialized chips (or “accelerators”) that are customized to our needs.”

    The scientists further believe that to lead the next generation of computing, four areas must be addressed. First, by issuing grand challenges to the CHIPS Act National Semiconductor Technology Center, researchers and startups would be motivated to invest in research and development and to seek startup capital for new technologies in areas such as spintronics, neuromorphics, optical and quantum computing, and optical interconnect fabrics. Second, by supporting allies in passing similar acts, overall investment in these technologies would increase, and supply chains would become more aligned and secure. Third, establishing test beds for researchers to test algorithms on new computing architectures and hardware would provide an essential platform for innovation and discovery. Finally, planning for post-exascale systems that achieve higher levels of performance through next-generation advances would ensure that current commercial technologies don’t limit future computing systems.

    “The advanced computing landscape is in rapid flux — technologically, economically, and politically, with both new opportunities for innovation and rising global rivalries,” says Daniel Reed, Presidential Professor and professor of computer science and electrical and computer engineering at the University of Utah. “The transformational insights from both deep learning and computational modeling depend on both continued semiconductor advances and their instantiation in leading edge, large-scale computing systems — hyperscale clouds and high-performance computing systems. Although the U.S. has historically led the world in both advanced semiconductors and high-performance computing, other nations have recognized that these capabilities are integral to 21st century economic competitiveness and national security, and they are investing heavily.”

    The research was funded, in part, through Thompson’s grant from Good Ventures, which supports his FutureTech Research Group. The paper is being published by the Georgetown Public Policy Review.

  • in

    A new way for quantum computing systems to keep their cool

    Heat causes errors in the qubits that are the building blocks of a quantum computer, so quantum systems are typically kept inside refrigerators that keep the temperature just above absolute zero (-459 degrees Fahrenheit).

    But quantum computers need to communicate with electronics outside the refrigerator, in a room-temperature environment. The metal cables that connect these electronics bring heat into the refrigerator, which has to work even harder and draw extra power to keep the system cold. Plus, more qubits require more cables, so the size of a quantum system is limited by how much heat the fridge can remove.

    To overcome this challenge, an interdisciplinary team of MIT researchers has developed a wireless communication system that enables a quantum computer to send and receive data to and from electronics outside the refrigerator using high-speed terahertz waves.

    A transceiver chip placed inside the fridge can receive and transmit data. Terahertz waves generated outside the refrigerator are beamed in through a glass window. Data encoded onto these waves can be received by the chip. That chip also acts as a mirror, delivering data from the qubits on the terahertz waves it reflects to their source.

    This reflection process also bounces back much of the power sent into the fridge, so the process generates only a minimal amount of heat. The contactless communication system consumes up to 10 times less power than systems with metal cables.

    “By having this reflection mode, you really save the power consumption inside the fridge and leave all those dirty jobs on the outside. While this is still just a preliminary prototype and we have some room to improve, even at this point, we have shown low power consumption inside the fridge that is already better than metallic cables. I believe this could be a way to build large-scale quantum systems,” says senior author Ruonan Han, an associate professor in the Department of Electrical Engineering and Computer Science (EECS) who leads the Terahertz Integrated Electronics Group.

    Han and his team, with expertise in terahertz waves and electronic devices, joined forces with associate professor Dirk Englund and the Quantum Photonics Laboratory team, who provided quantum engineering expertise and joined in conducting the cryogenic experiments.

    Joining Han and Englund on the paper are first author and EECS graduate student Jinchen Wang; Mohamed Ibrahim PhD ’21; Isaac Harris, a graduate student in the Quantum Photonics Laboratory; Nathan M. Monroe PhD ’22; Wasiq Khan PhD ’22; and Xiang Yi, a former postdoc who is now a professor at the South China University of Technology. The paper will be presented at the International Solid-State Circuits Conference.

    Tiny mirrors

    The researchers’ square transceiver chip, measuring about 2 millimeters on each side, is placed on a quantum computer inside the refrigerator, which is called a cryostat because it maintains cryogenic temperatures. These super-cold temperatures don’t damage the chip; in fact, they enable it to run more efficiently than it would at room temperature.

    The chip exchanges data with a terahertz wave source outside the cryostat using a passive communication process known as backscatter, which involves reflections. An array of antennas on top of the chip, each of which is only about 200 micrometers in size, acts as tiny mirrors. These mirrors can be “turned on” to reflect waves or “turned off.”

    The terahertz wave generation source encodes data onto the waves it sends into the cryostat, and the antennas in their “off” state can receive those waves and the data they carry.

    When the tiny mirrors are turned on, they can be set so they either reflect a wave in its current form or invert its phase before bouncing it back. If the reflected wave has the same phase, that represents a 0, but if the phase is inverted, that represents a 1. Electronics outside the cryostat can interpret those binary signals to decode the data.
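
    The phase-based encoding described above can be sketched in a few lines. This is a toy simulation under stated assumptions, not the researchers’ implementation: the sampling constants and the function names `carrier`, `reflect`, and `decode` are hypothetical, and the receiver is assumed to hold a clean copy of the wave it sent.

```python
import math

SAMPLES_PER_BIT = 16  # hypothetical oversampling factor
CYCLES_PER_BIT = 2    # hypothetical number of carrier cycles per bit period


def carrier(n_bits):
    """Incident wave (unit amplitude) sampled over n_bits bit periods."""
    total = n_bits * SAMPLES_PER_BIT
    return [
        math.cos(2 * math.pi * CYCLES_PER_BIT * i / SAMPLES_PER_BIT)
        for i in range(total)
    ]


def reflect(bits, incident):
    """Reflect the wave unchanged for a 0 and phase-inverted for a 1."""
    reflected = []
    for k, bit in enumerate(bits):
        sign = -1.0 if bit else 1.0  # a 180-degree phase flip is a sign change
        start = k * SAMPLES_PER_BIT
        reflected.extend(sign * s for s in incident[start:start + SAMPLES_PER_BIT])
    return reflected


def decode(reflected, reference):
    """Correlate each bit period with the reference wave to recover the bits."""
    bits = []
    for start in range(0, len(reflected), SAMPLES_PER_BIT):
        window = range(start, start + SAMPLES_PER_BIT)
        correlation = sum(reflected[i] * reference[i] for i in window)
        bits.append(1 if correlation < 0 else 0)  # negative means inverted phase
    return bits


message = [0, 1, 1, 0, 1]
wave = carrier(len(message))
recovered = decode(reflect(message, wave), wave)
```

    The sketch leaves out noise, path delay, and the receive mode of the antennas; it only shows why comparing the reflected wave against the transmitted one is enough to tell a 0 from a 1.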

    “This backscatter technology is not new. For instance, RFIDs are based on backscatter communication. We borrow that idea and bring it into this very unique scenario, and I think this leads to a good combination of all these technologies,” Han says.

    Terahertz advantages

    The data are transmitted using high-speed terahertz waves, which are located on the electromagnetic spectrum between radio waves and infrared light.

    Because terahertz waves have much shorter wavelengths than radio waves, the chip and its antennas can be smaller, too, which would make the device easier to manufacture at scale. Terahertz waves also have higher frequencies than radio waves, so they can transmit data much faster and move larger amounts of information.
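
    The link between frequency and size follows from the relation wavelength = speed of light / frequency. The short calculation below illustrates the scaling; the two frequencies are hypothetical examples picked for round numbers, not the operating frequencies of the researchers’ system.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the meter


def wavelength_mm(frequency_hz):
    """Free-space wavelength in millimeters for a given frequency in hertz."""
    return SPEED_OF_LIGHT_M_PER_S / frequency_hz * 1000.0


# Hypothetical example frequencies, chosen only to show the scaling.
radio_mm = wavelength_mm(3e9)        # a 3 GHz radio-frequency signal
terahertz_mm = wavelength_mm(300e9)  # a 0.3 THz signal
ratio = radio_mm / terahertz_mm      # how much shorter the terahertz wave is
```

    With these example values, the radio wave is roughly 10 centimeters long and the terahertz wave roughly 1 millimeter, so antennas sized to a fraction of a wavelength shrink by the same factor of about 100.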

    But because terahertz waves have lower frequencies than the light waves used in photonic systems, the terahertz waves carry less quantum noise, which leads to less interference with quantum processors.

    Importantly, the transceiver chip and terahertz link can be fully constructed with standard fabrication processes on a CMOS chip, so they can be integrated into many current systems and techniques.

    “CMOS compatibility is important. For example, one terahertz link could deliver a large amount of data and feed it to another cryo-CMOS controller, which can split the signal to control multiple qubits simultaneously, so we can reduce the quantity of RF cables dramatically. This is very promising,” Wang says.

    The researchers were able to transmit data at 4 gigabits per second with their prototype, but Han says the sky is nearly the limit when it comes to boosting that speed. The downlink of the contactless system posed about 10 times less heat load than a system with metallic cables, and the temperature of the cryostat fluctuated up to a few millidegrees during experiments.

    Now that the researchers have demonstrated this wireless technology, they want to improve the system’s speed and efficiency using special terahertz fibers, which are only a few hundred micrometers wide. Han’s group has shown that these plastic wires can transmit data at a rate of 100 gigabits per second and have much better thermal insulation than fatter, metal cables.

    The researchers also want to refine the design of their transceiver to improve scalability and continue boosting its energy efficiency. Generating terahertz waves requires a lot of power, but Han’s group is studying more efficient methods that utilize low-cost chips. Incorporating this technology into the system could make the device more cost-effective.

    The transceiver chip was fabricated through the Intel University Shuttle Program.

  • in

    When should data scientists try a new technique?

    If a scientist wanted to forecast ocean currents to understand how pollution travels after an oil spill, she could use a common approach that looks at currents traveling between 10 and 200 kilometers. Or, she could choose a newer model that also includes shorter currents. This might be more accurate, but it could also require learning new software or running new computational experiments. How to know if it will be worth the time, cost, and effort to use the new method?

    A new approach developed by MIT researchers could help data scientists answer this question, whether they are looking at statistics on ocean currents, violent crime, children’s reading ability, or any number of other types of datasets.

    The team created a new measure, known as the “c-value,” that helps users choose between techniques based on the chance that a new method is more accurate for a specific dataset. This measure answers the question “is it likely that the new method is more accurate for this data than the common approach?”

    Traditionally, statisticians compare methods by averaging a method’s accuracy across all possible datasets. But just because a new method is better for all datasets on average doesn’t mean it will actually provide a better estimate using one particular dataset. Averages are not application-specific.
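
    A small simulation can illustrate the gap between “better on average” and “better on this dataset.” The sketch below uses a textbook pair of estimators, the raw observation and the James-Stein shrinkage estimator, as a stand-in; these are an illustrative assumption and not necessarily the estimators studied in the paper. The true parameter vector, number of datasets, and random seed are likewise arbitrary choices.

```python
import random


def squared_loss(estimate, truth):
    """Sum of squared errors between an estimate and the ground truth."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))


def james_stein(x):
    """Shrink the observation toward the origin (unit-variance noise assumed)."""
    norm_sq = sum(v * v for v in x)
    factor = 1.0 - (len(x) - 2) / norm_sq
    return [factor * v for v in x]


def compare(truth, n_datasets=2000, seed=0):
    """Return average losses and how often the new method loses on a dataset."""
    rng = random.Random(seed)
    default_total = new_total = 0.0
    new_worse = 0
    for _ in range(n_datasets):
        # One simulated dataset: the truth plus standard normal noise.
        x = [t + rng.gauss(0.0, 1.0) for t in truth]
        default_loss = squared_loss(x, truth)            # default: use x as-is
        new_loss = squared_loss(james_stein(x), truth)   # alternative: shrink x
        default_total += default_loss
        new_total += new_loss
        new_worse += new_loss > default_loss
    return default_total / n_datasets, new_total / n_datasets, new_worse / n_datasets


avg_default, avg_new, frac_new_worse = compare([1.0] * 10)
```

    In this setup the shrinkage estimator has the lower average loss, yet on a nonzero fraction of individual datasets it is the less accurate of the two. The simulation can count those cases only because it knows the ground truth, which a real analyst does not.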

    So, researchers from MIT and elsewhere created the c-value, which is a dataset-specific tool. A high c-value means it is unlikely a new method will be less accurate than the original method on a specific data problem.

    In their proof-of-concept paper, the researchers describe and evaluate the c-value using real-world data analysis problems: modeling ocean currents, estimating violent crime in neighborhoods, and approximating student reading ability at schools. They show how the c-value could help statisticians and data analysts achieve more accurate results by indicating when to use alternative estimation methods they otherwise might have ignored.

    “What we are trying to do with this particular work is come up with something that is data specific. The classical notion of risk is really natural for someone developing a new method. That person wants their method to work well for all of their users on average. But a user of a method wants something that will work on their individual problem. We’ve shown that the c-value is a very practical proof-of-concept in that direction,” says senior author Tamara Broderick, an associate professor in the Department of Electrical Engineering and Computer Science (EECS) and a member of the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society.

    She’s joined on the paper by Brian Trippe PhD ’22, a former graduate student in Broderick’s group who is now a postdoc at Columbia University; and Sameer Deshpande ’13, a former postdoc in Broderick’s group who is now an assistant professor at the University of Wisconsin at Madison. An accepted version of the paper is posted online in the Journal of the American Statistical Association.

    Evaluating estimators

    The c-value is designed to help with data problems in which researchers seek to estimate an unknown parameter using a dataset, such as estimating average student reading ability from a dataset of assessment results and student survey responses. A researcher has two estimation methods and must decide which to use for this particular problem.

    The better estimation method is the one that results in less “loss,” which means the estimate will be closer to the ground truth. Consider again the forecasting of ocean currents: Perhaps being off by a few meters per hour isn’t so bad, but being off by many kilometers per hour makes the estimate useless. The ground truth is unknown, though; the scientist is trying to estimate it. Therefore, one can never actually compute the loss of an estimate for their specific data. That’s what makes comparing estimates challenging. The c-value helps a scientist navigate this challenge.

    The c-value equation uses a specific dataset to compute the estimate with each method, and then once more to compute the c-value between the methods. If the c-value is large, it is unlikely that the alternative method is going to be worse and yield less accurate estimates than the original method.

    “In our case, we are assuming that you conservatively want to stay with the default estimator, and you only want to go to the new estimator if you feel very confident about it. With a high c-value, it’s likely that the new estimate is more accurate. If you get a low c-value, you can’t say anything conclusive. You might have actually done better, but you just don’t know,” Broderick explains.
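
    The conservative decision rule Broderick describes can be written as a short function. This is a sketch under assumptions: it treats the c-value as a confidence level between 0 and 1, and the 0.95 threshold and the function name `choose_estimate` are hypothetical. Computing the c-value itself depends on the statistical model and is not shown here.

```python
def choose_estimate(default_estimate, new_estimate, c_value, threshold=0.95):
    """Keep the default unless the c-value gives high confidence in the new one.

    c_value is assumed to lie in [0, 1]; threshold is a hypothetical cutoff
    that the analyst picks before looking at the result.
    """
    if not 0.0 <= c_value <= 1.0:
        raise ValueError("c_value is assumed to lie between 0 and 1")
    # A low c-value is inconclusive, so the rule falls back to the default.
    return new_estimate if c_value >= threshold else default_estimate


confident_choice = choose_estimate("default", "new", c_value=0.99)
inconclusive_choice = choose_estimate("default", "new", c_value=0.40)
```

    The asymmetry is deliberate: a high c-value justifies switching, while a low one says nothing about which estimate is actually closer to the truth.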

    Probing the theory

    The researchers put that theory to the test by evaluating three real-world data analysis problems.

    For one, they used the c-value to help determine which approach is best for modeling ocean currents, a problem Trippe has been tackling. Accurate models are important for predicting the dispersion of contaminants, like pollution from an oil spill. The team found that estimating ocean currents using multiple scales, one larger and one smaller, likely yields higher accuracy than using only larger scale measurements.

    “Oceans researchers are studying this, and the c-value can provide some statistical ‘oomph’ to support modeling the smaller scale,” Broderick says.

    In another example, the researchers sought to predict violent crime in census tracts in Philadelphia, an application Deshpande has been studying. Using the c-value, they found that one could get better estimates about violent crime rates by incorporating information about census-tract-level nonviolent crime into the analysis. They also used the c-value to show that additionally leveraging violent crime data from neighboring census tracts in the analysis isn’t likely to provide further accuracy improvements.

    “That doesn’t mean there isn’t an improvement, that just means that we don’t feel confident saying that you will get it,” she says.

    Now that they have proven the c-value in theory and shown how it could be used to tackle real-world data problems, the researchers want to expand the measure to more types of data and a wider set of model classes.

    The ultimate goal is to create a measure that is general enough for many more data analysis problems, and while there is still a lot of work to do to realize that objective, Broderick says this is an important and exciting first step in the right direction.

    This research was supported, in part, by an Advanced Research Projects Agency-Energy grant, a National Science Foundation CAREER Award, the Office of Naval Research, and the Wisconsin Alumni Research Foundation.

  • in

    Putting clear bounds on uncertainty

    In science and technology, there has been a long and steady drive toward improving the accuracy of measurements of all kinds, along with parallel efforts to enhance the resolution of images. An accompanying goal is to reduce the uncertainty in the estimates that can be made, and the inferences drawn, from the data (visual or otherwise) that have been collected. Yet uncertainty can never be wholly eliminated. And since we have to live with it, at least to some extent, there is much to be gained by quantifying the uncertainty as precisely as possible.

    Expressed in other terms, we’d like to know just how uncertain our uncertainty is.

    That issue was taken up in a new study, led by Swami Sankaranarayanan, a postdoc at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), and his co-authors — Anastasios Angelopoulos and Stephen Bates of the University of California at Berkeley; Yaniv Romano of Technion, the Israel Institute of Technology; and Phillip Isola, an associate professor of electrical engineering and computer science at MIT. These researchers succeeded not only in obtaining accurate measures of uncertainty, they also found a way to display uncertainty in a manner the average person could grasp.

    Their paper, which was presented in December at the Neural Information Processing Systems Conference in New Orleans, relates to computer vision — a field of artificial intelligence that involves training computers to glean information from digital images. The focus of this research is on images that are partially smudged or corrupted (due to missing pixels), as well as on methods — computer algorithms, in particular — that are designed to uncover the part of the signal that is marred or otherwise concealed. An algorithm of this sort, Sankaranarayanan explains, “takes the blurred image as the input and gives you a clean image as the output” — a process that typically occurs in a couple of steps.

    First, there is an encoder, a kind of neural network specifically trained by the researchers for the task of de-blurring fuzzy images. The encoder takes a distorted image and, from that, creates an abstract (or “latent”) representation of a clean image in a form — consisting of a list of numbers — that is intelligible to a computer but would not make sense to most humans. The next step is a decoder, of which there are a couple of types, that are again usually neural networks. Sankaranarayanan and his colleagues worked with a kind of decoder called a “generative” model. In particular, they used an off-the-shelf version called StyleGAN, which takes the numbers from the encoded representation (of a cat, for instance) as its input and then constructs a complete, cleaned-up image (of that particular cat). So the entire process, including the encoding and decoding stages, yields a crisp picture from an originally muddied rendering.

    But how much faith can someone place in the accuracy of the resultant image? And, as addressed in the December 2022 paper, what is the best way to represent the uncertainty in that image? The standard approach is to create a “saliency map,” which ascribes a probability value — somewhere between 0 and 1 — to indicate the confidence the model has in the correctness of every pixel, taken one at a time. This strategy has a drawback, according to Sankaranarayanan, “because the prediction is performed independently for each pixel. But meaningful objects occur within groups of pixels, not within an individual pixel,” he adds, which is why he and his colleagues are proposing an entirely different way of assessing uncertainty.

    Their approach is centered around the “semantic attributes” of an image — groups of pixels that, when taken together, have meaning, making up a human face, for example, or a dog, or some other recognizable thing. The objective, Sankaranarayanan maintains, “is to estimate uncertainty in a way that relates to the groupings of pixels that humans can readily interpret.”

    Whereas the standard method might yield a single image, constituting the “best guess” as to what the true picture should be, the uncertainty in that representation is normally hard to discern. The new paper argues that for use in the real world, uncertainty should be presented in a way that holds meaning for people who are not experts in machine learning. Rather than producing a single image, the authors have devised a procedure for generating a range of images — each of which might be correct. Moreover, they can set precise bounds on the range, or interval, and provide a probabilistic guarantee that the true depiction lies somewhere within that range. A narrower range can be provided if the user is comfortable with, say, 90 percent certitude, and a narrower range still if more risk is acceptable.
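    The article does not spell out how the intervals are computed. As a rough illustration of how a range can carry a probabilistic guarantee, here is a minimal sketch of split-conformal calibration, a standard technique swapped in for the example rather than the authors' algorithm. It assumes a single scalar semantic attribute (a hypothetical "smile score", say) with a model prediction and a known true value for each image in a held-out calibration set; the function names and the synthetic data are invented for the example.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to support this confidence level.
        return float("inf")
    return sorted(scores)[rank - 1]


def calibrate(predictions, truths, alpha):
    """Half-width that covers the truth with probability at least 1 - alpha."""
    scores = [abs(p - t) for p, t in zip(predictions, truths)]
    return conformal_quantile(scores, alpha)


def interval(prediction, half_width):
    """Symmetric interval around a new prediction."""
    return (prediction - half_width, prediction + half_width)


# Synthetic calibration set (assumption): true attribute values plus noisy predictions.
random.seed(0)
truths = [random.uniform(0, 1) for _ in range(500)]
predictions = [t + random.gauss(0, 0.1) for t in truths]

w90 = calibrate(predictions, truths, alpha=0.10)  # 90 percent coverage
w70 = calibrate(predictions, truths, alpha=0.30)  # 70 percent coverage, more risk
```

    Accepting more risk (a larger `alpha`) selects a lower-ranked calibration score, so `w70` can never exceed `w90`. That mirrors the trade-off described above between the level of certitude and the width of the range.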

    The authors believe their paper puts forth the first algorithm, designed for a generative model, which can establish uncertainty intervals that relate to meaningful (semantically-interpretable) features of an image and come with “a formal statistical guarantee.” While that is an important milestone, Sankaranarayanan considers it merely a step toward “the ultimate goal. So far, we have been able to do this for simple things, like restoring images of human faces or animals, but we want to extend this approach into more critical domains, such as medical imaging, where our ‘statistical guarantee’ could be especially important.”

    Suppose that the film, or radiograph, of a chest X-ray is blurred, he adds, “and you want to reconstruct the image. If you are given a range of images, you want to know that the true image is contained within that range, so you are not missing anything critical” — information that might reveal whether or not a patient has lung cancer or pneumonia. In fact, Sankaranarayanan and his colleagues have already begun working with a radiologist to see if their algorithm for predicting pneumonia could be useful in a clinical setting.

    Their work may also have relevance in the law enforcement field, he says. “The picture from a surveillance camera may be blurry, and you want to enhance that. Models for doing that already exist, but it is not easy to gauge the uncertainty. And you don’t want to make a mistake in a life-or-death situation.” The tools that he and his colleagues are developing could help identify a guilty person and help exonerate an innocent one as well.

    Much of what we do and many of the things happening in the world around us are shrouded in uncertainty, Sankaranarayanan notes. Therefore, gaining a firmer grasp of that uncertainty could help us in countless ways. For one thing, it can tell us more about exactly what it is we do not know.

    Angelopoulos was supported by the National Science Foundation. Bates was supported by the Foundations of Data Science Institute and the Simons Institute. Romano was supported by the Israel Science Foundation and by a Career Advancement Fellowship from Technion. Sankaranarayanan’s and Isola’s research for this project was sponsored by the U.S. Air Force Research Laboratory and the U.S. Air Force Artificial Intelligence Accelerator and was accomplished under Cooperative Agreement Number FA8750-19-2-1000. MIT SuperCloud and the Lincoln Laboratory Supercomputing Center also provided computing resources that contributed to the results reported in this work.

  • in

    Research, education, and connection in the face of war

    When Russian forces invaded Ukraine in February 2022, Tetiana Herasymova had several decisions to make: What should she do, where should she live, and should she take her MITx MicroMasters capstone exams? She had registered for the Statistics and Data Science Program’s final exams just days prior to moving out of her apartment and into a bomb shelter. Although it was difficult to focus on studying and preparations with air horns sounding overhead and uncertainty lingering around her, she was determined to try. “I wouldn’t let the aggressor in the war squash my dreams,” she says.

    A love of research and the desire to improve teaching 

    An early love of solving puzzles and problems for fun piqued Herasymova’s initial interest in mathematics. When she later pursued her PhD in mathematics at Kiev National Taras Shevchenko University, Herasymova’s love of math evolved into a love of research. Throughout Herasymova’s career, she’s worked to close the gap between scientific researchers and educators. Starting as a math tutor at MBA Strategy, a company that prepares Ukrainian leaders for qualifying standardized tests for MBA programs, she was later promoted to head of the company’s test preparation department. Afterward, she moved on to an equivalent position at ZNOUA, a new project that prepared high school students for Ukraine’s standardized test, and she eventually became ZNOUA’s CEO.

    In 2018, she founded Prosteer, a “self-learning community” of educators who share research, pedagogy, and experience to learn from one another. “It’s really interesting to have a community of teachers from different domains,” she says, speaking of educators and researchers whose specialties range across language, mathematics, physics, music, and more.

    Implementing new pedagogical research in the classroom is often up to educators who seek out studies on an individual basis, Herasymova has found. “Lots of scientists are not practitioners,” she says, and the reverse is also true. She only became more determined to build these connections once she was promoted to head of test preparation at MBA Strategy because she wanted to share more effective pedagogy with the tutors she was mentoring.

    First, Herasymova knew she needed a way to measure the teachers’ effectiveness. She was able to determine whether students who received the company’s tutoring services improved their scores. Moreover, Ukraine keeps an open-access database of national standardized test scores, so anyone could analyze the data in hopes of improving the level of education in the country. She says, “I could do some analytics because I am a mathematician, but I knew I could do much more with this data if I knew data science and machine learning knowledge.”

    That’s why Herasymova sought out the MITx MicroMasters Program in Statistics and Data Science offered by the MIT Institute for Data, Systems, and Society (IDSS). “I wanted to learn the fundamentals so I could join the Learning Analytics domain,” she says. She was looking for a comprehensive program that covered the foundations without being overly basic. “I had some knowledge from the ground, so I could see the deepness of that course,” she says. Because of her background as an instructional designer, she thought the MicroMasters curriculum was well-constructed, calling the variety of videos, practice problems, and homework assignments that encouraged learners to approach the course material in different ways, “a perfect experience.”

    Another benefit of the MicroMasters program was its online format. “I had my usual work, so it was impossible to study in a stationary way,” she says. She found the structure to be more flexible than other programs. “It’s really great that you can construct your course schedule your own way, especially with your own adult life,” she says.

    Determination and support in the midst of war

    When the war first forced Herasymova to flee her apartment, she had already registered to take the exams for her four courses. “It was quite hard to prepare for exams when you could hear explosions outside of the bomb shelter,” she says. She and other Ukrainians were invited to postpone their exams until the following session, but the next available testing period wouldn’t be held until October. “It was a hard decision, but I had to allow myself to try,” she says. “For all people in Ukraine, when you don’t know if you’re going to live or die, you try to live in the now. You have to appreciate every moment and what life brings to you. You don’t say, ‘Someday’ — you do it today or tomorrow.”

    In addition to emotional support from her boyfriend, Herasymova had a group of friends who had also enrolled in the program, and they supported each other through study sessions and an ongoing chat. Herasymova’s personal support network helped her accomplish what she set out to do with her MicroMasters program, and in turn, she was able to support her professional network. While Prosteer halted its regular work during the early stages of the war, Herasymova was determined to support the community of educators and scientists that she had built. They continued meeting weekly to exchange ideas as usual. “It’s intrinsic motivation,” she says. They managed to restore all of their activities by October.

    Despite the factors stacked against her, Herasymova’s determination paid off — she passed all of her exams in May, the final step to earning her MicroMasters certificate in statistics and data science. “I just couldn’t believe it,” she says. “It was definitely a bifurcation point. The moment when you realize that you have something to rely on, and that life is just beginning to show all its diversity despite the fact that you live in war.” With her newly minted certificate in hand, Herasymova has continued her research on the effectiveness of educational models — analyzing the data herself — with a summer research program at New York University. 

    The student becomes the master

    After moving seven times between February and October, heading west from Kyiv until most recently settling near the border of Poland, Herasymova hopes she’s moved for the last time. Ukrainian Catholic University offered her a position teaching both mathematics and programming. Before enrolling in the MicroMasters Program in Statistics and Data Science, she had some prior knowledge of programming languages and mathematical algorithms, but she didn’t know Python. She took MITx’s Introduction to Computer Science and Programming Using Python to prepare. “It gave me a huge step forward,” she says. “I learned a lot. Now, not only can I work with Python machine learning models in programming language R, I also have knowledge of the big picture of the purpose and the point to do so.”

    In addition to the skills the MicroMasters Program trained her in, she gained firsthand experience in learning new subjects and exploring topics more deeply. She will be sharing that practice with the community of students and teachers she’s built, and she plans on guiding them through this course during the next year. As a continuation of her own educational growth, she says she’s looking forward to her next MITx course this year, Data Analysis.

    Herasymova advises that the best way to keep progressing is investing a lot of time. “Adults don’t want to hear this, but you need one or two years,” she says. “Allow yourself to be stupid. If you’re an expert in one domain and want to switch to another, or if you want to understand something new, a lot of people don’t ask questions or don’t ask for help. But from this point, if I don’t know something, I know I should ask for help because that’s the start of learning. With a fixed mindset, you won’t grow.”

    July 2022 MicroMasters Program Joint Completion Celebration. Ukrainian student Tetiana Herasymova, who completed her program amid war in her home country, speaks at 43:55.

  • in

    Gaining real-world industry experience through Break Through Tech AI at MIT

    Taking what they learned conceptually about artificial intelligence and machine learning (ML) this year, students from across the Greater Boston area had the opportunity to apply their new skills to real-world industry projects as part of an experiential learning opportunity offered through Break Through Tech AI at MIT.

    Hosted by the MIT Schwarzman College of Computing, Break Through Tech AI is a pilot program that aims to bridge the talent gap for women and underrepresented genders in computing fields by providing skills-based training, industry-relevant portfolios, and mentoring to undergraduate students in regional metropolitan areas in order to position them more competitively for careers in data science, machine learning, and artificial intelligence.

    “Programs like Break Through Tech AI gives us opportunities to connect with other students and other institutions, and allows us to bring MIT’s values of diversity, equity, and inclusion to the learning and application in the spaces that we hold,” says Alana Anderson, assistant dean of diversity, equity, and inclusion for the MIT Schwarzman College of Computing.

    The inaugural cohort of 33 undergraduates from 18 Greater Boston-area schools, including Salem State University, Smith College, and Brandeis University, began the free, 18-month program last summer with an eight-week, online skills-based course to learn the basics of AI and machine learning. Students then split into small groups in the fall to collaborate on six machine learning challenge projects presented to them by MathWorks, MIT-IBM Watson AI Lab, and Replicate. The students dedicated five hours or more each week to meet with their teams, teaching assistants, and project advisors, including convening once a month at MIT, while juggling their regular academic course load with other daily activities and responsibilities.

    The challenges gave the undergraduates the chance to help contribute to actual projects that industry organizations are working on and to put their machine learning skills to the test. Members from each organization also served as project advisors, providing encouragement and guidance to the teams throughout.

    “Students are gaining industry experience by working closely with their project advisors,” says Aude Oliva, director of strategic industry engagement at the MIT Schwarzman College of Computing and the MIT director of the MIT-IBM Watson AI Lab. “These projects will be an add-on to their machine learning portfolio that they can share as a work example when they’re ready to apply for a job in AI.”

    Over the course of 15 weeks, teams delved into large-scale, real-world datasets to train, test, and evaluate machine learning models in a variety of contexts.

    In December, the students celebrated the fruits of their labor at a showcase event held at MIT in which the six teams gave final presentations on their AI projects. The projects not only allowed the students to build up their AI and machine learning experience, they also helped to “improve their knowledge base and skills in presenting their work to both technical and nontechnical audiences,” Oliva says.

    For a project on traffic data analysis, students were trained in MATLAB, a programming and numeric computing platform developed by MathWorks, to create a model that enables decision-making in autonomous driving by predicting future vehicle trajectories. “It’s important to realize that AI is not that intelligent. It’s only as smart as you make it and that’s exactly what we tried to do,” said Brandeis University student Srishti Nautiyal as she introduced her team’s project to the audience. With companies already making autonomous vehicles from planes to trucks a reality, Nautiyal, a physics and mathematics major, shared that her team was also highly motivated to consider the ethical issues of the technology in their model for the safety of passengers, drivers, and pedestrians.

    Using census data to train a model can be tricky because the data are often messy and full of holes. In a project on algorithmic fairness for the MIT-IBM Watson AI Lab, the hardest task for the team was cleaning up mountains of unorganized data in a way that still let them gain insights from them. The project — which aimed to create a demonstration of fairness applied to a real dataset in order to evaluate and compare the effectiveness of different fairness interventions and fair metric learning techniques — could eventually serve as an educational resource for data scientists interested in learning about fairness in AI and using it in their work, and could promote the practice of evaluating the ethical implications of machine learning models in industry.

    Other challenge projects included an ML-assisted whiteboard for nontechnical people to interact with ready-made machine learning models, and a sign language recognition model to help disabled people communicate with others. A team that worked on a visual language app set out to include over 50 languages in their model to increase access for the millions of people who are visually impaired throughout the world. According to the team, similar apps on the market currently only offer up to 23 languages.

    Throughout the semester, students showed persistence and grit to cross the finish line on their projects. With the final presentations marking the conclusion of the fall semester, students will return to MIT in the spring to continue their Break Through Tech AI journey and tackle another round of AI projects. This time, the students will work with Google on new machine learning challenges that will enable them to hone their AI skills even further with an eye toward launching a successful career in AI.

  • in

    Q&A: A fresh look at data science

    As the leaders of a developing field, data scientists must often deal with a frustratingly slippery question: What is data science, precisely, and what is it good for?

    Alfred Spector is a visiting scholar in the MIT Department of Electrical Engineering and Computer Science (EECS), an influential developer of distributed computing systems and applications, and a successful tech executive with companies including IBM and Google. Along with three co-authors — Peter Norvig at Stanford University and Google, Chris Wiggins at Columbia University and The New York Times, and Jeannette M. Wing at Columbia — Spector recently published “Data Science in Context: Foundations, Challenges, Opportunities” (Cambridge University Press), which provides a broad, conversational overview of the wide-ranging field driving change in sectors ranging from health care to transportation to commerce to entertainment. 

    Here, Spector talks about data-driven life, what makes a good data scientist, and how his book came together during the height of the Covid-19 pandemic.

    Q: One of the most common buzzwords Americans hear is “data-driven,” but many might not know what that term is supposed to mean. Can you unpack it for us?

    A: Data-driven broadly refers to techniques or algorithms powered by data — they either provide insight or reach conclusions, say, a recommendation or a prediction. The algorithms power models which are increasingly woven into the fabric of science, commerce, and life, and they often provide excellent results. The list of their successes is really too long to even begin to list. However, one concern is that the proliferation of data makes it easy for us as students, scientists, or just members of the public to jump to erroneous conclusions. As just one example, our own confirmation biases make us prone to believing some data elements or insights “prove” something we already believe to be true. Additionally, we often tend to see causal relationships where the data only shows correlation. It might seem paradoxical, but data science makes critical reading and analysis of data all the more important.

    Q: What, to your mind, makes a good data scientist?

    A: [In talking to students and colleagues] I optimistically emphasize the power of data science and the importance of gaining the computational, statistical, and machine learning skills to apply it. But, I also remind students that we are obligated to solve problems well. In our book, Chris [Wiggins] paraphrases danah boyd, who says that a successful application of data science is not one that merely meets some technical goal, but one that actually improves lives. More specifically, I exhort practitioners to provide a real solution to problems, or else clearly identify what we are not solving so that people see the limitations of our work. We should be extremely clear so that we do not generate harmful results or lead others to erroneous conclusions. I also remind people that all of us, including scientists and engineers, are human and subject to the same human foibles as everyone else, such as various biases. 

    Q: You discuss Covid-19 in your book. While some short-range models for mortality were very accurate during the heart of the pandemic, you note the failure of long-range models to predict any of 2020’s four major geotemporal Covid waves in the United States. Do you feel Covid was a uniquely hard situation to model? 

    A: Covid was particularly difficult to predict over the long term because of many factors — the virus was changing, human behavior was changing, political entities changed their minds. Also, we didn’t have fine-grained mobility data (perhaps, for good reasons), and we lacked sufficient scientific understanding of the virus, particularly in the first year.

    I think there are many other domains which are similarly difficult. Our book teases out many reasons why data-driven models may not be applicable. Perhaps it’s too difficult to get or hold the necessary data. Perhaps the past doesn’t predict the future. If data models are being used in life-and-death situations, we may not be able to make them sufficiently dependable; this is particularly true as we’ve seen all the motivations that bad actors have to find vulnerabilities. So, as we continue to apply data science, we need to think through all the requirements we have, and the capability of the field to meet them. They often align, but not always. And, as data science seeks to solve problems in ever more important areas such as human health, education, transportation safety, etc., there will be many challenges.

    Q: Let’s talk about the power of good visualization. You mention the popular, early-2000s Baby Name Voyager website as one that changed your view on the importance of data visualization. Tell us how that happened.

    A: That website, recently reborn as the Name Grapher, had two characteristics that I thought were brilliant. First, it had a really natural interface, where you type the initial characters of a name and it shows a frequency graph of all the names beginning with those letters, and their popularity over time. Second, it’s so much better than a spreadsheet with 140 columns representing years and rows representing names, despite the fact it contains no extra information. It also provided instantaneous feedback with its display graph dynamically changing as you type. To me, this showed the power of a very simple transformation that is done correctly.

    Q: When you and your co-authors began planning “Data Science in Context,” what did you hope to offer?

    A: We portray present data science as a field that’s already had enormous benefits, that provides even more future opportunities, but one that requires equally enormous care in its use. Referencing the word “context” in the title, we explain that the proper use of data science must consider the specifics of the application, the laws and norms of the society in which the application is used, and even the time period of its deployment. And, importantly for an MIT audience, the practice of data science must go beyond just the data and the model to the careful consideration of an application’s objectives, its security, privacy, abuse, and resilience risks, and even the understandability it conveys to humans. Within this expansive notion of context, we finally explain that data scientists must also carefully consider ethical trade-offs and societal implications.

    Q: How did you keep focus throughout the process?

    A: Much like in open-source projects, I played both the coordinating author role and also the role of overall librarian of all the material, but we all made significant contributions. Chris Wiggins is very knowledgeable on the Belmont principles and applied ethics; he was the major contributor of those sections. Peter Norvig, as the coauthor of a bestselling AI textbook, was particularly involved in the sections on building models and causality. Jeannette Wing worked with me very closely on our seven-element Analysis Rubric and recognized that a checklist for data science practitioners would end up being one of our book’s most important contributions. 

    From a nuts-and-bolts perspective, we wrote the book during Covid, using one large shared Google doc with weekly video conferences. Amazingly enough, Chris, Jeannette, and I didn’t meet in person at all, and Peter and I met only once — sitting outdoors on a wooden bench on the Stanford campus.

    Q: That is an unusual way to write a book! Do you recommend it?

    A: It would be nice to have had more social interaction, but a shared document, at least with a coordinating author, worked pretty well for something up to this size. The benefit is that we always had a single, coherent textual base, not dissimilar to how a programming team works together.

    This is a condensed, edited version of a longer interview that originally appeared on the MIT EECS website.