More stories

  • 3 Questions: A new PhD program from the Center for Computational Science and Engineering

    This fall, the Center for Computational Science and Engineering (CCSE), an academic unit in the MIT Schwarzman College of Computing, is introducing a new standalone PhD degree program that will enable students to pursue research in cross-cutting methodological aspects of computational science and engineering. The launch follows approval of the center’s degree program proposal at the May 2023 Institute faculty meeting.

    Doctoral-level graduate study in computational science and engineering (CSE) at MIT has, for the past decade, been offered through an interdisciplinary program in which CSE students are admitted to one of eight participating academic departments in the School of Engineering or School of Science. While this model adds a strong disciplinary component to students’ education, the rapid growth of the CSE field and the establishment of the MIT Schwarzman College of Computing have prompted an exciting expansion of MIT’s graduate-level offerings in computation.

    The new degree, offered by the college, will run alongside MIT’s existing interdisciplinary offerings in CSE, complementing these doctoral training programs and preparing students to contribute to the leading edge of the field. Here, CCSE co-directors Youssef Marzouk and Nicolas Hadjiconstantinou discuss the standalone program and how they expect it to elevate the visibility and impact of CSE research and education at MIT.

    Q: What is computational science and engineering?

    Marzouk: Computational science and engineering focuses on the development and analysis of state-of-the-art methods for computation and their innovative application to problems of science and engineering interest. It has intellectual foundations in applied mathematics, statistics, and computer science, and touches the full range of science and engineering disciplines. Yet, it synthesizes these foundations into a discipline of its own — one that links the digital and physical worlds. It’s an exciting and evolving multidisciplinary field.

    Hadjiconstantinou: Examples of CSE research happening at MIT include modeling and simulation techniques, the underlying computational mathematics, and data-driven modeling of physical systems. Computational statistics and scientific machine learning have become prominent threads within CSE, joining high-performance computing, mathematically oriented programming languages, and their broader links to algorithms and software. Application domains include energy, environment and climate, materials, health, transportation, autonomy, and aerospace, among others. Some of our researchers focus on general and widely applicable methodology, while others choose to focus on methods and algorithms motivated by a specific domain of application.

    Q: What was the motivation behind creating a standalone PhD program?

    Marzouk: The new degree focuses on a particular class of students whose background and interests are primarily in CSE methodology, in a manner that cuts across the disciplinary research structure represented by our current “with-departments” degree program. There is a strong research demand for such methodologically focused students among CCSE faculty and MIT faculty in general. Our objective is to create a targeted, coherent degree program in this field that, alongside our other thriving CSE offerings, will create the leading environment for top CSE students worldwide.

    Hadjiconstantinou: One of CCSE’s most important functions is to recruit exceptional students who are trained in and want to work in computational science and engineering. Experience with our CSE master’s program suggests that students with a strong background and interests in the discipline prefer to apply to a pure CSE program for their graduate studies. The standalone degree aims to bring these students to MIT and make them available to faculty across the Institute.

    Q: How will this impact computing education and research at MIT? 

    Hadjiconstantinou: We believe that offering a standalone PhD program in CSE alongside the existing “with-departments” programs will significantly strengthen MIT’s graduate programs in computing. In particular, it will strengthen the methodological core of CSE research and education at MIT, while continuing to support the disciplinary-flavored CSE work taking place in our participating departments, which include Aeronautics and Astronautics; Chemical Engineering; Civil and Environmental Engineering; Materials Science and Engineering; Mechanical Engineering; Nuclear Science and Engineering; Earth, Atmospheric and Planetary Sciences; and Mathematics. Together, these programs will create a stronger CSE student cohort and facilitate deeper exchanges between the college and other units at MIT.

    Marzouk: In a broader sense, the new program is designed to help realize one of the key opportunities presented by the college, which is to create a richer variety of graduate degrees in computation and to involve as many faculty and units in these educational endeavors as possible. The standalone CSE PhD will join other distinguished doctoral programs of the college — such as the Department of Electrical Engineering and Computer Science PhD; the Operations Research Center PhD; and the Interdisciplinary Doctoral Program in Statistics and the Social and Engineering Systems PhD within the Institute for Data, Systems, and Society — and grow in a way that is informed by them. The confluence of these academic programs, and natural synergies among them, will make MIT quite unique.

  • Meet the 2023-24 Accenture Fellows

    The MIT and Accenture Convergence Initiative for Industry and Technology has selected five new research fellows for 2023-24. Now in its third year, the initiative underscores the ways in which industry and research can collaborate to spur technological innovation.

    Through its partnership with the School of Engineering, Accenture provides five annual fellowships awarded to graduate students with the aim of generating powerful new insights on the convergence of business and technology with the potential to transform society. The 2023-24 fellows will conduct research in areas including artificial intelligence, sustainability, and robotics.

    The 2023-24 Accenture Fellows are:

    Yiyue Luo

    Yiyue Luo is a PhD candidate who is developing innovative integrations of tactile sensing and haptics, interactive sensing and AI, digital fabrication, and smart wearables. Her work takes advantage of recent advances in digital manufacturing and AI, and the convergence of advanced sensing and actuation mechanisms, scalable digital manufacturing, and emerging computational techniques, with the goal of creating novel sensing and actuation devices that revolutionize interactions between people and their environments. In past projects, Luo has developed tactile sensing apparel including socks, gloves, and vests, as well as a workflow for computationally designing and digitally fabricating soft textiles-based pneumatic actuators. With the support of an Accenture Fellowship, she will advance her work of combining sensing and actuating devices and explore the development of haptic devices that simulate tactile cues captured by tactile sensors. Her ultimate aim is to build a scalable, textile-based, closed-loop human-machine interface. Luo’s research holds exciting potential to advance ground-breaking applications for smart textiles, health care, augmented and virtual reality, human-machine interactions, and robotics.

    Zanele Munyikwa is a PhD candidate whose research explores foundation models, a class of models that forms the basis of transformative general-purpose technologies (GPTs) such as GPT-4. An Accenture Fellowship will enable Munyikwa to conduct research aimed at illuminating the current and potential impact of foundation models (including large language models) on work and tasks common to “high-skilled” knowledge workers in industries such as marketing, legal services, and medicine, in which foundation models are expected to have significant economic and social impacts. A primary goal of her project is to observe the impact of AI augmentation on tasks like copywriting and long-form writing. A second aim is to explore two primary ways that foundation models are driving the convergence of creative and technological industries, namely: reducing the cost of content generation and enabling the development of tools and platforms for education and training. Munyikwa’s work has important implications for the use of foundation models in many fields, from health care and education to legal services, business, and technology.

    Michelle Vaccaro is a PhD candidate in social and engineering systems whose research explores human-AI collaboration with the goals of developing a deeper understanding of AI-based technologies (including ChatGPT and DALL-E), evaluating their performance and evolution, and steering their development toward societally beneficial applications, like climate change mitigation. An Accenture Fellowship will support Vaccaro’s current work toward two key objectives: identifying synergies between humans and AI-based software to help design human-AI systems that address persistent problems better than existing approaches; and investigating applications of human-AI collaboration for forecasting technological change, specifically for renewable energy technologies. By integrating the historically distinct domains of AI, systems engineering, and cognitive science with a wide range of industries, technical fields, and social applications, Vaccaro’s work has the potential to advance individual and collective productivity and creativity in all these areas.

    Chonghuan Wang is a PhD candidate in computational science and engineering whose research employs statistical learning, econometrics theory, and experimental design to create efficient, reliable, and sustainable field experiments in various domains. In his current work, Wang is applying statistical learning techniques such as online learning and bandit theory to test the effectiveness of new treatments, vaccinations, and health care interventions. With the support of an Accenture Fellowship, he will design experiments with the specific aim of understanding the trade-off between the loss of a patient’s welfare and the accuracy of estimating the treatment effect. The results of this research could help to save lives and contain disease outbreaks during pandemics like Covid-19. The benefits of enhanced experiment design and the collection of high-quality data extend well beyond health care; for example, these tools could help businesses optimize user engagement, test pricing impacts, and increase the usage of platforms and services. Wang’s research holds exciting potential to harness statistical learning, econometrics theory, and experimental design in support of strong businesses and the greater social good.
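    The welfare-versus-accuracy trade-off Wang studies can be illustrated with a toy adaptive trial. The sketch below is hypothetical; the arm names, success probabilities, and the simple epsilon-greedy rule are illustrative assumptions, not Wang’s actual experimental designs. With probability epsilon the trial randomizes, which helps estimate the treatment effect; otherwise it assigns the empirically better arm, which protects patient welfare.

```python
import random

def run_trial(p_treat, p_control, n=1000, epsilon=0.2, seed=0):
    """Toy epsilon-greedy trial (illustrative only).

    With probability `epsilon` we randomize the arm assignment, which
    is good for estimating the treatment effect; otherwise we assign
    the arm with the higher observed success rate, which is good for
    patient welfare. Returns (total successes, estimated effect).
    """
    rng = random.Random(seed)
    stats = {"treat": [0, 0], "control": [0, 0]}  # [successes, assignments]
    for _ in range(n):
        # Force exploration until both arms have been tried at least once.
        if rng.random() < epsilon or 0 in (stats["treat"][1], stats["control"][1]):
            arm = rng.choice(["treat", "control"])
        else:
            arm = max(stats, key=lambda a: stats[a][0] / stats[a][1])
        success = rng.random() < (p_treat if arm == "treat" else p_control)
        stats[arm][0] += success
        stats[arm][1] += 1
    effect = (stats["treat"][0] / stats["treat"][1]
              - stats["control"][0] / stats["control"][1])
    return stats["treat"][0] + stats["control"][0], effect
```

    Lowering epsilon favors patient welfare, since more patients receive the empirically better arm, at the cost of a noisier estimate of the treatment effect; epsilon=1 recovers a uniformly randomized trial.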

    Aaron Michael West Jr. is a PhD candidate whose research seeks to enhance our knowledge of human motor control and robotics. His work aims to advance rehabilitation technologies and prosthetic devices, as well as improve robot dexterity. His previous work has yielded valuable insights into the human ability to extract information solely from visual displays. Specifically, he demonstrated humans’ ability to estimate stiffness based only on the visual observation of motion. These insights could advance the development of software applications with the same capability (e.g., using machine learning methods applied to video data) and may enable roboticists to develop enhanced motion control such that a robot’s intention is perceivable by humans. An Accenture Fellowship will enable West to continue this work, as well as new investigations into the functionality of the human hand to aid in the design of a prosthetic hand that better replicates human dexterity. By advancing understanding of human bio- and neuro-mechanics, West’s work has the potential to support major advances in robotics and rehabilitation technologies, with profound impacts on human health and well-being.

  • Helping computer vision and language models understand what they see

    Powerful machine-learning algorithms known as vision and language models, which learn to match text with images, have shown remarkable results when asked to generate captions or summarize videos.

    While these models excel at identifying objects, they often struggle to understand concepts, like object attributes or the arrangement of items in a scene. For instance, a vision and language model might recognize the cup and table in an image, but fail to grasp that the cup is sitting on the table.

    Researchers from MIT, the MIT-IBM Watson AI Lab, and elsewhere have demonstrated a new technique that utilizes computer-generated data to help vision and language models overcome this shortcoming.

    The researchers created a synthetic dataset of images that depict a wide range of scenarios, object arrangements, and human actions, coupled with detailed text descriptions. They used this annotated dataset to “fix” vision and language models so they can learn concepts more effectively. Their technique ensures these models can still make accurate predictions when they see real images.

    When they tested models on concept understanding, the researchers found that their technique boosted accuracy by up to 10 percent. This could improve systems that automatically caption videos or enhance models that provide natural language answers to questions about images, with applications in fields like e-commerce or health care.

    “With this work, we are going beyond nouns in the sense that we are going beyond just the names of objects to more of the semantic concept of an object and everything around it. Our idea was that, when a machine-learning model sees objects in many different arrangements, it will have a better idea of how arrangement matters in a scene,” says Khaled Shehada, a graduate student in the Department of Electrical Engineering and Computer Science and co-author of a paper on this technique.

    Shehada wrote the paper with lead author Paola Cascante-Bonilla, a computer science graduate student at Rice University; Aude Oliva, director of strategic industry engagement at the MIT Schwarzman College of Computing, MIT director of the MIT-IBM Watson AI Lab, and a senior research scientist in the Computer Science and Artificial Intelligence Laboratory (CSAIL); senior author Leonid Karlinsky, a research staff member in the MIT-IBM Watson AI Lab; and others at MIT, the MIT-IBM Watson AI Lab, Georgia Tech, Rice University, École des Ponts, Weizmann Institute of Science, and IBM Research. The paper will be presented at the International Conference on Computer Vision.

    Focusing on objects

    Vision and language models typically learn to identify objects in a scene, and can end up ignoring object attributes, such as color and size, or positional relationships, such as which object is on top of another object.

    This is due to the method with which these models are often trained, known as contrastive learning. This training method involves forcing a model to predict the correspondence between images and text. When comparing natural images, the objects in each scene tend to cause the most striking differences. (Perhaps one image shows a horse in a field while the second shows a sailboat on the water.)

    “Every image could be uniquely defined by the objects in the image. So, when you do contrastive learning, just focusing on the nouns and objects would solve the problem. Why would the model do anything differently?” says Karlinsky.
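    The contrastive training Karlinsky describes can be sketched, under the assumption of a CLIP-style symmetric InfoNCE objective, as follows; this is a generic illustration of contrastive learning, not the researchers’ actual training code:

```python
import numpy as np

def info_nce_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive (InfoNCE) loss over paired embeddings.

    img_emb, txt_emb: (batch, dim) arrays where row i of each array
    is a matched image/caption pair. The model is pushed to score
    matched pairs (the diagonal) above all mismatched pairs.
    """
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (batch, batch) cosine similarities

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.diag(log_probs).mean()  # matched pairs sit on the diagonal

    # Average the image-to-text and text-to-image directions.
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

    Because any caption that names the right objects already lands near the diagonal, this objective gives the model little incentive to encode attributes or spatial relations, which is the failure mode the researchers target.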

    The researchers sought to mitigate this problem by using synthetic data to fine-tune a vision and language model. The fine-tuning process involves tweaking a model that has already been trained to improve its performance on a specific task.

    They used a computer to automatically create synthetic videos with diverse 3D environments and objects, such as furniture and luggage, and added human avatars that interacted with the objects.

    Using individual frames of these videos, they generated nearly 800,000 photorealistic images, and then paired each with a detailed caption. The researchers developed a methodology for annotating every aspect of the image to capture object attributes, positional relationships, and human-object interactions clearly and consistently in dense captions.
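    A templated dense-captioning step in that spirit might look like the sketch below; the field names and caption templates are hypothetical, since the article does not detail the researchers’ annotation pipeline:

```python
def dense_caption(scene):
    """Build a dense caption from structured scene annotations.

    `scene` uses hypothetical field names: each object may carry
    attributes, and each relation is a (subject, relation, object)
    triple. Because the scene is synthetic, every attribute and
    relation is known exactly and can be stated in the caption.
    """
    parts = []
    for obj in scene["objects"]:
        attrs = " ".join(obj.get("attributes", []))
        parts.append(f"a {attrs} {obj['name']}".replace("  ", " ").strip())
    sentences = ["The scene contains " + ", ".join(parts) + "."]
    for subj, rel, obj in scene.get("relations", []):
        sentences.append(f"The {subj} is {rel} the {obj}.")
    return " ".join(sentences)
```

    For example, a scene annotated with a red cup on a table would yield “The scene contains a red cup, a table. The cup is on top of the table.”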

    Because the researchers created the images, they could control the appearance and position of objects, as well as the gender, clothing, poses, and actions of the human avatars.

    “Synthetic data allows a lot of diversity. With real images, you might not have a lot of elephants in a room, but with synthetic data, you could actually have a pink elephant in a room with a human, if you want,” Cascante-Bonilla says.

    Synthetic data have other advantages, too. They are cheaper to generate than real data, yet the images are highly photorealistic. They also preserve privacy because no real humans are shown in the images. And, because data are produced automatically by a computer, they can be generated quickly in massive quantities.

    By using different camera viewpoints, or slightly changing the positions or attributes of objects, the researchers created a dataset with a far wider variety of scenarios than one would find in a natural dataset.

    Fine-tune, but don’t forget

    However, when one fine-tunes a model with synthetic data, there is a risk that the model might “forget” what it learned when it was originally trained with real data.

    The researchers employed a few techniques to prevent this problem, such as adjusting the synthetic data so colors, lighting, and shadows more closely match those found in natural images. They also made adjustments to the model’s inner workings after fine-tuning to further reduce forgetting.
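    One common recipe for such post-fine-tuning adjustments is to interpolate between the pre-trained and fine-tuned weights; whether this matches the researchers’ exact adjustment is an assumption here, and the sketch is purely illustrative:

```python
def interpolate_weights(pretrained, finetuned, alpha=0.5):
    """Blend pre-trained and fine-tuned parameters elementwise.

    alpha=0 keeps the original model (no forgetting, but no new
    concept understanding); alpha=1 keeps the fine-tuned model;
    values in between trade the new skill against retention of
    what the model learned from real data.

    pretrained, finetuned: dicts mapping parameter name -> list of
    floats (stand-ins for weight tensors), with matching shapes.
    """
    return {name: [(1 - alpha) * p + alpha * f
                   for p, f in zip(pretrained[name], finetuned[name])]
            for name in pretrained}
```

    In practice, alpha would be tuned on held-out data covering both the new concept benchmarks and the model’s original tasks.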

    Their synthetic dataset and fine-tuning strategy improved the ability of popular vision and language models to accurately recognize concepts by up to 10 percent. At the same time, the models did not forget what they had already learned.

    Now that they have shown how synthetic data can be used to solve this problem, the researchers want to identify ways to improve the visual quality and diversity of these data, as well as the underlying physics that makes synthetic scenes look realistic. In addition, they plan to test the limits of scalability, and investigate whether model improvement starts to plateau with larger and more diverse synthetic datasets.

    This research is funded, in part, by the U.S. Defense Advanced Research Projects Agency, the National Science Foundation, and the MIT-IBM Watson AI Lab.

  • M’Care and MIT students join forces to improve child health in Nigeria

    Through a collaboration between M’Care, a 2021 Health Security and Pandemics Solver team, and students from MIT, the landscape of child health care in Nigeria could undergo a transformative change, wherein the power of data is harnessed to improve child health outcomes in economically disadvantaged communities. 

    M’Care is a mobile application of Promane and Promade Limited, developed by Opeoluwa Ashimi, which gives community health workers in Nigeria real-time diagnostic and treatment support. The application also creates a dashboard that is available to government health officials to help identify disease trends and deploy timely interventions. As part of its work, M’Care is working to mitigate malnutrition by providing micronutrient powder, vitamin A, and zinc to children below the age of 5. To help deepen its impact, Ashimi decided to work with students in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) course 6.S897 (Machine Learning for Healthcare) — instructed by professors Peter Szolovits and Manolis Kellis — to leverage data in order to improve nutrient delivery to children across Nigeria. The collaboration also enabled students to see real-world applications for data analysis in the health care space.

    A meeting of minds: M’Care, MIT, and national health authorities

    “Our primary goal for collaborating with the ML for Health team was to spot the missing link in the continuum of care. With over 1 million cumulative consultations that qualify for a continuum of care evaluation, it was important to spot why patients could be lost to follow-up, prevent this, and ensure completion of care to successfully address the health needs of our patients,” says Ashimi, founder and CEO of M’Care.

    In May 2023, Ashimi attended a meeting that brought together key national stakeholders, including representatives of the National Ministry of Health in Nigeria. The gathering served as a platform to discuss the impact of the collaboration between M’Care and the ML for Health team, whose analysis of dosage regimens and children’s ages is helping to strengthen the continuum of care, and with it children’s health, particularly brain development supported by essential micronutrients. The students’ machine-learning analyses, shared during the meeting, provided strong supporting evidence for individualizing dosage regimens by a child’s age in months for the ANRIN project — a national nutrition project supported by the World Bank — and for policy decisions to extend the months of coverage for children, redefining health care practices in Nigeria.

    MIT students drive change by harnessing the power of data

    At the heart of this collaboration lies the contribution of MIT students. Armed with their dedication and skill in data analysis and machine learning, they played a pivotal role in helping M’Care analyze their data and prepare for their meeting with the Ministry of Health. Their most significant findings included ways to identify patients at risk of not completing their full course of micronutrient powder and/or vitamin A, and identifying gaps in M’Care’s data, such as postdated delivery dates and community demographics. These findings are already helping M’Care better plan its resources and adjust the scope of its program to ensure more children complete the intervention.

    Darcy Kim, an undergraduate at Wellesley College studying math and computer science, who is cross-registered for the MIT machine learning course, expresses enthusiasm about the practical applications found within the project: “To me, data and math is storytelling, and the story is why I love studying it. … I learned that data exploration involves asking questions about how the data is collected, and that surprising patterns that arise often have a qualitative explanation. Impactful research requires radical collaboration with the people the research intends to help. Otherwise, these qualitative explanations get lost in the numbers.”

    Joyce Luo, a first-year operations research PhD student at the Operations Research Center at MIT, shares similar thoughts about the project: “I learned the importance of understanding the context behind data to figure out what kind of analysis might be most impactful. This involves being in frequent contact with the company or organization who provides the data to learn as much as you can about how the data was collected and the people the analysis could help. Stepping back and looking at the bigger picture, rather than just focusing on accuracy or metrics, is extremely important.”

    Insights to implementation: A new era for micronutrient dosing

    As a direct result of M’Care’s collaboration with MIT, policymakers revamped the dosing scheme for essential micronutrient administration for children in Nigeria to prevent malnutrition. M’Care and MIT’s data analysis unearthed critical insights into the limited frequency of medical visits caused by late-age enrollment. 

    “One big takeaway for me was that the data analysis portion of the project — doing a deep dive into the data; understanding, analyzing, visualizing, and summarizing the data — can be just as important as building the machine learning models. M’Care shared our data analysis with the National Ministry of Health, and the insights from it drove them to change their dosing scheme and schedule for delivering micronutrient powder to young children. This really showed us the value of understanding and knowing your data before modeling,” shares Angela Lin, a second-year PhD student at the Operations Research Center.

    Armed with this knowledge, policymakers are eager to develop an optimized dosing scheme that caters to the unique needs of children in disadvantaged communities, ensuring maximum impact on their brain development and overall well-being.

    Siddharth Srivastava, M’Care’s corporate technology liaison, shares his gratitude for the MIT students’ input. “Collaborating with enthusiastic and driven students was both empowering and inspiring. Each of them brought unique perspectives and technical skills to the table. Their passion for applying machine learning to health care was evident in their unwavering dedication and proactive approach to problem-solving.”

    Forging a path to impact

    The collaboration between M’Care and MIT exemplifies the remarkable achievements that arise when academia, innovative problem-solvers, and policy authorities unite. By merging academic rigor with real-world expertise, this partnership has the potential to revolutionize child health care not only in Nigeria but also in similar contexts worldwide.

    “I believe applying innovative methods of machine learning, data gathering, instrumentation, and planning to real problems in the developing world can be highly effective for those countries and highly motivating for our students. I was happy to have such a project in our class portfolio this year and look forward to future opportunities,” says Peter Szolovits, professor of computer science and engineering at MIT.

    By harnessing the power of data, innovation, and collective expertise, this collaboration between M’Care and MIT has the potential to improve equitable child health care in Nigeria. “It has been so fulfilling to see how our team’s work has been able to create even the smallest positive impact in such a short period of time, and it has been amazing to work with a company like Promane and Promade Limited that is so knowledgeable and caring for the communities that they serve,” shares Elizabeth Whittier, a second-year electrical engineering PhD student at MIT.

  • Artificial intelligence for augmentation and productivity

    The MIT Stephen A. Schwarzman College of Computing has awarded seed grants to seven projects that are exploring how artificial intelligence and human-computer interaction can be leveraged to enhance modern work spaces to achieve better management and higher productivity.

    Funded by Andrew W. Houston ’05 and Dropbox Inc., the projects are intended to be interdisciplinary and bring together researchers from computing, social sciences, and management.

    The seed grants are intended to enable the project teams to conduct research that leads to bigger endeavors in this rapidly evolving area, as well as to build community around questions related to AI-augmented management.

    The seven selected projects and research leads include:

    “LLMex: Implementing Vannevar Bush’s Vision of the Memex Using Large Language Models,” led by Pattie Maes of the Media Lab and David Karger of the Department of Electrical Engineering and Computer Science (EECS) and the Computer Science and Artificial Intelligence Laboratory (CSAIL). Inspired by Vannevar Bush’s Memex, this project proposes to design, implement, and test the concept of memory prosthetics using large language models (LLMs). The AI-based system will intelligently help an individual keep track of vast amounts of information, accelerate productivity, and reduce errors by automatically recording their work actions and meetings, supporting retrieval based on metadata and vague descriptions, and suggesting relevant, personalized information proactively based on the user’s current focus and context.

    “Using AI Agents to Simulate Social Scenarios,” led by John Horton of the MIT Sloan School of Management and Jacob Andreas of EECS and CSAIL. This project imagines the ability to easily simulate policies, organizational arrangements, and communication tools with AI agents before implementation. Tapping into the capabilities of modern LLMs to serve as a computational model of humans makes this vision of social simulation more realistic, and potentially more predictive.

    “Human Expertise in the Age of AI: Can We Have Our Cake and Eat it Too?” led by Manish Raghavan of MIT Sloan and EECS, and Devavrat Shah of EECS and the Laboratory for Information and Decision Systems. Progress in machine learning, AI, and in algorithmic decision aids has raised the prospect that algorithms may complement human decision-making in a wide variety of settings. Rather than replacing human professionals, this project sees a future where AI and algorithmic decision aids play a role that is complementary to human expertise.

    “Implementing Generative AI in U.S. Hospitals,” led by Julie Shah of the Department of Aeronautics and Astronautics and CSAIL, Retsef Levi of MIT Sloan and the Operations Research Center, Kate Kellogg of MIT Sloan, and Ben Armstrong of the Industrial Performance Center. In recent years, studies have linked a rise in burnout from doctors and nurses in the United States with increased administrative burdens associated with electronic health records and other technologies. This project aims to develop a holistic framework to study how generative AI technologies can both increase productivity for organizations and improve job quality for workers in health care settings.

    “Generative AI Augmented Software Tools to Democratize Programming,” led by Harold Abelson of EECS and CSAIL, Cynthia Breazeal of the Media Lab, and Eric Klopfer of Comparative Media Studies/Writing. Progress in generative AI over the past year is fomenting an upheaval in assumptions about future careers in software and deprecating the role of coding. This project will stimulate a similar transformation in computing education for those who have no prior technical training by creating a software tool that could eliminate much of the need for learners to deal with code when creating applications.

    “Acquiring Expertise and Societal Productivity in a World of Artificial Intelligence,” led by David Atkin and Martin Beraja of the Department of Economics, and Danielle Li of MIT Sloan. Generative AI is thought to augment the capabilities of workers performing cognitive tasks. This project seeks to better understand how the arrival of AI technologies may impact skill acquisition and productivity, and to explore complementary policy interventions that will allow society to maximize the gains from such technologies.

    “AI Augmented Onboarding and Support,” led by Tim Kraska of EECS and CSAIL, and Christoph Paus of the Department of Physics. While LLMs have made enormous leaps forward in recent years and are poised to fundamentally change the way students and professionals learn about new tools and systems, there is often a steep learning curve that people have to climb in order to make full use of these tools. To help mitigate the issue, this project proposes the development of new LLM-powered onboarding and support systems that will positively impact the way support teams operate and improve the user experience.

  • To improve solar and other clean energy tech, look beyond hardware

    To continue reducing the costs of solar energy and other clean energy technologies, scientists and engineers will likely need to focus, at least in part, on improving technology features that are not based on hardware, according to MIT researchers. They describe this finding and the mechanisms behind it today in Nature Energy.

    While the cost of installing a solar energy system has dropped by more than 99 percent since 1980, this new analysis shows that “soft technology” features, such as the codified permitting practices, supply chain management techniques, and system design processes that go into deploying a solar energy plant, contributed only 10 to 15 percent of total cost declines. Improvements to hardware features were responsible for the lion’s share.

    But because soft technology is increasingly dominating the total costs of installing solar energy systems, this trend threatens to slow future cost savings and hamper the global transition to clean energy, says the study’s senior author, Jessika Trancik, a professor in MIT’s Institute for Data, Systems, and Society (IDSS).

    Trancik’s co-authors include lead author Magdalena M. Klemun, a former IDSS graduate student and postdoc who is now an assistant professor at the Hong Kong University of Science and Technology; Goksin Kavlak, a former IDSS graduate student and postdoc who is now an associate at the Brattle Group; and James McNerney, a former IDSS postdoc and now senior research fellow at the Harvard Kennedy School.

    The team created a quantitative model to analyze the cost evolution of solar energy systems, which captures the contributions of both hardware technology features and soft technology features.

    The framework shows that soft technology hasn’t improved much over time — and that soft technology features contributed even less to overall cost declines than previously estimated.

    Their findings indicate that to reverse this trend and accelerate cost declines, engineers could look at making solar energy systems less reliant on soft technology to begin with, or they could tackle the problem directly by improving inefficient deployment processes.  

    “Really understanding where the efficiencies and inefficiencies are, and how to address those inefficiencies, is critical in supporting the clean energy transition. We are making huge investments of public dollars into this, and soft technology is going to be absolutely essential to making those funds count,” says Trancik.

    “However,” Klemun adds, “we haven’t been thinking about soft technology design as systematically as we have for hardware. That needs to change.”

    The hard truth about soft costs

    Researchers have observed that the so-called “soft costs” of building a solar power plant — the costs of designing and installing the plant — are becoming a much larger share of total costs. In fact, the share of soft costs now typically ranges from 35 to 64 percent.

    “We wanted to take a closer look at where these soft costs were coming from and why they weren’t coming down over time as quickly as the hardware costs,” Trancik says.

    In the past, scientists have modeled the change in solar energy costs by dividing total costs into additive components — hardware components and nonhardware components — and then tracking how these components changed over time.

    “But if you really want to understand where those rates of change are coming from, you need to go one level deeper to look at the technology features. Then things split out differently,” Trancik says.

    The researchers developed a quantitative approach that models the change in solar energy costs over time by assigning contributions to the individual technology features, including both hardware features and soft technology features.

    For instance, their framework would capture how much of the decline in system installation costs — a soft cost — is due to standardized practices of certified installers — a soft technology feature. It would also capture how that same soft cost is affected by increased photovoltaic module efficiency — a hardware technology feature.
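    The kind of feature-level attribution described above can be illustrated with a minimal sketch. This is not the paper's actual model; the numbers and the simple multiplicative cost structure are invented purely to show how a single cost component's decline can be split among the technology features it depends on:

```python
import math

# Hypothetical old vs. new values for features driving one soft cost component
# (installation cost ~ labor hours per watt x hourly labor rate).
features = {
    # feature name: (old value, new value) -- invented for illustration
    "labor_hours_per_watt": (2.0, 0.2),   # falls as module efficiency rises (hardware-driven)
    "hourly_labor_rate":    (30.0, 25.0), # falls with standardized installer practices (soft)
}

# In a multiplicative cost model, each feature's contribution adds in log
# space, so its share of the total cost decline is well defined.
log_changes = {f: math.log(new / old) for f, (old, new) in features.items()}
total = sum(log_changes.values())

for f, lc in log_changes.items():
    print(f"{f}: {lc / total:.0%} of this component's cost decline")
```

    Under these made-up numbers, nearly all of the component's decline is attributed to the hardware-driven feature, mirroring the study's qualitative finding that hardware improvements drove most soft-cost reductions.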

    With this approach, the researchers saw that improvements in hardware had the greatest impacts on driving down soft costs in solar energy systems. For example, the efficiency of photovoltaic modules doubled between 1980 and 2017, reducing overall system costs by 17 percent. But about 40 percent of that overall decline could be attributed to reductions in soft costs tied to improved module efficiency.

    The framework shows that, while hardware technology features tend to improve many cost components, soft technology features affect only a few.

    “You can see this structural difference even before you collect data on how the technologies have changed over time. That’s why mapping out a technology’s network of cost dependencies is a useful first step to identify levers of change, for solar PV and for other technologies as well,” Klemun notes.  

    Static soft technology

    The researchers used their model to study several countries, since soft costs can vary widely around the world. For instance, solar energy soft costs in Germany are about 50 percent less than those in the U.S.

    Because hardware technology improvements are often shared globally, they drove dramatic cost declines across locations over the past few decades, the analysis showed. Soft technology innovations typically aren’t shared across borders. Moreover, the team found that countries with better soft technology performance 20 years ago still have better performance today, while those with worse performance didn’t see much improvement.

    This country-by-country difference could be driven by regulation and permitting processes, cultural factors, or by market dynamics such as how firms interact with each other, Trancik says.

    “But not all soft technology variables are ones that you would want to change in a cost-reducing direction, like lower wages. So, there are other considerations, beyond just bringing the cost of the technology down, that we need to think about when interpreting these results,” she says.

    Their analysis points to two strategies for reducing soft costs. For one, scientists could focus on developing hardware improvements that make soft costs more dependent on hardware technology variables and less on soft technology variables, such as by creating simpler, more standardized equipment that could reduce on-site installation time.

    Or researchers could directly target soft technology features without changing hardware, perhaps by creating more efficient workflows for system installation or automated permitting platforms.

    “In practice, engineers will often pursue both approaches, but separating the two in a formal model makes it easier to target innovation efforts by leveraging specific relationships between technology characteristics and costs,” Klemun says.

    “Often, when we think about information processing, we are leaving out processes that still happen in a very low-tech way through people communicating with one another. But it is just as important to think about that as a technology as it is to design fancy software,” Trancik notes.

    In the future, she and her collaborators want to apply their quantitative model to study the soft costs related to other technologies, such as electrical vehicle charging and nuclear fission. They are also interested in better understanding the limits of soft technology improvement, and how one could design better soft technology from the outset.

    This research is funded by the U.S. Department of Energy Solar Energy Technologies Office.


    How machine learning models can amplify inequities in medical diagnosis and treatment

    Prior to receiving a PhD in computer science from MIT in 2017, Marzyeh Ghassemi had already begun to wonder whether the use of AI techniques might enhance the biases that already existed in health care. She was one of the early researchers to take up this issue, and she’s been exploring it ever since. In a new paper, Ghassemi, now an assistant professor in MIT’s Department of Electrical Engineering and Computer Science (EECS), and three collaborators based at the Computer Science and Artificial Intelligence Laboratory, have probed the roots of the disparities that can arise in machine learning, often causing models that perform well overall to falter when it comes to subgroups for which relatively few data have been collected and utilized in the training process. The paper — written by two MIT PhD students, Yuzhe Yang and Haoran Zhang, EECS computer scientist Dina Katabi (the Thuan and Nicole Pham Professor), and Ghassemi — was presented last month at the 40th International Conference on Machine Learning in Honolulu, Hawaii.

    In their analysis, the researchers focused on “subpopulation shifts” — differences in the way machine learning models perform for one subgroup as compared to another. “We want the models to be fair and work equally well for all groups, but instead we consistently observe the presence of shifts among different groups that can lead to inferior medical diagnosis and treatment,” says Yang, who, along with Zhang, is a lead author of the paper. The main point of their inquiry is to determine the kinds of subpopulation shifts that can occur and to uncover the mechanisms behind them so that, ultimately, more equitable models can be developed.

    The new paper “significantly advances our understanding” of the subpopulation shift phenomenon, claims Stanford University computer scientist Sanmi Koyejo. “This research contributes valuable insights for future advancements in machine learning models’ performance on underrepresented subgroups.”

    Camels and cattle

    The MIT group has identified four principal types of shifts — spurious correlations, attribute imbalance, class imbalance, and attribute generalization — which, according to Yang, “have never been put together into a coherent and unified framework. We’ve come up with a single equation that shows you where biases can come from.”

    Biases can, in fact, stem from what the researchers call the class, or from the attribute, or both. To pick a simple example, suppose the task assigned to the machine learning model is to sort images of objects — animals in this case — into two classes: cows and camels. Attributes are descriptors that don’t specifically relate to the class itself. It might turn out, for instance, that all the images used in the analysis show cows standing on grass and camels on sand — grass and sand serving as the attributes here. Given the data available to it, the machine could reach an erroneous conclusion — namely that cows can only be found on grass, not on sand, with the opposite being true for camels. Such a finding would be incorrect, however, giving rise to a spurious correlation, which, Yang explains, is a “special case” among subpopulation shifts — “one in which you have a bias in both the class and the attribute.”

    In a medical setting, one could rely on machine learning models to determine whether a person has pneumonia or not based on an examination of X-ray images. There would be two classes in this situation, one consisting of people who have the lung ailment, another for those who are infection-free. A relatively straightforward case would involve just two attributes: the people getting X-rayed are either female or male. If, in this particular dataset, there were 100 males diagnosed with pneumonia for every one female diagnosed with pneumonia, that could lead to an attribute imbalance, and the model would likely do a better job of correctly detecting pneumonia for a man than for a woman. Similarly, having 1,000 times more healthy (pneumonia-free) subjects than sick ones would lead to a class imbalance, with the model biased toward healthy cases. Attribute generalization is the last shift highlighted in the new study. If your sample contained 100 male patients with pneumonia and zero female subjects with the same illness, you still would like the model to be able to generalize and make predictions about female subjects even though there are no samples in the training data for females with pneumonia.
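    The pneumonia example above can be made concrete with a toy dataset of (class, attribute) pairs. The counts below are invented solely to illustrate how three of the four shift types show up as patterns in the training data:

```python
from collections import Counter

# Invented (class, attribute) pairs, e.g. (diagnosis, sex), for illustration.
train = (
    [("pneumonia", "male")] * 100   # 100 sick males...
    + [("pneumonia", "female")] * 1 # ...vs. 1 sick female: attribute imbalance
    + [("healthy", "male")] * 500
    + [("healthy", "female")] * 500 # far more healthy than sick: class imbalance
)

classes = Counter(c for c, _ in train)
cells = Counter(train)

print("class counts:", dict(classes))            # reveals class imbalance
print("(class, attribute) counts:", dict(cells)) # reveals attribute imbalance

# A (class, attribute) cell with zero training examples -- e.g. no
# ("pneumonia", "female") samples at all -- would be the
# attribute-generalization case: the model must extrapolate to an
# attribute it never saw paired with that class.
```

    Spurious correlation, the fourth type, is the special case where class and attribute are jointly skewed, as in the cows-on-grass, camels-on-sand example.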

    The team then took 20 advanced algorithms, designed to carry out classification tasks, and tested them on a dozen datasets to see how they performed across different population groups. They reached some unexpected conclusions: By improving the “classifier,” which is the last layer of the neural network, they were able to reduce the occurrence of spurious correlations and class imbalance, but the other shifts were unaffected. Improvements to the “encoder,” one of the uppermost layers in the neural network, could reduce the problem of attribute imbalance. “However, no matter what we did to the encoder or classifier, we did not see any improvements in terms of attribute generalization,” Yang says, “and we don’t yet know how to address that.”

    Precisely accurate

    There is also the question of assessing how well your model actually works in terms of evenhandedness among different population groups. The metric normally used, called worst-group accuracy or WGA, is based on the assumption that if you can improve the accuracy — of, say, medical diagnosis — for the group that has the worst model performance, you would have improved the model as a whole. “The WGA is considered the gold standard in subpopulation evaluation,” the authors contend, but they made a surprising discovery: boosting worst-group accuracy results in a decrease in what they call “worst-case precision.” In medical decision-making of all sorts, one needs both accuracy — which speaks to the validity of the findings — and precision, which relates to the reliability of the methodology. “Precision and accuracy are both very important metrics in classification tasks, and that is especially true in medical diagnostics,” Yang explains. “You should never trade precision for accuracy. You always need to balance the two.”
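    The tension between the two metrics can be seen in a small sketch. The predictions below are invented; the point is only that the group with the worst accuracy need not be the group with the worst precision, which is why optimizing WGA alone can quietly erode worst-case precision:

```python
# (true label, predicted label) pairs for two hypothetical subgroups.
def accuracy(pairs):
    """Fraction of predictions that match the true label."""
    return sum(y == yhat for y, yhat in pairs) / len(pairs)

def precision(pairs):
    """Fraction of positive predictions that are truly positive."""
    true_pos = sum(y == 1 and yhat == 1 for y, yhat in pairs)
    pred_pos = sum(yhat == 1 for _, yhat in pairs)
    return true_pos / pred_pos if pred_pos else 0.0

groups = {
    "group_a": [(1, 1), (0, 0), (0, 0), (1, 1)],  # all correct
    "group_b": [(1, 1), (0, 1), (0, 1), (0, 0)],  # over-predicts positives
}

wga = min(accuracy(p) for p in groups.values())              # worst-group accuracy
worst_precision = min(precision(p) for p in groups.values()) # worst-case precision
print(f"WGA: {wga:.2f}, worst-case precision: {worst_precision:.2f}")
```

    Here group_b drags down both metrics, but by different amounts: its accuracy is 0.50 while its precision is only 0.33, because two of its three positive predictions are false alarms.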

    The MIT scientists are putting their theories into practice. In a study they’re conducting with a medical center, they’re looking at public datasets for tens of thousands of patients and hundreds of thousands of chest X-rays, trying to see whether it’s possible for machine learning models to work in an unbiased manner for all populations. That’s still far from the case, even though more awareness has been drawn to this problem, Yang says. “We are finding many disparities across age, gender, ethnicity, and intersectional groups.”

    He and his colleagues agree on the eventual goal, which is to achieve fairness in health care among all populations. But before we can reach that point, they maintain, we still need a better understanding of the sources of unfairness and how they permeate our current system. Reforming the system as a whole will not be easy, they acknowledge. In fact, the title of the paper they introduced at the Honolulu conference, “Change is Hard,” gives some indications as to the challenges that they and like-minded researchers face.


    The tenured engineers of 2023

    In 2023, MIT granted tenure to nine faculty members across the School of Engineering. This year’s tenured engineers hold appointments in the departments of Biological Engineering, Civil and Environmental Engineering, Electrical Engineering and Computer Science (which reports jointly to the School of Engineering and MIT Schwarzman College of Computing), Materials Science and Engineering, and Mechanical Engineering, as well as the Institute for Medical Engineering and Science (IMES).

    “I am truly inspired by this remarkable group of talented faculty members,” says Anantha Chandrakasan, dean of the School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “The work they are doing, both in the lab and in the classroom, has made a tremendous impact at MIT and in the wider world. Their important research has applications in a diverse range of fields and industries. I am thrilled to congratulate them on the milestone of receiving tenure.”

    This year’s newly tenured engineering faculty include:

    Michael Birnbaum, Class of 1956 Career Development Professor, associate professor of biological engineering, and faculty member at the Koch Institute for Integrative Cancer Research at MIT, works on understanding and manipulating immune recognition in cancer and infections. By using a variety of techniques to study the antigen recognition of T cells, he and his team aim to develop the next generation of immunotherapies.  
    Tamara Broderick, associate professor of electrical engineering and computer science and member of the MIT Laboratory for Information and Decision Systems (LIDS) and the MIT Institute for Data, Systems, and Society (IDSS), works to provide fast and reliable quantification of uncertainty and robustness in modern data analysis procedures. Broderick and her research group develop data analysis tools with applications in fields including genetics, economics, and assistive technology. 
    Tal Cohen, associate professor of civil and environmental engineering and mechanical engineering, uses nonlinear solid mechanics to understand how materials behave under extreme conditions. By studying material instabilities, extreme dynamic loading conditions, growth, and chemical coupling, Cohen and her team combine theoretical models and experiments to shape our understanding of the observed phenomena and apply those insights in the design and characterization of material systems. 
    Betar Gallant, Class of 1922 Career Development Professor and associate professor of mechanical engineering, develops advanced materials and chemistries for next-generation lithium-ion and lithium primary batteries and electrochemical carbon dioxide mitigation technologies. Her group’s work could lead to higher-energy and more sustainable batteries for electric vehicles, longer-lasting implantable medical devices, and new methods of carbon capture and conversion. 
    Rafael Jaramillo, Thomas Lord Career Development Professor and associate professor of materials science and engineering, studies the synthesis, properties, and applications of electronic materials, particularly chalcogenide compound semiconductors. His work has applications in microelectronics, integrated photonics, telecommunications, and photovoltaics. 
    Benedetto Marelli, associate professor of civil and environmental engineering, conducts research on the synthesis, assembly, and nanomanufacturing of structural biopolymers. He and his research team develop biomaterials for applications in agriculture, food security, and food safety. 
    Ellen Roche, Latham Family Career Development Professor, an associate professor of mechanical engineering, and a core faculty member of IMES, designs and develops implantable, biomimetic therapeutic devices and soft robotics that mechanically assist and repair tissue, deliver therapies, and enable enhanced preclinical testing. Her devices have a wide range of applications in human health, including cardiovascular and respiratory disease. 
    Serguei Saavedra, associate professor of civil and environmental engineering, uses systems thinking, synthesis, and mathematical modeling to study the persistence of ecological systems under changing environments. His theoretical research is used to develop hypotheses and corroborate predictions of how ecological systems respond to climate change. 
    Justin Solomon, associate professor of electrical engineering and computer science and member of the MIT Computer Science and Artificial Intelligence Laboratory and MIT Center for Computational Science and Engineering, works at the intersection of geometry, large-scale optimization, computer graphics, and machine learning. His research has diverse applications in machine learning, computer graphics, and geometric data processing.