More stories

  • in

    From physics to generative AI: An AI model for advanced pattern generation

    Generative AI, which is currently riding a crest of popular discourse, promises a world where the simple transforms into the complex — where a simple distribution evolves into intricate patterns of images, sounds, or text, rendering the artificial startlingly real. 

    The realms of imagination no longer remain as mere abstractions, as researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have brought an innovative AI model to life. Their new technology integrates two seemingly unrelated physical laws that underpin the best-performing generative models to date: diffusion, which typically illustrates the random motion of elements, like heat permeating a room or a gas expanding into space, and Poisson Flow, which draws on the principles governing the activity of electric charges.

    This harmonious blend has resulted in superior performance in generating new images, outpacing existing state-of-the-art models. Since its inception, the “Poisson Flow Generative Model ++” (PFGM++) has found potential applications in various fields, from antibody and RNA sequence generation to audio production and graph generation.

    The model can generate complex patterns, like creating realistic images or mimicking real-world processes. PFGM++ builds off of PFGM, the team’s work from the prior year. PFGM takes inspiration from the means behind the mathematical equation known as the “Poisson” equation, and then applies it to the data the model tries to learn from. To do this, the team used a clever trick: They added an extra dimension to their model’s “space,” kind of like going from a 2D sketch to a 3D model. This extra dimension gives more room for maneuvering, places the data in a larger context, and helps one approach the data from all directions when generating new samples. 

    “PFGM++ is an example of the kinds of AI advances that can be driven through interdisciplinary collaborations between physicists and computer scientists,” says Jesse Thaler, theoretical particle physicist in MIT’s Laboratory for Nuclear Science’s Center for Theoretical Physics and director of the National Science Foundation’s AI Institute for Artificial Intelligence and Fundamental Interactions (NSF AI IAIFI), who was not involved in the work. “In recent years, AI-based generative models have yielded numerous eye-popping results, from photorealistic images to lucid streams of text. Remarkably, some of the most powerful generative models are grounded in time-tested concepts from physics, such as symmetries and thermodynamics. PFGM++ takes a century-old idea from fundamental physics — that there might be extra dimensions of space-time — and turns it into a powerful and robust tool to generate synthetic but realistic datasets. I’m thrilled to see the myriad of ways ‘physics intelligence’ is transforming the field of artificial intelligence.”

    The underlying mechanism of PFGM isn’t as complex as it might sound. The researchers compared the data points to tiny electric charges placed on a flat plane in a dimensionally expanded world. These charges produce an “electric field,” with the charges looking to move upwards along the field lines into an extra dimension and consequently forming a uniform distribution on a vast imaginary hemisphere. The generation process is like rewinding a videotape: starting with a uniformly distributed set of charges on the hemisphere and tracking their journey back to the flat plane along the electric lines, they align to match the original data distribution. This intriguing process allows the neural model to learn the electric field, and generate new data that mirrors the original. 

    The PFGM++ model extends the electric field in PFGM to an intricate, higher-dimensional framework. When you keep expanding these dimensions, something unexpected happens — the model starts resembling another important class of models, the diffusion models. This work is all about finding the right balance. The PFGM and diffusion models sit at opposite ends of a spectrum: one is robust but complex to handle, the other simpler but less sturdy. The PFGM++ model offers a sweet spot, striking a balance between robustness and ease of use. This innovation paves the way for more efficient image and pattern generation, marking a significant step forward in technology. Along with adjustable dimensions, the researchers proposed a new training method that enables more efficient learning of the electric field. 

    To bring this theory to life, the team resolved a pair of differential equations detailing these charges’ motion within the electric field. They evaluated the performance using the Frechet Inception Distance (FID) score, a widely accepted metric that assesses the quality of images generated by the model in comparison to the real ones. PFGM++ further showcases a higher resistance to errors and robustness toward the step size in the differential equations.

    Looking ahead, they aim to refine certain aspects of the model, particularly in systematic ways to identify the “sweet spot” value of D tailored for specific data, architectures, and tasks by analyzing the behavior of estimation errors of neural networks. They also plan to apply the PFGM++ to the modern large-scale text-to-image/text-to-video generation.

    “Diffusion models have become a critical driving force behind the revolution in generative AI,” says Yang Song, research scientist at OpenAI. “PFGM++ presents a powerful generalization of diffusion models, allowing users to generate higher-quality images by improving the robustness of image generation against perturbations and learning errors. Furthermore, PFGM++ uncovers a surprising connection between electrostatics and diffusion models, providing new theoretical insights into diffusion model research.”

    “Poisson Flow Generative Models do not only rely on an elegant physics-inspired formulation based on electrostatics, but they also offer state-of-the-art generative modeling performance in practice,” says NVIDIA Senior Research Scientist Karsten Kreis, who was not involved in the work. “They even outperform the popular diffusion models, which currently dominate the literature. This makes them a very powerful generative modeling tool, and I envision their application in diverse areas, ranging from digital content creation to generative drug discovery. More generally, I believe that the exploration of further physics-inspired generative modeling frameworks holds great promise for the future and that Poisson Flow Generative Models are only the beginning.”

    Authors on a paper about this work include three MIT graduate students: Yilun Xu of the Department of Electrical Engineering and Computer Science (EECS) and CSAIL, Ziming Liu of the Department of Physics and the NSF AI IAIFI, and Shangyuan Tong of EECS and CSAIL, as well as Google Senior Research Scientist Yonglong Tian PhD ’23. MIT professors Max Tegmark and Tommi Jaakkola advised the research.

    The team was supported by the MIT-DSTA Singapore collaboration, the MIT-IBM Grand Challenge project, National Science Foundation grants, The Casey and Family Foundation, the Foundational Questions Institute, the Rothberg Family Fund for Cognitive Science, and the ML for Pharmaceutical Discovery and Synthesis Consortium. Their work was presented at the International Conference on Machine Learning this summer. More

  • in

    3 Questions: A new PhD program from the Center for Computational Science and Engineering

    This fall, the Center for Computational Science and Engineering (CCSE), an academic unit in the MIT Schwarzman College of Computing, is introducing a new standalone PhD degree program that will enable students to pursue research in cross-cutting methodological aspects of computational science and engineering. The launch follows approval of the center’s degree program proposal at the May 2023 Institute faculty meeting.

    Doctoral-level graduate study in computational science and engineering (CSE) at MIT has, for the past decade, been offered through an interdisciplinary program in which CSE students are admitted to one of eight participating academic departments in the School of Engineering or School of Science. While this model adds a strong disciplinary component to students’ education, the rapid growth of the CSE field and the establishment of the MIT Schwarzman College of Computing have prompted an exciting expansion of MIT’s graduate-level offerings in computation.

    The new degree, offered by the college, will run alongside MIT’s existing interdisciplinary offerings in CSE, complementing these doctoral training programs and preparing students to contribute to the leading edge of the field. Here, CCSE co-directors Youssef Marzouk and Nicolas Hadjiconstantinou discuss the standalone program and how they expect it to elevate the visibility and impact of CSE research and education at MIT.

    Q: What is computational science and engineering?

    Marzouk: Computational science and engineering focuses on the development and analysis of state-of-the-art methods for computation and their innovative application to problems of science and engineering interest. It has intellectual foundations in applied mathematics, statistics, and computer science, and touches the full range of science and engineering disciplines. Yet, it synthesizes these foundations into a discipline of its own — one that links the digital and physical worlds. It’s an exciting and evolving multidisciplinary field.

    Hadjiconstantinou: Examples of CSE research happening at MIT include modeling and simulation techniques, the underlying computational mathematics, and data-driven modeling of physical systems. Computational statistics and scientific machine learning have become prominent threads within CSE, joining high-performance computing, mathematically-oriented programming languages, and their broader links to algorithms and software. Application domains include energy, environment and climate, materials, health, transportation, autonomy, and aerospace, among others. Some of our researchers focus on general and widely applicable methodology, while others choose to focus on methods and algorithms motivated by a specific domain of application.

    Q: What was the motivation behind creating a standalone PhD program?

    Marzouk: The new degree focuses on a particular class of students whose background and interests are primarily in CSE methodology, in a manner that cuts across the disciplinary research structure represented by our current “with-departments” degree program. There is a strong research demand for such methodologically-focused students among CCSE faculty and MIT faculty in general. Our objective is to create a targeted, coherent degree program in this field that, alongside our other thriving CSE offerings, will create the leading environment for top CSE students worldwide.

    Hadjiconstantinou: One of CCSE’s most important functions is to recruit exceptional students who are trained in and want to work in computational science and engineering. Experience with our CSE master’s program suggests that students with a strong background and interests in the discipline prefer to apply to a pure CSE program for their graduate studies. The standalone degree aims to bring these students to MIT and make them available to faculty across the Institute.

    Q: How will this impact computing education and research at MIT? 

    Hadjiconstantinou: We believe that offering a standalone PhD program in CSE alongside the existing “with-departments” programs will significantly strengthen MIT’s graduate programs in computing. In particular, it will strengthen the methodological core of CSE research and education at MIT, while continuing to support the disciplinary-flavored CSE work taking place in our participating departments, which include Aeronautics and Astronautics; Chemical Engineering; Civil and Environmental Engineering; Materials Science and Engineering; Mechanical Engineering; Nuclear Science and Engineering; Earth, Atmospheric and Planetary Sciences; and Mathematics. Together, these programs will create a stronger CSE student cohort and facilitate deeper exchanges between the college and other units at MIT.

    Marzouk: In a broader sense, the new program is designed to help realize one of the key opportunities presented by the college, which is to create a richer variety of graduate degrees in computation and to involve as many faculty and units in these educational endeavors as possible. The standalone CSE PhD will join other distinguished doctoral programs of the college — such as the Department of Electrical Engineering and Computer Science PhD; the Operations Research Center PhD; and the Interdisciplinary Doctoral Program in Statistics and the Social and Engineering Systems PhD within the Institute for Data, Systems, and Society — and grow in a way that is informed by them. The confluence of these academic programs, and natural synergies among them, will make MIT quite unique. More

  • in

    On the hunt for sustainable materials

    By the time she started high school, Avni Singhal had attended six different schools in a variety of settings, from a traditional public school to a self-paced program. The transitions opened her eyes to how widely educational environments can vary, and made her think about that impact on students.

    “Experiencing so many different types of educational systems exposed me to different ways of looking at things and how that shapes people’s worldviews,” says Singhal.

    Now a fourth-year PhD student in the Department of Materials Science and Engineering, Singhal is still thinking about increasing opportunities for her fellow students, while also pursuing her research. She devotes herself to both developing sustainable materials and improving the graduate experience in her department.

    She recently completed her two-year term as a student representative on the department’s graduate studies committee. In this role, she helped revamp the communication around the qualifying exams and introducing student input to the faculty search process.

    “It’s given me a lot of insight into how our department works,” says Singhal. “It’s a chance to get to know faculty, bring up issues that students experience, and work on changing things that we think could be improved.”

    At the same time, Singhal uses atomistic simulations to model material properties, with an eye toward sustainability. She is a part of the Learning Matter Lab, a group that merges data science tools with engineering and physics-based simulation to better design and understand materials. As part of a computational group, Singhal has worked on a range of projects in collaboration with other labs that are looking to combine computing with other disciplines. Some of this work is sponsored by the MIT Climate and Sustainability Consortium, which facilitates connections across MIT labs and industry.

    Joining the Learning Matter Lab was a step out of Singhal’s comfort zone. She arrived at MIT from the University of California at Berkeley with a joint degree in materials science and bioengineering, as well as a degree in electrical engineering and computer science.

    “I was generally interested in doing work on environment-related applications,” says Singhal. “I was pretty hesitant at first to switch entirely to computation because it’s a very different type of lifestyle of research than what I was doing before.”

    Singhal has taken the challenge in stride, contributing to projects including improving carbon capture molecules and developing new deconstructable, degradable plastics. Not only does Singhal have to understand the technical details of her own work, she also needs to understand the big picture and how to best wield the expertise of her collaborators.

    “When I came in, I was very wide-eyed, thinking computation can do everything because I had never done it before,” says Singhal. “It’s that curve where you know a little bit about something, and you think it can do everything. And then as you learn more, you learn where it can and can’t help us, where it can be valuable, and how to figure out in what part of a project it’s useful.”

    Singhal applies a similarly critical lens when thinking about graduate school as a whole. She notes that access to information and resources is often the main factor determining who enters selective educational programs, and that such access becomes increasingly limited at the graduate level.

    “I realized just how much applying is a function of knowing how to do it,” says Singhal, who co-organized and volunteers with the DMSE Application Assistance Program. The program matches prospective applicants with current students to give feedback on their application materials and provide insight into what it’s like attending MIT. Some of the first students Singhal mentored through the program are now also participants as well.

    “The further you get in your educational career, the more you realize how much assistance you got along the way to get where you are,” says Singhal. “That happens at every stage.”

    Looking toward the future, Singhal wants to continue to pursue research with a sustainability impact. She also wants to continue mentoring in some capacity but isn’t in a rush to figure out exactly what that will look like.

    “Grad school doesn’t mean I have to do one thing. I can stay open to all the possibilities of what comes next.”  More

  • in

    Meet the 2023-24 Accenture Fellows

    The MIT and Accenture Convergence Initiative for Industry and Technology has selected five new research fellows for 2023-24. Now in its third year, the initiative underscores the ways in which industry and research can collaborate to spur technological innovation.

    Through its partnership with the School of Engineering, Accenture provides five annual fellowships awarded to graduate students with the aim of generating powerful new insights on the convergence of business and technology with the potential to transform society. The 2023-24 fellows will conduct research in areas including artificial intelligence, sustainability, and robotics.

    The 2023-24 Accenture Fellows are:

    Yiyue Luo

    Yiyue Luo is a PhD candidate who is developing innovative integrations of tactile sensing and haptics, interactive sensing and AI, digital fabrication, and smart wearables. Her work takes advantage of recent advances in digital manufacturing and AI, and the convergence in advanced sensing and actuation mechanisms, scalable digital manufacturing, and emerging computational techniques, with the goal of creating novel sensing and actuation devices that revolutionize interactions between people and their environments. In past projects, Luo has developed tactile sensing apparel including socks, gloves, and vests, as well as a workflow for computationally designing and digitally fabricating soft textiles-based pneumatic actuators. With the support of an Accenture Fellowship, she will advance her work of combining sensing and actuating devices and explore the development of haptic devices that simulate tactile cues captured by tactile sensors. Her ultimate aim is to build a scalable, textile-based, closed-loop human-machine interface. Luo’s research holds exciting potential to advance ground-breaking applications for smart textiles, health care, artificial and virtual reality, human-machine interactions, and robotics.

    Zanele Munyikwa is a PhD candidate whose research explores foundation models, a class of models that forms the basis of transformative general-purpose technologies (GPTs) such as GPT4. An Accenture Fellowship will enable Munyikwa to conduct research aimed at illuminating the current and potential impact of foundation models (including large language models) on work and tasks common to “high-skilled” knowledge workers in industries such as marketing, legal services, and medicine, in which foundation models are expected to have significant economic and social impacts. A primary goal of her project is to observe the impact of AI augmentation on tasks like copywriting and long-form writing. A second aim is to explore two primary ways that foundation models are driving the convergence of creative and technological industries, namely: reducing the cost of content generation and enabling the development of tools and platforms for education and training. Munyikwa’s work has important implications for the use of foundation models in many fields, from health care and education to legal services, business, and technology.

    Michelle Vaccaro is a PhD candidate in social engineering systems whose research explores human-AI collaboration with the goals of developing a deeper understanding of AI-based technologies (including ChatGPT and DALL-E), evaluating their performance and evolution, and steering their development toward societally beneficial applications, like climate change mitigation. An Accenture Fellowship will support Vaccaro’s current work toward two key objectives: identifying synergies between humans and AI-based software to help design human-AI systems that address persistent problems better than existing approaches; and investigating applications of human-AI collaboration for forecasting technological change, specifically for renewable energy technologies. By integrating the historically distinct domains of AI, systems engineering, and cognitive science with a wide range of industries, technical fields, and social applications, Vaccaro’s work has the potential to advance individual and collective productivity and creativity in all these areas.

    Chonghuan Wang is a PhD candidate in computational science and engineering whose research employs statistical learning, econometrics theory, and experimental design to create efficient, reliable, and sustainable field experiments in various domains. In his current work, Wang is applying statistical learning techniques such as online learning and bandit theory to test the effectiveness of new treatments, vaccinations, and health care interventions. With the support of an Accenture Fellowship, he will design experiments with the specific aim of understanding the trade-off between the loss of a patient’s welfare and the accuracy of estimating the treatment effect. The results of this research could help to save lives and contain disease outbreaks during pandemics like Covid-19. The benefits of enhanced experiment design and the collection of high-quality data extend well beyond health care; for example, these tools could help businesses optimize user engagement, test pricing impacts, and increase the usage of platforms and services. Wang’s research holds exciting potential to harness statistical learning, econometrics theory, and experimental design in support of strong businesses and the greater social good.

    Aaron Michael West Jr. is a PhD candidate whose research seeks to enhance our knowledge of human motor control and robotics. His work aims to advance rehabilitation technologies and prosthetic devices, as well as improve robot dexterity. His previous work has yielded valuable insights into the human ability to extract information solely from visual displays. Specifically, he demonstrated humans’ ability to estimate stiffness based solely on the visual observation of motion. These insights could advance the development of software applications with the same capability (e.g., using machine learning methods applied to video data) and may enable roboticists to develop enhanced motion control such that a robot’s intention is perceivable by humans. An Accenture Fellowship will enable West to continue this work, as well as new investigations into the functionality of the human hand to aid in the design of a prosthetic hand that better replicates human dexterity. By advancing understandings of human bio- and neuro-mechanics, West’s work has the potential to support major advances in robotics and rehabilitation technologies, with profound impacts on human health and well-being. More

  • in

    How an archeological approach can help leverage biased data in AI to improve medicine

    The classic computer science adage “garbage in, garbage out” lacks nuance when it comes to understanding biased medical data, argue computer science and bioethics professors from MIT, Johns Hopkins University, and the Alan Turing Institute in a new opinion piece published in a recent edition of the New England Journal of Medicine (NEJM). The rising popularity of artificial intelligence has brought increased scrutiny to the matter of biased AI models resulting in algorithmic discrimination, which the White House Office of Science and Technology identified as a key issue in their recent Blueprint for an AI Bill of Rights. 

    When encountering biased data, particularly for AI models used in medical settings, the typical response is to either collect more data from underrepresented groups or generate synthetic data making up for missing parts to ensure that the model performs equally well across an array of patient populations. But the authors argue that this technical approach should be augmented with a sociotechnical perspective that takes both historical and current social factors into account. By doing so, researchers can be more effective in addressing bias in public health. 

    “The three of us had been discussing the ways in which we often treat issues with data from a machine learning perspective as irritations that need to be managed with a technical solution,” recalls co-author Marzyeh Ghassemi, an assistant professor in electrical engineering and computer science and an affiliate of the Abdul Latif Jameel Clinic for Machine Learning in Health (Jameel Clinic), the Computer Science and Artificial Intelligence Laboratory (CSAIL), and Institute of Medical Engineering and Science (IMES). “We had used analogies of data as an artifact that gives a partial view of past practices, or a cracked mirror holding up a reflection. In both cases the information is perhaps not entirely accurate or favorable: Maybe we think that we behave in certain ways as a society — but when you actually look at the data, it tells a different story. We might not like what that story is, but once you unearth an understanding of the past you can move forward and take steps to address poor practices.” 

    Data as artifact 

    In the paper, titled “Considering Biased Data as Informative Artifacts in AI-Assisted Health Care,” Ghassemi, Kadija Ferryman, and Maxine Mackintosh make the case for viewing biased clinical data as “artifacts” in the same way anthropologists or archeologists would view physical objects: pieces of civilization-revealing practices, belief systems, and cultural values — in the case of the paper, specifically those that have led to existing inequities in the health care system. 

    For example, a 2019 study showed that an algorithm widely considered to be an industry standard used health-care expenditures as an indicator of need, leading to the erroneous conclusion that sicker Black patients require the same level of care as healthier white patients. What researchers found was algorithmic discrimination failing to account for unequal access to care.  

    In this instance, rather than viewing biased datasets or lack of data as problems that only require disposal or fixing, Ghassemi and her colleagues recommend the “artifacts” approach as a way to raise awareness around social and historical elements influencing how data are collected and alternative approaches to clinical AI development. 

    “If the goal of your model is deployment in a clinical setting, you should engage a bioethicist or a clinician with appropriate training reasonably early on in problem formulation,” says Ghassemi. “As computer scientists, we often don’t have a complete picture of the different social and historical factors that have gone into creating data that we’ll be using. We need expertise in discerning when models generalized from existing data may not work well for specific subgroups.” 

    When more data can actually harm performance 

    The authors acknowledge that one of the more challenging aspects of implementing an artifact-based approach is being able to assess whether data have been racially corrected: i.e., using white, male bodies as the conventional standard that other bodies are measured against. The opinion piece cites an example from the Chronic Kidney Disease Collaboration in 2021, which developed a new equation to measure kidney function because the old equation had previously been “corrected” under the blanket assumption that Black people have higher muscle mass. Ghassemi says that researchers should be prepared to investigate race-based correction as part of the research process. 

    In another recent paper accepted to this year’s International Conference on Machine Learning co-authored by Ghassemi’s PhD student Vinith Suriyakumar and University of California at San Diego Assistant Professor Berk Ustun, the researchers found that assuming the inclusion of personalized attributes like self-reported race improve the performance of ML models can actually lead to worse risk scores, models, and metrics for minority and minoritized populations.  

    “There’s no single right solution for whether or not to include self-reported race in a clinical risk score. Self-reported race is a social construct that is both a proxy for other information, and deeply proxied itself in other medical data. The solution needs to fit the evidence,” explains Ghassemi. 

    How to move forward 

    This is not to say that biased datasets should be enshrined, or biased algorithms don’t require fixing — quality training data is still key to developing safe, high-performance clinical AI models, and the NEJM piece highlights the role of the National Institutes of Health (NIH) in driving ethical practices.  

    “Generating high-quality, ethically sourced datasets is crucial for enabling the use of next-generation AI technologies that transform how we do research,” NIH acting director Lawrence Tabak stated in a press release when the NIH announced its $130 million Bridge2AI Program last year. Ghassemi agrees, pointing out that the NIH has “prioritized data collection in ethical ways that cover information we have not previously emphasized the value of in human health — such as environmental factors and social determinants. I’m very excited about their prioritization of, and strong investments towards, achieving meaningful health outcomes.” 

    Elaine Nsoesie, an associate professor at the Boston University of Public Health, believes there are many potential benefits to treating biased datasets as artifacts rather than garbage, starting with the focus on context. “Biases present in a dataset collected for lung cancer patients in a hospital in Uganda might be different from a dataset collected in the U.S. for the same patient population,” she explains. “In considering local context, we can train algorithms to better serve specific populations.” Nsoesie says that understanding the historical and contemporary factors shaping a dataset can make it easier to identify discriminatory practices that might be coded in algorithms or systems in ways that are not immediately obvious. She also notes that an artifact-based approach could lead to the development of new policies and structures ensuring that the root causes of bias in a particular dataset are eliminated. 

    “People often tell me that they are very afraid of AI, especially in health. They’ll say, ‘I’m really scared of an AI misdiagnosing me,’ or ‘I’m concerned it will treat me poorly,’” Ghassemi says. “I tell them, you shouldn’t be scared of some hypothetical AI in health tomorrow, you should be scared of what health is right now. If we take a narrow technical view of the data we extract from systems, we could naively replicate poor practices. That’s not the only option — realizing there is a problem is our first step towards a larger opportunity.”  More

  • in

    Helping computer vision and language models understand what they see

    Powerful machine-learning algorithms known as vision and language models, which learn to match text with images, have shown remarkable results when asked to generate captions or summarize videos.

    While these models excel at identifying objects, they often struggle to understand concepts, like object attributes or the arrangement of items in a scene. For instance, a vision and language model might recognize the cup and table in an image, but fail to grasp that the cup is sitting on the table.

    Researchers from MIT, the MIT-IBM Watson AI Lab, and elsewhere have demonstrated a new technique that utilizes computer-generated data to help vision and language models overcome this shortcoming.

    The researchers created a synthetic dataset of images that depict a wide range of scenarios, object arrangements, and human actions, coupled with detailed text descriptions. They used this annotated dataset to “fix” vision and language models so they can learn concepts more effectively. Their technique ensures these models can still make accurate predictions when they see real images.

    When they tested models on concept understanding, the researchers found that their technique boosted accuracy by up to 10 percent. This could improve systems that automatically caption videos or enhance models that provide natural language answers to questions about images, with applications in fields like e-commerce or health care.

    “With this work, we are going beyond nouns in the sense that we are going beyond just the names of objects to more of the semantic concept of an object and everything around it. Our idea was that, when a machine-learning model sees objects in many different arrangements, it will have a better idea of how arrangement matters in a scene,” says Khaled Shehada, a graduate student in the Department of Electrical Engineering and Computer Science and co-author of a paper on this technique.

    Shehada wrote the paper with lead author Paola Cascante-Bonilla, a computer science graduate student at Rice University; Aude Oliva, director of strategic industry engagement at the MIT Schwarzman College of Computing, MIT director of the MIT-IBM Watson AI Lab, and a senior research scientist in the Computer Science and Artificial Intelligence Laboratory (CSAIL); senior author Leonid Karlinsky, a research staff member in the MIT-IBM Watson AI Lab; and others at MIT, the MIT-IBM Watson AI Lab, Georgia Tech, Rice University, École des Ponts, Weizmann Institute of Science, and IBM Research. The paper will be presented at the International Conference on Computer Vision.

    Focusing on objects

    Vision and language models typically learn to identify objects in a scene, and can end up ignoring object attributes, such as color and size, or positional relationships, such as which object is on top of another object.

    This is due to the method with which these models are often trained, known as contrastive learning. This training method involves forcing a model to predict the correspondence between images and text. When comparing natural images, the objects in each scene tend to cause the most striking differences. (Perhaps one image shows a horse in a field while the second shows a sailboat on the water.)

    “Every image could be uniquely defined by the objects in the image. So, when you do contrastive learning, just focusing on the nouns and objects would solve the problem. Why would the model do anything differently?” says Karlinsky.

    The researchers sought to mitigate this problem by using synthetic data to fine-tune a vision and language model. The fine-tuning process involves tweaking a model that has already been trained to improve its performance on a specific task.

    They used a computer to automatically create synthetic videos with diverse 3D environments and objects, such as furniture and luggage, and added human avatars that interacted with the objects.

    Using individual frames of these videos, they generated nearly 800,000 photorealistic images, and then paired each with a detailed caption. The researchers developed a methodology for annotating every aspect of the image to capture object attributes, positional relationships, and human-object interactions clearly and consistently in dense captions.

    Because the researchers created the images, they could control the appearance and position of objects, as well as the gender, clothing, poses, and actions of the human avatars.

    “Synthetic data allows a lot of diversity. With real images, you might not have a lot of elephants in a room, but with synthetic data, you could actually have a pink elephant in a room with a human, if you want,” Cascante-Bonilla says.

    Synthetic data have other advantages, too. They are cheaper to generate than real data, yet the images are highly photorealistic. They also preserve privacy because no real humans are shown in the images. And, because data are produced automatically by a computer, they can be generated quickly in massive quantities.

    By using different camera viewpoints, or slightly changing the positions or attributes of objects, the researchers created a dataset with a far wider variety of scenarios than one would find in a natural dataset.

    Fine-tune, but don’t forget

    However, when one fine-tunes a model with synthetic data, there is a risk that model might “forget” what it learned when it was originally trained with real data.

    The researchers employed a few techniques to prevent this problem, such as adjusting the synthetic data so colors, lighting, and shadows more closely match those found in natural images. They also made adjustments to the model’s inner-workings after fine-tuning to further reduce any forgetfulness.

    Their synthetic dataset and fine-tuning strategy improved the ability of popular vision and language models to accurately recognize concepts by up to 10 percent. At the same time, the models did not forget what they had already learned.

    Now that they have shown how synthetic data can be used to solve this problem, the researchers want to identify ways to improve the visual quality and diversity of these data, as well as the underlying physics that makes synthetic scenes look realistic. In addition, they plan to test the limits of scalability, and investigate whether model improvement starts to plateau with larger and more diverse synthetic datasets.

    This research is funded, in part, by the U.S. Defense Advanced Research Projects Agency, the National Science Foundation, and the MIT-IBM Watson AI Lab. More

  • in

    Fast-tracking fusion energy’s arrival with AI and accessibility

    As the impacts of climate change continue to grow, so does interest in fusion’s potential as a clean energy source. While fusion reactions have been studied in laboratories since the 1930s, there are still many critical questions scientists must answer to make fusion power a reality, and time is of the essence. As part of their strategy to accelerate fusion energy’s arrival and reach carbon neutrality by 2050, the U.S. Department of Energy (DoE) has announced new funding for a project led by researchers at MIT’s Plasma Science and Fusion Center (PSFC) and four collaborating institutions.

    Cristina Rea, a research scientist and group leader at the PSFC, will serve as the primary investigator for the newly funded three-year collaboration to pilot the integration of fusion data into a system that can be read by AI-powered tools. The PSFC, together with scientists from William & Mary, the University of Wisconsin at Madison, Auburn University, and the nonprofit HDF Group, plan to create a holistic fusion data platform, the elements of which could offer unprecedented access for researchers, especially underrepresented students. The project aims to encourage diverse participation in fusion and data science, both in academia and the workforce, through outreach programs led by the group’s co-investigators, of whom four out of five are women. 

    The DoE’s award, part of a $29 million funding package for seven projects across 19 institutions, will support the group’s efforts to distribute data produced by fusion devices like the PSFC’s Alcator C-Mod, a donut-shaped “tokamak” that utilized powerful magnets to control and confine fusion reactions. Alcator C-Mod operated from 1991 to 2016 and its data are still being studied, thanks in part to the PSFC’s commitment to the free exchange of knowledge.

    Currently, there are nearly 50 public experimental magnetic confinement-type fusion devices; however, both historical and current data from these devices can be difficult to access. Some fusion databases require signing user agreements, and not all data are catalogued and organized the same way. Moreover, it can be difficult to leverage machine learning, a class of AI tools, for data analysis and to enable scientific discovery without time-consuming data reorganization. The result is fewer scientists working on fusion, greater barriers to discovery, and a bottleneck in harnessing AI to accelerate progress.

    The project’s proposed data platform addresses technical barriers by being FAIR — Findable, Interoperable, Accessible, Reusable — and by adhering to UNESCO’s Open Science (OS) recommendations to improve the transparency and inclusivity of science; all of the researchers’ deliverables will adhere to FAIR and OS principles, as required by the DoE. The platform’s databases will be built using MDSplusML, an upgraded version of the MDSplus open-source software developed by PSFC researchers in the 1980s to catalogue the results of Alcator C-Mod’s experiments. Today, nearly 40 fusion research institutes use MDSplus to store and provide external access to their fusion data. The release of MDSplusML aims to continue that legacy of open collaboration.

    The researchers intend to address barriers to participation for women and disadvantaged groups not only by improving general access to fusion data, but also through a subsidized summer school that will focus on topics at the intersection of fusion and machine learning, which will be held at William & Mary for the next three years.

    Of the importance of their research, Rea says, “This project is about responding to the fusion community’s needs and setting ourselves up for success. Scientific advancements in fusion are enabled via multidisciplinary collaboration and cross-pollination, so accessibility is absolutely essential. I think we all understand now that diverse communities have more diverse ideas, and they allow faster problem-solving.”

    The collaboration’s work also aligns with vital areas of research identified in the International Atomic Energy Agency’s “AI for Fusion” Coordinated Research Project (CRP). Rea was selected as the technical coordinator for the IAEA’s CRP emphasizing community engagement and knowledge access to accelerate fusion research and development. In a letter of support written for the group’s proposed project, the IAEA stated that, “the work [the researchers] will carry out […] will be beneficial not only to our CRP but also to the international fusion community in large.”

    PSFC Director and Hitachi America Professor of Engineering Dennis Whyte adds, “I am thrilled to see PSFC and our collaborators be at the forefront of applying new AI tools while simultaneously encouraging and enabling extraction of critical data from our experiments.”

    “Having the opportunity to lead such an important project is extremely meaningful, and I feel a responsibility to show that women are leaders in STEM,” says Rea. “We have an incredible team, strongly motivated to improve our fusion ecosystem and to contribute to making fusion energy a reality.” More

  • in

    Supporting sustainability, digital health, and the future of work

    The MIT and Accenture Convergence Initiative for Industry and Technology has selected three new research projects that will receive support from the initiative. The research projects aim to accelerate progress in meeting complex societal needs through new business convergence insights in technology and innovation.

    Established in MIT’s School of Engineering and now in its third year, the MIT and Accenture Convergence Initiative is furthering its mission to bring together technological experts from across business and academia to share insights and learn from one another. Recently, Thomas W. Malone, the Patrick J. McGovern (1959) Professor of Management, joined the initiative as its first-ever faculty lead. The research projects relate to three of the initiative’s key focus areas: sustainability, digital health, and the future of work.

    “The solutions these research teams are developing have the potential to have tremendous impact,” says Anantha Chandrakasan, dean of the School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “They embody the initiative’s focus on advancing data-driven research that addresses technology and industry convergence.”

    “The convergence of science and technology driven by advancements in generative AI, digital twins, quantum computing, and other technologies makes this an especially exciting time for Accenture and MIT to be undertaking this joint research,” says Kenneth Munie, senior managing director at Accenture Strategy, Life Sciences. “Our three new research projects focusing on sustainability, digital health, and the future of work have the potential to help guide and shape future innovations that will benefit the way we work and live.”

    The MIT and Accenture Convergence Initiative charter project researchers are described below.

    Accelerating the journey to net zero with industrial clusters

    Jessika Trancik is a professor at the Institute for Data, Systems, and Society (IDSS). Trancik’s research examines the dynamic costs, performance, and environmental impacts of energy systems to inform climate policy and accelerate beneficial and equitable technology innovation. Trancik’s project aims to identify how industrial clusters can enable companies to derive greater value from decarbonization, potentially making companies more willing to invest in the clean energy transition.

    To meet the ambitious climate goals that have been set by countries around the world, rising greenhouse gas emissions trends must be rapidly reversed. Industrial clusters — geographically co-located or otherwise-aligned groups of companies representing one or more industries — account for a significant portion of greenhouse gas emissions globally. With major energy consumers “clustered” in proximity, industrial clusters provide a potential platform to scale low-carbon solutions by enabling the aggregation of demand and the coordinated investment in physical energy supply infrastructure.

    In addition to Trancik, the research team working on this project will include Aliza Khurram, a postdoc in IDSS; Micah Ziegler, an IDSS research scientist; Melissa Stark, global energy transition services lead at Accenture; Laura Sanderfer, strategy consulting manager at Accenture; and Maria De Miguel, strategy senior analyst at Accenture.

    Eliminating childhood obesity

    Anette “Peko” Hosoi is the Neil and Jane Pappalardo Professor of Mechanical Engineering. A common theme in her work is the fundamental study of shape, kinematic, and rheological optimization of biological systems with applications to the emergent field of soft robotics. Her project will use both data from existing studies and synthetic data to create a return-on-investment (ROI) calculator for childhood obesity interventions so that companies can identify earlier returns on their investment beyond reduced health-care costs.

    Childhood obesity is too prevalent to be solved by a single company, industry, drug, application, or program. In addition to the physical and emotional impact on children, society bears a cost through excess health care spending, lost workforce productivity, poor school performance, and increased family trauma. Meaningful solutions require multiple organizations, representing different parts of society, working together with a common understanding of the problem, the economic benefits, and the return on investment. ROI is particularly difficult to defend for any single organization because investment and return can be separated by many years and involve asymmetric investments, returns, and allocation of risk. Hosoi’s project will consider the incentives for a particular entity to invest in programs in order to reduce childhood obesity.

    Hosoi will be joined by graduate students Pragya Neupane and Rachael Kha, both of IDSS, as well a team from Accenture that includes Kenneth Munie, senior managing director at Accenture Strategy, Life Sciences; Kaveh Safavi, senior managing director in Accenture Health Industry; and Elizabeth Naik, global health and public service research lead.

    Generating innovative organizational configurations and algorithms for dealing with the problem of post-pandemic employment

    Thomas Malone is the Patrick J. McGovern (1959) Professor of Management at the MIT Sloan School of Management and the founding director of the MIT Center for Collective Intelligence. His research focuses on how new organizations can be designed to take advantage of the possibilities provided by information technology. Malone will be joined in this project by John Horton, the Richard S. Leghorn (1939) Career Development Professor at the MIT Sloan School of Management, whose research focuses on the intersection of labor economics, market design, and information systems. Malone and Horton’s project will look to reshape the future of work with the help of lessons learned in the wake of the pandemic.

    The Covid-19 pandemic has been a major disrupter of work and employment, and it is not at all obvious how governments, businesses, and other organizations should manage the transition to a desirable state of employment as the pandemic recedes. Using natural language processing algorithms such as GPT-4, this project will look to identify new ways that companies can use AI to better match applicants to necessary jobs, create new types of jobs, assess skill training needed, and identify interventions to help include women and other groups whose employment was disproportionately affected by the pandemic.

    In addition to Malone and Horton, the research team will include Rob Laubacher, associate director and research scientist at the MIT Center for Collective Intelligence, and Kathleen Kennedy, executive director at the MIT Center for Collective Intelligence and senior director at MIT Horizon. The team will also include Nitu Nivedita, managing director of artificial intelligence at Accenture, and Thomas Hancock, data science senior manager at Accenture. More