More stories

  • J-PAL North America and Results for America announce 18 collaborations with state and local governments

    J-PAL North America and Results for America have announced 18 new partnerships with state and local governments across the country through their Leveraging Evidence and Evaluation for Equitable Recovery (LEVER) programming, which launched in April of this year. 

    As state and local leaders leverage federal relief funding to invest in their communities, J-PAL North America and Results for America are providing in-depth support to agencies in using data, evaluation, and evidence to advance effective and equitable government programming for generations to come. The 18 new collaborators span the contiguous United States and represent a wide range of pressing and innovative uses of federal Covid-19 recovery funding.

    These partnerships are a key component of the LEVER program, run by J-PAL North America — a regional office of MIT’s Abdul Latif Jameel Poverty Action Lab (J-PAL) — and Results for America — a nonprofit organization that helps government agencies harness the power of evidence and data. Through 2024, LEVER will continue to provide a suite of resources, training, and evaluation design services to prepare state and local government agencies to rigorously evaluate their own programs and to harness existing evidence in developing programs and policies using federal recovery dollars.

    J-PAL North America is working with four leading government agencies following a call for proposals to the LEVER Evaluation Incubator in June. These agencies will work with J-PAL staff to design randomized evaluations to understand the causal impact of important programs that contribute to their jurisdictions’ recovery from Covid-19.
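
    In practice, a randomized evaluation compares outcomes between randomly assigned treatment and comparison groups, and the difference in average outcomes estimates the program’s causal effect. The sketch below is purely illustrative (the agencies’ actual designs, outcomes, and data are not described in this article); it computes a difference-in-means estimate with a 95 percent confidence interval from simulated data.

```python
import numpy as np
from scipy import stats

def average_treatment_effect(treated, control):
    """Difference-in-means estimate of the average treatment effect from a
    randomized evaluation, with a normal-approximation 95% confidence interval."""
    diff = treated.mean() - control.mean()
    se = np.sqrt(treated.var(ddof=1) / len(treated)
                 + control.var(ddof=1) / len(control))
    z = stats.norm.ppf(0.975)
    return diff, (diff - z * se, diff + z * se)

# hypothetical example: vaccination uptake with and without an outreach program
rng = np.random.default_rng(0)
treated = rng.binomial(1, 0.55, size=1_000)   # 55% uptake in the treatment group
control = rng.binomial(1, 0.50, size=1_000)   # 50% uptake in the comparison group
effect, ci = average_treatment_effect(treated, control)
print(f"estimated effect: {effect:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```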

    Connecticut’s Medicaid office, operating out of the state’s Department of Social Services, is working to improve vaccine access and awareness among youth. “Connecticut Medicaid is thrilled to work with J-PAL North America. The technical expertise and training that we receive will expand our knowledge during ‘testing and learning’ interventions that improve the health of our members,” says Gui Woolston, the director of Medicaid and Division of Health Services. 

    Athens-Clarke County Unified Government is invested in evaluating programming for youth development and violence prevention implemented by the Boys and Girls Club of Athens. Their goal is “to measure and transparently communicate program impact,” explains Paige Seago, the data and outcomes coordinator for the American Rescue Plan Act. “The ability to continually iterate and tailor programs to better meet community goals is crucial to long-term success.”

    The County of San Diego’s newly formed Office of Evaluation, Performance, and Analytics is evaluating a pilot program providing rental subsidies for older adults. “Randomized evaluation can help us understand if rent subsidies will help prevent seniors from becoming homeless and will give us useful information about how to move forward,” says Chief Evaluation Officer Ricardo Basurto-Dávila. 

    In King County, Washington, the Executive Climate Office is planning to evaluate efforts to increase equitable access to household energy efficiency programs. “Because of J-PAL’s support, we have confidence that we can reduce climate impacts and extend home electrification benefits to lower-income homeowners in King County — homeowners who otherwise may not have the ability to participate in the clean energy transition,” says King County Climate Director Marissa Aho.

    Fourteen additional state and local agencies are working with Results for America as part of the LEVER Training Sprint. Together, they will develop policies that catalyze sustainable evidence building within government. 

    Jurisdictions selected for the Training Sprint represent government leaders at the city, county, and state levels — all of whom are committed to creating an evaluation framework for policy that will prioritize evidence-based decision-making across the country. Over the course of 10 weeks, with access to tools and coaching, each team will develop an internal implementation policy by embedding key evaluation and evidence practices into their jurisdiction’s decision-making processes. Participants will finish the Training Sprint with a robust decision-making framework that translates their LEVER implementation policies into actionable planning guidance. 

    Government leaders will utilize the LEVER Training Sprint to build a culture of data and evidence focused on leveraging evaluation policies to invest in delivering tangible results for their residents. About their participation in the LEVER Training Sprint, Dana Williams from Denver, Colorado says, “Impact evaluation is such an integral piece to understanding the past, present, and future. I’m excited to participate in the LEVER Training Sprint to better inform and drive evidence-based programming in Denver.”

    The Training Sprint is a part of a growing movement to ground government innovation in data and evidence. Kermina Hanna from the State of New Jersey notes, “It’s vital that we cement a data-driven commitment to equity in government operations, and I’m really excited for this opportunity to develop a national network of colleagues in government who share this passion and dedication to responsive public service.”

    Jurisdictions selected for the Training Sprint are: 

    Boston, Massachusetts;
    Carlsbad, California;
    Connecticut;
    Dallas, Texas;
    Denver City/County, Colorado;
    Fort Collins, Colorado;
    Guilford County, North Carolina;
    King County, Washington;
    Long Beach, California;
    Los Angeles, California;
    New Jersey;
    New Mexico;
    Pittsburgh, Pennsylvania; and
    Washington County, Oregon.

    Those interested in learning more can fill out the LEVER intake form. Please direct any questions about the Evaluation Incubator to Louise Geraghty and questions about the Training Sprint to Chelsea Powell.

  • From physics to generative AI: An AI model for advanced pattern generation

    Generative AI, which is currently riding a crest of popular discourse, promises a world where the simple transforms into the complex — where a simple distribution evolves into intricate patterns of images, sounds, or text, rendering the artificial startlingly real. 

    The realms of imagination no longer remain as mere abstractions, as researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have brought an innovative AI model to life. Their new technology integrates two seemingly unrelated physical laws that underpin the best-performing generative models to date: diffusion, which typically illustrates the random motion of elements, like heat permeating a room or a gas expanding into space, and Poisson Flow, which draws on the principles governing the activity of electric charges.

    This harmonious blend has resulted in superior performance in generating new images, outpacing existing state-of-the-art models. Since its inception, the “Poisson Flow Generative Model ++” (PFGM++) has found potential applications in various fields, from antibody and RNA sequence generation to audio production and graph generation.

    The model can generate complex patterns, like creating realistic images or mimicking real-world processes. PFGM++ builds off of PFGM, the team’s work from the prior year. PFGM takes inspiration from the mathematics behind the “Poisson” equation, and then applies it to the data the model tries to learn from. To do this, the team used a clever trick: They added an extra dimension to their model’s “space,” kind of like going from a 2D sketch to a 3D model. This extra dimension gives more room for maneuvering, places the data in a larger context, and helps one approach the data from all directions when generating new samples. 

    “PFGM++ is an example of the kinds of AI advances that can be driven through interdisciplinary collaborations between physicists and computer scientists,” says Jesse Thaler, theoretical particle physicist in MIT’s Laboratory for Nuclear Science’s Center for Theoretical Physics and director of the National Science Foundation’s AI Institute for Artificial Intelligence and Fundamental Interactions (NSF AI IAIFI), who was not involved in the work. “In recent years, AI-based generative models have yielded numerous eye-popping results, from photorealistic images to lucid streams of text. Remarkably, some of the most powerful generative models are grounded in time-tested concepts from physics, such as symmetries and thermodynamics. PFGM++ takes a century-old idea from fundamental physics — that there might be extra dimensions of space-time — and turns it into a powerful and robust tool to generate synthetic but realistic datasets. I’m thrilled to see the myriad of ways ‘physics intelligence’ is transforming the field of artificial intelligence.”

    The underlying mechanism of PFGM isn’t as complex as it might sound. The researchers compared the data points to tiny electric charges placed on a flat plane in a dimensionally expanded world. These charges produce an “electric field,” with the charges looking to move upwards along the field lines into an extra dimension and consequently forming a uniform distribution on a vast imaginary hemisphere. The generation process is like rewinding a videotape: starting from a uniformly distributed set of charges on the hemisphere and tracing their journey back to the flat plane along the electric field lines, the charges align to match the original data distribution. This intriguing process allows the neural model to learn the electric field, and generate new data that mirrors the original. 
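
    As a rough, toy-scale illustration of that “rewind” process (not the authors’ implementation; the data points, step sizes, and hemisphere sampling below are simplifications made up for clarity), one can compute the empirical Poisson field from a handful of 2D “charges” and integrate backward from a large hemisphere toward the plane:

```python
import numpy as np

# toy dataset: a few 2D points acting as "charges" on the z = 0 plane
data = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])

def poisson_field(x, z, eps=1e-6):
    """Empirical Coulomb-like field at the augmented point (x, z), z > 0."""
    N = data.shape[1]                        # data dimension (2 here)
    diff_x = x[None, :] - data               # displacements in the data plane
    dist = np.sqrt((diff_x ** 2).sum(1) + z ** 2) + eps
    w = 1.0 / dist ** (N + 1)                # field kernel in N + 1 dimensions
    return (w[:, None] * diff_x).sum(0), (w * z).sum()

def sample(n_steps=4000, radius=40.0, z_min=1e-2, seed=0):
    """Start on a large upper hemisphere and follow the field lines backward
    (decreasing z) toward the data plane with a crude Euler integrator."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=3)
    v /= np.linalg.norm(v)                   # uniform direction on the sphere
    v[2] = abs(v[2]) + 1e-3                  # keep it on the upper hemisphere
    x, z = radius * v[:2], radius * v[2]
    dz = (z - z_min) / n_steps
    for _ in range(n_steps):
        Ex, Ez = poisson_field(x, z)
        x = x - dz * Ex / Ez                 # backward ODE: dx/dz = Ex / Ez
        z = z - dz
    return x

print(sample())  # ends up near one of the "charges" for most starting directions
```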

    The PFGM++ model extends the electric field in PFGM to an intricate, higher-dimensional framework. When you keep expanding these dimensions, something unexpected happens — the model starts resembling another important class of models, the diffusion models. This work is all about finding the right balance. The PFGM and diffusion models sit at opposite ends of a spectrum: one is robust but complex to handle, the other simpler but less sturdy. The PFGM++ model offers a sweet spot, striking a balance between robustness and ease of use. This innovation paves the way for more efficient image and pattern generation, marking a significant step forward in technology. Along with adjustable dimensions, the researchers proposed a new training method that enables more efficient learning of the electric field. 

    To bring this theory to life, the team resolved a pair of differential equations detailing these charges’ motion within the electric field. They evaluated the performance using the Fréchet Inception Distance (FID) score, a widely accepted metric that assesses the quality of images generated by the model in comparison to the real ones. PFGM++ also shows greater resistance to errors and robustness to the step size used in solving the differential equations.
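
    For reference, the FID compares the means and covariances of Inception-network features extracted from real and generated images; lower scores mean the generated images are statistically closer to the real ones. A minimal version, assuming the feature matrices have already been extracted, looks like this:

```python
import numpy as np
from scipy import linalg

def fid(feats_real, feats_gen):
    """Fréchet Inception Distance between two feature matrices
    (rows are samples, columns are Inception feature dimensions)."""
    mu1, mu2 = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov1 = np.cov(feats_real, rowvar=False)
    cov2 = np.cov(feats_gen, rowvar=False)
    covmean = linalg.sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):   # sqrtm can introduce tiny imaginary parts
        covmean = covmean.real
    return float(((mu1 - mu2) ** 2).sum() + np.trace(cov1 + cov2 - 2.0 * covmean))
```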

    Looking ahead, they aim to refine certain aspects of the model, particularly systematic ways to identify the “sweet spot” value of the extra dimension D tailored to specific data, architectures, and tasks by analyzing the behavior of neural networks’ estimation errors. They also plan to apply PFGM++ to modern large-scale text-to-image and text-to-video generation.

    “Diffusion models have become a critical driving force behind the revolution in generative AI,” says Yang Song, research scientist at OpenAI. “PFGM++ presents a powerful generalization of diffusion models, allowing users to generate higher-quality images by improving the robustness of image generation against perturbations and learning errors. Furthermore, PFGM++ uncovers a surprising connection between electrostatics and diffusion models, providing new theoretical insights into diffusion model research.”

    “Poisson Flow Generative Models do not only rely on an elegant physics-inspired formulation based on electrostatics, but they also offer state-of-the-art generative modeling performance in practice,” says NVIDIA Senior Research Scientist Karsten Kreis, who was not involved in the work. “They even outperform the popular diffusion models, which currently dominate the literature. This makes them a very powerful generative modeling tool, and I envision their application in diverse areas, ranging from digital content creation to generative drug discovery. More generally, I believe that the exploration of further physics-inspired generative modeling frameworks holds great promise for the future and that Poisson Flow Generative Models are only the beginning.”

    Authors on a paper about this work include three MIT graduate students: Yilun Xu of the Department of Electrical Engineering and Computer Science (EECS) and CSAIL, Ziming Liu of the Department of Physics and the NSF AI IAIFI, and Shangyuan Tong of EECS and CSAIL, as well as Google Senior Research Scientist Yonglong Tian PhD ’23. MIT professors Max Tegmark and Tommi Jaakkola advised the research.

    The team was supported by the MIT-DSTA Singapore collaboration, the MIT-IBM Grand Challenge project, National Science Foundation grants, The Casey and Family Foundation, the Foundational Questions Institute, the Rothberg Family Fund for Cognitive Science, and the ML for Pharmaceutical Discovery and Synthesis Consortium. Their work was presented at the International Conference on Machine Learning this summer.

  • On the hunt for sustainable materials

    By the time she started high school, Avni Singhal had attended six different schools in a variety of settings, from a traditional public school to a self-paced program. The transitions opened her eyes to how widely educational environments can vary, and made her think about that impact on students.

    “Experiencing so many different types of educational systems exposed me to different ways of looking at things and how that shapes people’s worldviews,” says Singhal.

    Now a fourth-year PhD student in the Department of Materials Science and Engineering, Singhal is still thinking about increasing opportunities for her fellow students, while also pursuing her research. She devotes herself to both developing sustainable materials and improving the graduate experience in her department.

    She recently completed her two-year term as a student representative on the department’s graduate studies committee. In this role, she helped revamp the communication around the qualifying exams and introduce student input into the faculty search process.

    “It’s given me a lot of insight into how our department works,” says Singhal. “It’s a chance to get to know faculty, bring up issues that students experience, and work on changing things that we think could be improved.”

    At the same time, Singhal uses atomistic simulations to model material properties, with an eye toward sustainability. She is a part of the Learning Matter Lab, a group that merges data science tools with engineering and physics-based simulation to better design and understand materials. As part of a computational group, Singhal has worked on a range of projects in collaboration with other labs that are looking to combine computing with other disciplines. Some of this work is sponsored by the MIT Climate and Sustainability Consortium, which facilitates connections across MIT labs and industry.

    Joining the Learning Matter Lab was a step out of Singhal’s comfort zone. She arrived at MIT from the University of California at Berkeley with a joint degree in materials science and bioengineering, as well as a degree in electrical engineering and computer science.

    “I was generally interested in doing work on environment-related applications,” says Singhal. “I was pretty hesitant at first to switch entirely to computation because it’s a very different type of lifestyle of research than what I was doing before.”

    Singhal has taken the challenge in stride, contributing to projects including improving carbon capture molecules and developing new deconstructable, degradable plastics. Not only does Singhal have to understand the technical details of her own work, she also needs to understand the big picture and how to best wield the expertise of her collaborators.

    “When I came in, I was very wide-eyed, thinking computation can do everything because I had never done it before,” says Singhal. “It’s that curve where you know a little bit about something, and you think it can do everything. And then as you learn more, you learn where it can and can’t help us, where it can be valuable, and how to figure out in what part of a project it’s useful.”

    Singhal applies a similarly critical lens when thinking about graduate school as a whole. She notes that access to information and resources is often the main factor determining who enters selective educational programs, and that such access becomes increasingly limited at the graduate level.

    “I realized just how much applying is a function of knowing how to do it,” says Singhal, who co-organized and volunteers with the DMSE Application Assistance Program. The program matches prospective applicants with current students to give feedback on their application materials and provide insight into what it’s like attending MIT. Some of the first students Singhal mentored through the program are now participants themselves.

    “The further you get in your educational career, the more you realize how much assistance you got along the way to get where you are,” says Singhal. “That happens at every stage.”

    Looking toward the future, Singhal wants to continue to pursue research with a sustainability impact. She also wants to continue mentoring in some capacity but isn’t in a rush to figure out exactly what that will look like.

    “Grad school doesn’t mean I have to do one thing. I can stay open to all the possibilities of what comes next.”

  • Meet the 2023-24 Accenture Fellows

    The MIT and Accenture Convergence Initiative for Industry and Technology has selected five new research fellows for 2023-24. Now in its third year, the initiative underscores the ways in which industry and research can collaborate to spur technological innovation.

    Through its partnership with the School of Engineering, Accenture provides five annual fellowships awarded to graduate students with the aim of generating powerful new insights on the convergence of business and technology with the potential to transform society. The 2023-24 fellows will conduct research in areas including artificial intelligence, sustainability, and robotics.

    The 2023-24 Accenture Fellows are:

    Yiyue Luo is a PhD candidate who is developing innovative integrations of tactile sensing and haptics, interactive sensing and AI, digital fabrication, and smart wearables. Her work takes advantage of recent advances in digital manufacturing and AI, and the convergence in advanced sensing and actuation mechanisms, scalable digital manufacturing, and emerging computational techniques, with the goal of creating novel sensing and actuation devices that revolutionize interactions between people and their environments. In past projects, Luo has developed tactile sensing apparel including socks, gloves, and vests, as well as a workflow for computationally designing and digitally fabricating soft textiles-based pneumatic actuators. With the support of an Accenture Fellowship, she will advance her work of combining sensing and actuating devices and explore the development of haptic devices that simulate tactile cues captured by tactile sensors. Her ultimate aim is to build a scalable, textile-based, closed-loop human-machine interface. Luo’s research holds exciting potential to advance ground-breaking applications for smart textiles, health care, augmented and virtual reality, human-machine interactions, and robotics.

    Zanele Munyikwa is a PhD candidate whose research explores foundation models, a class of models that forms the basis of transformative general-purpose technologies (GPTs) such as GPT-4. An Accenture Fellowship will enable Munyikwa to conduct research aimed at illuminating the current and potential impact of foundation models (including large language models) on work and tasks common to “high-skilled” knowledge workers in industries such as marketing, legal services, and medicine, in which foundation models are expected to have significant economic and social impacts. A primary goal of her project is to observe the impact of AI augmentation on tasks like copywriting and long-form writing. A second aim is to explore two primary ways that foundation models are driving the convergence of creative and technological industries, namely: reducing the cost of content generation and enabling the development of tools and platforms for education and training. Munyikwa’s work has important implications for the use of foundation models in many fields, from health care and education to legal services, business, and technology.

    Michelle Vaccaro is a PhD candidate in social and engineering systems whose research explores human-AI collaboration with the goals of developing a deeper understanding of AI-based technologies (including ChatGPT and DALL-E), evaluating their performance and evolution, and steering their development toward societally beneficial applications, like climate change mitigation. An Accenture Fellowship will support Vaccaro’s current work toward two key objectives: identifying synergies between humans and AI-based software to help design human-AI systems that address persistent problems better than existing approaches; and investigating applications of human-AI collaboration for forecasting technological change, specifically for renewable energy technologies. By integrating the historically distinct domains of AI, systems engineering, and cognitive science with a wide range of industries, technical fields, and social applications, Vaccaro’s work has the potential to advance individual and collective productivity and creativity in all these areas.

    Chonghuan Wang is a PhD candidate in computational science and engineering whose research employs statistical learning, econometrics theory, and experimental design to create efficient, reliable, and sustainable field experiments in various domains. In his current work, Wang is applying statistical learning techniques such as online learning and bandit theory to test the effectiveness of new treatments, vaccinations, and health care interventions. With the support of an Accenture Fellowship, he will design experiments with the specific aim of understanding the trade-off between the loss of a patient’s welfare and the accuracy of estimating the treatment effect. The results of this research could help to save lives and contain disease outbreaks during pandemics like Covid-19. The benefits of enhanced experiment design and the collection of high-quality data extend well beyond health care; for example, these tools could help businesses optimize user engagement, test pricing impacts, and increase the usage of platforms and services. Wang’s research holds exciting potential to harness statistical learning, econometrics theory, and experimental design in support of strong businesses and the greater social good.

    Aaron Michael West Jr. is a PhD candidate whose research seeks to enhance our knowledge of human motor control and robotics. His work aims to advance rehabilitation technologies and prosthetic devices, as well as improve robot dexterity. His previous work has yielded valuable insights into the human ability to extract information solely from visual displays. Specifically, he demonstrated humans’ ability to estimate stiffness based solely on the visual observation of motion. These insights could advance the development of software applications with the same capability (e.g., using machine learning methods applied to video data) and may enable roboticists to develop enhanced motion control such that a robot’s intention is perceivable by humans. An Accenture Fellowship will enable West to continue this work, as well as new investigations into the functionality of the human hand to aid in the design of a prosthetic hand that better replicates human dexterity. By advancing understandings of human bio- and neuro-mechanics, West’s work has the potential to support major advances in robotics and rehabilitation technologies, with profound impacts on human health and well-being.

  • How an archeological approach can help leverage biased data in AI to improve medicine

    The classic computer science adage “garbage in, garbage out” lacks nuance when it comes to understanding biased medical data, argue computer science and bioethics professors from MIT, Johns Hopkins University, and the Alan Turing Institute in a new opinion piece published in a recent edition of the New England Journal of Medicine (NEJM). The rising popularity of artificial intelligence has brought increased scrutiny to the matter of biased AI models resulting in algorithmic discrimination, which the White House Office of Science and Technology Policy identified as a key issue in its recent Blueprint for an AI Bill of Rights.

    When encountering biased data, particularly for AI models used in medical settings, the typical response is to either collect more data from underrepresented groups or generate synthetic data making up for missing parts to ensure that the model performs equally well across an array of patient populations. But the authors argue that this technical approach should be augmented with a sociotechnical perspective that takes both historical and current social factors into account. By doing so, researchers can be more effective in addressing bias in public health. 

    “The three of us had been discussing the ways in which we often treat issues with data from a machine learning perspective as irritations that need to be managed with a technical solution,” recalls co-author Marzyeh Ghassemi, an assistant professor in electrical engineering and computer science and an affiliate of the Abdul Latif Jameel Clinic for Machine Learning in Health (Jameel Clinic), the Computer Science and Artificial Intelligence Laboratory (CSAIL), and Institute of Medical Engineering and Science (IMES). “We had used analogies of data as an artifact that gives a partial view of past practices, or a cracked mirror holding up a reflection. In both cases the information is perhaps not entirely accurate or favorable: Maybe we think that we behave in certain ways as a society — but when you actually look at the data, it tells a different story. We might not like what that story is, but once you unearth an understanding of the past you can move forward and take steps to address poor practices.” 

    Data as artifact 

    In the paper, titled “Considering Biased Data as Informative Artifacts in AI-Assisted Health Care,” Ghassemi, Kadija Ferryman, and Maxine Mackintosh make the case for viewing biased clinical data as “artifacts” in the same way anthropologists or archeologists would view physical objects: pieces of civilization-revealing practices, belief systems, and cultural values — in the case of the paper, specifically those that have led to existing inequities in the health care system. 

    For example, a 2019 study showed that an algorithm widely considered to be an industry standard used health-care expenditures as an indicator of need, leading to the erroneous conclusion that sicker Black patients require the same level of care as healthier white patients. What researchers found was algorithmic discrimination failing to account for unequal access to care.  

    In this instance, rather than viewing biased datasets or lack of data as problems that only require disposal or fixing, Ghassemi and her colleagues recommend the “artifacts” approach as a way to raise awareness around social and historical elements influencing how data are collected and alternative approaches to clinical AI development. 

    “If the goal of your model is deployment in a clinical setting, you should engage a bioethicist or a clinician with appropriate training reasonably early on in problem formulation,” says Ghassemi. “As computer scientists, we often don’t have a complete picture of the different social and historical factors that have gone into creating data that we’ll be using. We need expertise in discerning when models generalized from existing data may not work well for specific subgroups.” 

    When more data can actually harm performance 

    The authors acknowledge that one of the more challenging aspects of implementing an artifact-based approach is being able to assess whether data have been racially corrected: i.e., using white, male bodies as the conventional standard that other bodies are measured against. The opinion piece cites an example from the Chronic Kidney Disease Epidemiology Collaboration in 2021, which developed a new equation to measure kidney function because the old equation had previously been “corrected” under the blanket assumption that Black people have higher muscle mass. Ghassemi says that researchers should be prepared to investigate race-based correction as part of the research process.

    In another recent paper accepted to this year’s International Conference on Machine Learning, co-authored by Ghassemi’s PhD student Vinith Suriyakumar and University of California at San Diego Assistant Professor Berk Ustun, the researchers found that the common assumption that including personalized attributes like self-reported race improves the performance of ML models can actually lead to worse risk scores, models, and metrics for minority and minoritized populations.

    “There’s no single right solution for whether or not to include self-reported race in a clinical risk score. Self-reported race is a social construct that is both a proxy for other information, and deeply proxied itself in other medical data. The solution needs to fit the evidence,” explains Ghassemi. 

    How to move forward 

    This is not to say that biased datasets should be enshrined, or biased algorithms don’t require fixing — quality training data is still key to developing safe, high-performance clinical AI models, and the NEJM piece highlights the role of the National Institutes of Health (NIH) in driving ethical practices.  

    “Generating high-quality, ethically sourced datasets is crucial for enabling the use of next-generation AI technologies that transform how we do research,” NIH acting director Lawrence Tabak stated in a press release when the NIH announced its $130 million Bridge2AI Program last year. Ghassemi agrees, pointing out that the NIH has “prioritized data collection in ethical ways that cover information we have not previously emphasized the value of in human health — such as environmental factors and social determinants. I’m very excited about their prioritization of, and strong investments towards, achieving meaningful health outcomes.” 

    Elaine Nsoesie, an associate professor at the Boston University School of Public Health, believes there are many potential benefits to treating biased datasets as artifacts rather than garbage, starting with the focus on context. “Biases present in a dataset collected for lung cancer patients in a hospital in Uganda might be different from a dataset collected in the U.S. for the same patient population,” she explains. “In considering local context, we can train algorithms to better serve specific populations.” Nsoesie says that understanding the historical and contemporary factors shaping a dataset can make it easier to identify discriminatory practices that might be coded in algorithms or systems in ways that are not immediately obvious. She also notes that an artifact-based approach could lead to the development of new policies and structures ensuring that the root causes of bias in a particular dataset are eliminated.

    “People often tell me that they are very afraid of AI, especially in health. They’ll say, ‘I’m really scared of an AI misdiagnosing me,’ or ‘I’m concerned it will treat me poorly,’” Ghassemi says. “I tell them, you shouldn’t be scared of some hypothetical AI in health tomorrow, you should be scared of what health is right now. If we take a narrow technical view of the data we extract from systems, we could naively replicate poor practices. That’s not the only option — realizing there is a problem is our first step towards a larger opportunity.”

  • Helping computer vision and language models understand what they see

    Powerful machine-learning algorithms known as vision and language models, which learn to match text with images, have shown remarkable results when asked to generate captions or summarize videos.

    While these models excel at identifying objects, they often struggle to understand concepts, like object attributes or the arrangement of items in a scene. For instance, a vision and language model might recognize the cup and table in an image, but fail to grasp that the cup is sitting on the table.

    Researchers from MIT, the MIT-IBM Watson AI Lab, and elsewhere have demonstrated a new technique that utilizes computer-generated data to help vision and language models overcome this shortcoming.

    The researchers created a synthetic dataset of images that depict a wide range of scenarios, object arrangements, and human actions, coupled with detailed text descriptions. They used this annotated dataset to “fix” vision and language models so they can learn concepts more effectively. Their technique ensures these models can still make accurate predictions when they see real images.

    When they tested models on concept understanding, the researchers found that their technique boosted accuracy by up to 10 percent. This could improve systems that automatically caption videos or enhance models that provide natural language answers to questions about images, with applications in fields like e-commerce or health care.

    “With this work, we are going beyond nouns in the sense that we are going beyond just the names of objects to more of the semantic concept of an object and everything around it. Our idea was that, when a machine-learning model sees objects in many different arrangements, it will have a better idea of how arrangement matters in a scene,” says Khaled Shehada, a graduate student in the Department of Electrical Engineering and Computer Science and co-author of a paper on this technique.

    Shehada wrote the paper with lead author Paola Cascante-Bonilla, a computer science graduate student at Rice University; Aude Oliva, director of strategic industry engagement at the MIT Schwarzman College of Computing, MIT director of the MIT-IBM Watson AI Lab, and a senior research scientist in the Computer Science and Artificial Intelligence Laboratory (CSAIL); senior author Leonid Karlinsky, a research staff member in the MIT-IBM Watson AI Lab; and others at MIT, the MIT-IBM Watson AI Lab, Georgia Tech, Rice University, École des Ponts, Weizmann Institute of Science, and IBM Research. The paper will be presented at the International Conference on Computer Vision.

    Focusing on objects

    Vision and language models typically learn to identify objects in a scene, and can end up ignoring object attributes, such as color and size, or positional relationships, such as which object is on top of another object.

    This is due to the method with which these models are often trained, known as contrastive learning. This training method involves forcing a model to predict the correspondence between images and text. When comparing natural images, the objects in each scene tend to cause the most striking differences. (Perhaps one image shows a horse in a field while the second shows a sailboat on the water.)
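
    A generic sketch of that contrastive objective (a CLIP-style loss, shown here only to illustrate the training setup rather than the specific models studied in this work): each image embedding is pulled toward its own caption’s embedding and pushed away from every other caption in the batch.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss: the i-th image should match the i-th caption
    and no other caption in the batch (and vice versa)."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # (batch, batch) similarities
    targets = torch.arange(image_emb.size(0))         # matching pairs on the diagonal
    loss_img = F.cross_entropy(logits, targets)       # image -> text direction
    loss_txt = F.cross_entropy(logits.t(), targets)   # text -> image direction
    return 0.5 * (loss_img + loss_txt)

# toy usage with random embeddings standing in for encoder outputs
images, captions = torch.randn(8, 512), torch.randn(8, 512)
print(clip_style_loss(images, captions))
```

    Because this objective only needs to tell matched pairs apart from mismatched ones, a model can minimize it by attending mostly to which objects appear — exactly the shortcut the researchers set out to address.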

    “Every image could be uniquely defined by the objects in the image. So, when you do contrastive learning, just focusing on the nouns and objects would solve the problem. Why would the model do anything differently?” says Karlinsky.

    The researchers sought to mitigate this problem by using synthetic data to fine-tune a vision and language model. The fine-tuning process involves tweaking a model that has already been trained to improve its performance on a specific task.

    They used a computer to automatically create synthetic videos with diverse 3D environments and objects, such as furniture and luggage, and added human avatars that interacted with the objects.

    Using individual frames of these videos, they generated nearly 800,000 photorealistic images, and then paired each with a detailed caption. The researchers developed a methodology for annotating every aspect of the image to capture object attributes, positional relationships, and human-object interactions clearly and consistently in dense captions.

    Because the researchers created the images, they could control the appearance and position of objects, as well as the gender, clothing, poses, and actions of the human avatars.

    “Synthetic data allows a lot of diversity. With real images, you might not have a lot of elephants in a room, but with synthetic data, you could actually have a pink elephant in a room with a human, if you want,” Cascante-Bonilla says.

    Synthetic data have other advantages, too. They are cheaper to generate than real data, yet the images are highly photorealistic. They also preserve privacy because no real humans are shown in the images. And, because data are produced automatically by a computer, they can be generated quickly in massive quantities.

    By using different camera viewpoints, or slightly changing the positions or attributes of objects, the researchers created a dataset with a far wider variety of scenarios than one would find in a natural dataset.

    Fine-tune, but don’t forget

    However, when one fine-tunes a model with synthetic data, there is a risk that the model might “forget” what it learned when it was originally trained with real data.

    The researchers employed a few techniques to prevent this problem, such as adjusting the synthetic data so colors, lighting, and shadows more closely match those found in natural images. They also made adjustments to the model’s inner-workings after fine-tuning to further reduce any forgetfulness.
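
    One common post-hoc way to limit such forgetting, shown here purely as an illustration (the article does not spell out the exact adjustments the researchers made), is to average the fine-tuned weights with the original pretrained weights:

```python
import copy
import torch

def interpolate_weights(pretrained, finetuned, alpha=0.5):
    """Blend fine-tuned and original weights so the model keeps newly learned
    concepts while retaining most of its original behavior."""
    merged = copy.deepcopy(finetuned)
    merged_state = merged.state_dict()
    pre_state = pretrained.state_dict()
    for name, param in finetuned.state_dict().items():
        if torch.is_floating_point(param):   # skip integer buffers, e.g. batch-norm counters
            merged_state[name] = alpha * param + (1.0 - alpha) * pre_state[name]
    merged.load_state_dict(merged_state)
    return merged
```

    Sweeping the mixing weight alpha trades off how much the model gains on the new concepts against how much of its original training it retains.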

    Their synthetic dataset and fine-tuning strategy improved the ability of popular vision and language models to accurately recognize concepts by up to 10 percent. At the same time, the models did not forget what they had already learned.

    Now that they have shown how synthetic data can be used to solve this problem, the researchers want to identify ways to improve the visual quality and diversity of these data, as well as the underlying physics that makes synthetic scenes look realistic. In addition, they plan to test the limits of scalability, and investigate whether model improvement starts to plateau with larger and more diverse synthetic datasets.

    This research is funded, in part, by the U.S. Defense Advanced Research Projects Agency, the National Science Foundation, and the MIT-IBM Watson AI Lab.

  • Fast-tracking fusion energy’s arrival with AI and accessibility

    As the impacts of climate change continue to grow, so does interest in fusion’s potential as a clean energy source. While fusion reactions have been studied in laboratories since the 1930s, there are still many critical questions scientists must answer to make fusion power a reality, and time is of the essence. As part of its strategy to accelerate fusion energy’s arrival and reach carbon neutrality by 2050, the U.S. Department of Energy (DoE) has announced new funding for a project led by researchers at MIT’s Plasma Science and Fusion Center (PSFC) and four collaborating institutions.

    Cristina Rea, a research scientist and group leader at the PSFC, will serve as the primary investigator for the newly funded three-year collaboration to pilot the integration of fusion data into a system that can be read by AI-powered tools. The PSFC, together with scientists from William & Mary, the University of Wisconsin at Madison, Auburn University, and the nonprofit HDF Group, plan to create a holistic fusion data platform, the elements of which could offer unprecedented access for researchers, especially underrepresented students. The project aims to encourage diverse participation in fusion and data science, both in academia and the workforce, through outreach programs led by the group’s co-investigators, of whom four out of five are women. 

    The DoE’s award, part of a $29 million funding package for seven projects across 19 institutions, will support the group’s efforts to distribute data produced by fusion devices like the PSFC’s Alcator C-Mod, a donut-shaped “tokamak” that utilized powerful magnets to control and confine fusion reactions. Alcator C-Mod operated from 1991 to 2016 and its data are still being studied, thanks in part to the PSFC’s commitment to the free exchange of knowledge.

    Currently, there are nearly 50 public experimental magnetic confinement-type fusion devices; however, both historical and current data from these devices can be difficult to access. Some fusion databases require signing user agreements, and not all data are catalogued and organized the same way. Moreover, it can be difficult to leverage machine learning, a class of AI tools, for data analysis and to enable scientific discovery without time-consuming data reorganization. The result is fewer scientists working on fusion, greater barriers to discovery, and a bottleneck in harnessing AI to accelerate progress.

    The project’s proposed data platform addresses technical barriers by being FAIR — Findable, Accessible, Interoperable, Reusable — and by adhering to UNESCO’s Open Science (OS) recommendations to improve the transparency and inclusivity of science; all of the researchers’ deliverables will adhere to FAIR and OS principles, as required by the DoE. The platform’s databases will be built using MDSplusML, an upgraded version of the MDSplus open-source software developed by PSFC researchers in the 1980s to catalogue the results of Alcator C-Mod’s experiments. Today, nearly 40 fusion research institutes use MDSplus to store and provide external access to their fusion data. The release of MDSplusML aims to continue that legacy of open collaboration.
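
    To give a flavor of what programmatic access to such data looks like today, the sketch below uses the existing MDSplus Python client to fetch a signal from a remote server. The server address, tree name, shot number, and node path are placeholders, and MDSplusML’s eventual interface is not described in this article, so treat this only as an approximate illustration.

```python
from MDSplus import Connection  # open-source MDSplus Python client

# placeholder server, tree, shot, and node names; real values depend on the facility
conn = Connection("mdsplus.example.org")
conn.openTree("cmod", 1160930033)                      # tree name and shot number (hypothetical)
density = conn.get(r"\electrons::top.density").data()  # node path is a placeholder
times = conn.get(r"dim_of(\electrons::top.density)").data()
print(density.shape, times.shape)
```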

    The researchers intend to address barriers to participation for women and disadvantaged groups not only by improving general access to fusion data, but also through a subsidized summer school that will focus on topics at the intersection of fusion and machine learning, which will be held at William & Mary for the next three years.

    Of the importance of their research, Rea says, “This project is about responding to the fusion community’s needs and setting ourselves up for success. Scientific advancements in fusion are enabled via multidisciplinary collaboration and cross-pollination, so accessibility is absolutely essential. I think we all understand now that diverse communities have more diverse ideas, and they allow faster problem-solving.”

    The collaboration’s work also aligns with vital areas of research identified in the International Atomic Energy Agency’s “AI for Fusion” Coordinated Research Project (CRP). Rea was selected as the technical coordinator for the IAEA’s CRP emphasizing community engagement and knowledge access to accelerate fusion research and development. In a letter of support written for the group’s proposed project, the IAEA stated that, “the work [the researchers] will carry out […] will be beneficial not only to our CRP but also to the international fusion community in large.”

    PSFC Director and Hitachi America Professor of Engineering Dennis Whyte adds, “I am thrilled to see PSFC and our collaborators be at the forefront of applying new AI tools while simultaneously encouraging and enabling extraction of critical data from our experiments.”

    “Having the opportunity to lead such an important project is extremely meaningful, and I feel a responsibility to show that women are leaders in STEM,” says Rea. “We have an incredible team, strongly motivated to improve our fusion ecosystem and to contribute to making fusion energy a reality.”

  • New clean air and water labs to bring together researchers, policymakers to find climate solutions

    MIT’s Abdul Latif Jameel Poverty Action Lab (J-PAL) is launching the Clean Air and Water Labs, with support from Community Jameel, to generate evidence-based solutions aimed at increasing access to clean air and water.

    Led by J-PAL’s Africa, Middle East and North Africa (MENA), and South Asia regional offices, the labs will partner with government agencies to bring together researchers and policymakers in areas where impactful clean air and water solutions are most urgently needed.

    Together, the labs aim to improve clean air and water access by informing the scaling of evidence-based policies and decisions of city, state, and national governments that serve nearly 260 million people combined.

    The Clean Air and Water Labs expand the work of J-PAL’s King Climate Action Initiative, building on the foundational support of King Philanthropies, which significantly expanded J-PAL’s work at the nexus of climate change and poverty alleviation worldwide. 

    Air pollution, water scarcity and the need for evidence 

    Africa, MENA, and South Asia are on the front lines of global air and water crises. 

    “There is no time to waste investing in solutions that do not achieve their desired effects,” says Iqbal Dhaliwal, global executive director of J-PAL. “By co-generating rigorous real-world evidence with researchers, policymakers can have the information they need to dedicate resources to scaling up solutions that have been shown to be effective.”

    In India, about 75 percent of households did not have drinking water on premises in 2018. In MENA, nearly 90 percent of children live in areas facing high or extreme water stress. Across Africa, almost 400 million people lack access to safe drinking water. 

    Simultaneously, air pollution is one of the greatest threats to human health globally. In India, extraordinary levels of air pollution are shortening the average life expectancy by five years. In Africa, rising indoor and ambient air pollution contributed to 1.1 million premature deaths in 2019. 

    There is increasing urgency to find high-impact and cost-effective solutions to the worsening threats to human health and resources caused by climate change. However, data and evidence on potential solutions are limited.

    Fostering collaboration to generate policy-relevant evidence 

    The Clean Air and Water Labs will foster deep collaboration between government stakeholders, J-PAL regional offices, and researchers in the J-PAL network. 

    Through the labs, J-PAL will work with policymakers to:

    co-diagnose the most pressing air and water challenges and opportunities for policy innovation;
    expand policymakers’ access to and use of high-quality air and water data;
    co-design potential solutions informed by existing evidence;
    co-generate evidence on promising solutions through rigorous evaluation, leveraging existing and new data sources; and
    support scaling of air and water policies and programs that are found to be effective through evaluation. 

    A research and scaling fund for each lab will prioritize resources for co-generated pilot studies, randomized evaluations, and scaling projects.

    The labs will also collaborate with C40 Cities, a global network of mayors of the world’s leading cities that are united in action to confront the climate crisis, to share policy-relevant evidence and identify opportunities for potential new connections and research opportunities within India and across Africa.

    This model aims to strengthen the use of evidence in decision-making to ensure solutions are highly effective and to guide research to answer policymakers’ most urgent questions. J-PAL Africa, MENA, and South Asia’s strong on-the-ground presence will further bridge research and policy work by anchoring activities within local contexts. 

    “Communities across the world continue to face challenges in accessing clean air and water, a threat to human safety that has only been exacerbated by the climate crisis, along with rising temperatures and other hazards,” says George Richards, director of Community Jameel. “Through our collaboration with J-PAL and C40 in creating climate policy labs embedded in city, state, and national governments in Africa and South Asia, we are committed to innovative and science-based approaches that can help hundreds of millions of people enjoy healthier lives.”

    J-PAL Africa, MENA, and South Asia will formally launch Clean Air and Water Labs with government partners over the coming months. J-PAL is housed in the MIT Department of Economics, within the School of Humanities, Arts, and Social Sciences.