More stories

  • The tenured engineers of 2023

    In 2023, MIT granted tenure to nine faculty members across the School of Engineering. This year’s tenured engineers hold appointments in the departments of Biological Engineering, Civil and Environmental Engineering, Electrical Engineering and Computer Science (which reports jointly to the School of Engineering and MIT Schwarzman College of Computing), Materials Science and Engineering, and Mechanical Engineering, as well as the Institute for Medical Engineering and Science (IMES).

    “I am truly inspired by this remarkable group of talented faculty members,” says Anantha Chandrakasan, dean of the School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “The work they are doing, both in the lab and in the classroom, has made a tremendous impact at MIT and in the wider world. Their important research has applications in a diverse range of fields and industries. I am thrilled to congratulate them on the milestone of receiving tenure.”

    This year’s newly tenured engineering faculty include:

    Michael Birnbaum, Class of 1956 Career Development Professor, associate professor of biological engineering, and faculty member at the Koch Institute for Integrative Cancer Research at MIT, works on understanding and manipulating immune recognition in cancer and infections. By using a variety of techniques to study the antigen recognition of T cells, he and his team aim to develop the next generation of immunotherapies.  
    Tamara Broderick, associate professor of electrical engineering and computer science and member of the MIT Laboratory for Information and Decision Systems (LIDS) and the MIT Institute for Data, Systems, and Society (IDSS), works to provide fast and reliable quantification of uncertainty and robustness in modern data analysis procedures. Broderick and her research group develop data analysis tools with applications in fields including genetics, economics, and assistive technology.
    Tal Cohen, associate professor of civil and environmental engineering and mechanical engineering, uses nonlinear solid mechanics to understand how materials behave under extreme conditions. By studying material instabilities, extreme dynamic loading conditions, growth, and chemical coupling, Cohen and her team combine theoretical models and experiments to shape our understanding of the observed phenomena and apply those insights in the design and characterization of material systems. 
    Betar Gallant, Class of 1922 Career Development Professor and associate professor of mechanical engineering, develops advanced materials and chemistries for next-generation lithium-ion and lithium primary batteries and electrochemical carbon dioxide mitigation technologies. Her group’s work could lead to higher-energy and more sustainable batteries for electric vehicles, longer-lasting implantable medical devices, and new methods of carbon capture and conversion. 
    Rafael Jaramillo, Thomas Lord Career Development Professor and associate professor of materials science and engineering, studies the synthesis, properties, and applications of electronic materials, particularly chalcogenide compound semiconductors. His work has applications in microelectronics, integrated photonics, telecommunications, and photovoltaics. 
    Benedetto Marelli, associate professor of civil and environmental engineering, conducts research on the synthesis, assembly, and nanomanufacturing of structural biopolymers. He and his research team develop biomaterials for applications in agriculture, food security, and food safety. 
    Ellen Roche, Latham Family Career Development Professor, associate professor of mechanical engineering, and core faculty member of IMES, designs and develops implantable, biomimetic therapeutic devices and soft robotics that mechanically assist and repair tissue, deliver therapies, and enable enhanced preclinical testing. Her devices have a wide range of applications in human health, including cardiovascular and respiratory disease.
    Serguei Saavedra, associate professor of civil and environmental engineering, uses systems thinking, synthesis, and mathematical modeling to study the persistence of ecological systems under changing environments. His theoretical research is used to develop hypotheses and corroborate predictions of how ecological systems respond to climate change. 
    Justin Solomon, associate professor of electrical engineering and computer science and member of the MIT Computer Science and Artificial Intelligence Laboratory and MIT Center for Computational Science and Engineering, works at the intersection of geometry, large-scale optimization, computer graphics, and machine learning. His research has diverse applications in machine learning, computer graphics, and geometric data processing.

  • Summer research offers a springboard to advanced studies

    Doctoral studies at MIT aren’t a calling for everyone, but they can be for anyone who has had opportunities to discover that science and technology research is their passion and to build the experience and skills to succeed. For Taylor Baum, Josefina Correa Menéndez, and Karla Alejandra Montejo, three graduate students in just one lab of The Picower Institute for Learning and Memory, a pivotal opportunity came via the MIT Summer Research Program in Biology and Neuroscience (MSRP-Bio). When a student finds MSRP-Bio, it helps them find their future in research. 

    In the program, undergraduate STEM majors from outside MIT spend the summer doing full-time research in the departments of Biology or Brain and Cognitive Sciences (BCS), or in the Center for Brains, Minds and Machines (CBMM). They gain lab skills, mentoring, preparation for graduate school, and connections that might last a lifetime. Over the last two decades, a total of 215 students who are from underrepresented minority groups or economically disadvantaged backgrounds, are first-generation or nontraditional college students, or have disabilities have participated in research in BCS or CBMM labs.

    Like Baum, Correa Menéndez, and Montejo, the vast majority go on to pursue graduate studies, says Diversity and Outreach Coordinator Mandana Sassanfar, who runs the program. For instance, among 91 students who have worked in Picower Institute labs, 81 have completed their undergraduate studies. Of those, 46 enrolled in PhD programs at MIT or other schools such as Cornell, Yale, Stanford, and Princeton universities, and the University of California System. Another 12 have gone to medical school, another seven are in MD/PhD programs, and three have earned master’s degrees. The rest are studying as post-baccalaureates or went straight into the workforce after earning their bachelor’s degree. 

    After participating in the program, Baum, Correa Menéndez, and Montejo each became graduate students in the research group of Emery N. Brown, the Edward Hood Taplin Professor of Computational Neuroscience and Medical Engineering in The Picower Institute and the Institute for Medical Engineering and Science. The lab combines statistical, computational, and experimental neuroscience methods to study how general anesthesia affects the central nervous system to ultimately improve patient care and advance understanding of the brain. Brown says the students have each been doing “off-the-scale” work, in keeping with the excellence he’s seen from MSRP-Bio students over the years. For example, on Aug. 10 Baum and Correa Menéndez were honored with MathWorks Fellowships.

    “I think MSRP is fantastic. Mandana does this amazing job of getting students who are quite talented to come to MIT to realize that they can move their game to the next level. They have the capacity to do it. They just need the opportunities,” Brown says. “These students live up to the expectations that you have of them. And now as graduate students, they’re taking on hard problems and they’re solving them.” 

    Paths to PhD studies 

    Pursuing a PhD is hardly a given. Many young students have never considered graduate school or specific fields of study like neuroscience or electrical engineering. But Sassanfar engages students across the country to introduce them to the opportunity MSRP-Bio provides to gain exposure, experience, and mentoring in advanced fields. Every fall, after the program’s students have returned to their undergraduate institutions, she visits schools in places as far flung as Florida, Maryland, Puerto Rico, and Texas and goes to conferences for diverse science communities such as ABRCMS and SACNAS to spread the word. 

    Taylor Baum

    Photo courtesy of Taylor Baum.

    When Baum first connected with the program in 2017, she was finding her way at Penn State University. She had been majoring in biology and music composition but had just switched the latter to engineering following a conversation over coffee exposing her to brain-computer interfacing technology, in which detecting brain signals of people with full-body paralysis could improve their quality of life by enabling control of computers or wheelchairs. Baum became enthusiastic about the potential to build similar systems, but as a new engineering student, she struggled to find summer internships and research opportunities. 

    “I got rejected from every single program except the MIT Center for Brains, Minds and Machines MSRP,” she recalls with a chuckle.

    Baum thrived in MSRP-Bio, working in Brown’s lab for three successive summers. At each stage, she said, she gained more research skills, experience, and independence. When she graduated, she was sure she wanted to go to graduate school and applied to four of her dream schools. She accepted MIT’s offer to join the Department of Electrical Engineering and Computer Science, where she is co-advised by faculty members there and by Brown. She is now working to develop a system grounded in cardiovascular physiology that can improve blood pressure management. A tool for practicing anesthesiologists, the system automates the dosing of drugs to maintain a patient’s blood pressure at safe levels in the operating room or intensive care unit. 

    More than that, Baum not only is leading an organization advancing STEM education in Puerto Rico, but also is helping to mentor a current MSRP-Bio student in the Brown lab. 

    “MSRP definitely bonds everyone who has participated in it,” Baum says. “If I see anyone who I know participated in MSRP, we could have an immediate conversation. I know that most of us, if we needed help, we’d feel comfortable asking for help from someone from MSRP. With that shared experience, we have a sense of camaraderie, and community.” 

    In fact, a few years ago when a former MSRP-Bio student named Karla Montejo was applying to MIT, Baum provided essential advice and feedback about the application process, Montejo says. Now, as a graduate student, Montejo has become a mentor for the program in her own right, Sassanfar notes. For instance, Montejo serves on program alumni panels that advise new MSRP-Bio students. 

    Karla Alejandra Montejo

    Photo courtesy of Karla Alejandra Montejo.

    Montejo’s family immigrated to Miami from Cuba when she was a child. The magnet high school she attended was so new that students were encouraged to help establish the school’s programs. She forged a path into research. 

    “I didn’t even know what research was,” she says. “I wanted to be a doctor, and I thought maybe it would help me on my resume. I thought it would be kind of like shadowing, but no, it was really different. So I got really captured by research when I was in high school.” 

    Despite continuing to pursue research in college at Florida International University, Montejo didn’t get into graduate school on her first attempt because she hadn’t yet learned how to focus her application. But Sassanfar had visited FIU to recruit students and through that relationship Montejo had already gone through MIT’s related Quantitative Methods Workshop (QMW). So Montejo enrolled in MSRP-Bio, working in the CBMM-affiliated lab of Gabriel Kreiman at Boston Children’s Hospital. 

    “I feel like Mandana really helped me out, gave me a break, and the MSRP experience pretty much solidified that I really wanted to come to MIT,” Montejo says. 

    In the QMW, Montejo learned she really liked computational neuroscience, and in Kreiman’s lab she got to try her hand at computational modeling of the cognition involved in making perceptual sense of complex scenes. Montejo realized she wanted to work on more biologically based neuroscience problems. When the summer ended, because she was off the normal graduate school cycle for now, she found a two-year post-baccalaureate program at Mayo Clinic studying the role that a type of brain cell called astrocytes might have in deep brain stimulation, a treatment for Parkinson’s disease.

    When it came time to reapply to graduate schools (with the help of Baum and others in the BCS Application Assistance Program), Montejo applied to MIT and got in, joining the Brown lab. Now she’s working on modeling the role of metabolic processes in the changing of brain rhythms under anesthesia, taking advantage of how general anesthesia predictably changes brain states. The effects anesthetic drugs have on cell metabolism and the way that ultimately affects levels of consciousness reveal important aspects of how metabolism affects brain circuits and systems. Earlier this month, for instance, Montejo co-led a paper the lab published in The Proceedings of the National Academy of Sciences detailing the neuroscience of a patient’s transition into an especially deep state of unconsciousness called “burst suppression.”

    Josefina Correa Menéndez

    Photo: David Orenstein

    A signature of the Brown lab’s work is rigorous statistical analysis and methods, for instance to discern brain arousal states from EEG measures of brain rhythms. A PhD candidate in MIT’s Interdisciplinary Doctoral Program in Statistics, Correa Menéndez is advancing the use of Bayesian hierarchical models for neural data analysis. These statistical models offer a principled way of pooling information across datasets. One of her models can help scientists better understand the way neurons can “spike” with electrical activity when the brain is presented with a stimulus. The other’s power is in discerning critical features such as arousal states of the brain under general anesthesia from electrophysiological recordings. 
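
    The idea of pooling information across datasets can be illustrated with a toy example. The Python sketch below is not the Brown lab's model. It is a simplified normal-normal hierarchy in which the observation noise `sigma` and the between-dataset spread `tau` are assumed known, and the function name `partial_pooling` and all numbers are hypothetical. Each dataset's estimate is pulled toward a shared mean, and datasets with fewer observations are pulled more strongly.

```python
from statistics import fmean


def partial_pooling(datasets, sigma=1.0, tau=1.0):
    """Return (shared_mean, pooled_estimates) for a list of datasets.

    Illustrative assumptions: observations in dataset j are Normal(theta_j, sigma^2),
    the dataset means theta_j are Normal(mu, tau^2), and sigma and tau are known.
    mu is estimated from the data (an empirical-Bayes shortcut).
    """
    means = [fmean(d) for d in datasets]
    sizes = [len(d) for d in datasets]

    # Variance of each observed dataset mean around mu:
    # sampling noise (sigma^2 / n) plus between-dataset spread (tau^2).
    weights = [1.0 / (sigma**2 / n + tau**2) for n in sizes]
    mu = sum(w * m for w, m in zip(weights, means)) / sum(weights)

    pooled = []
    for m, n in zip(means, sizes):
        data_precision = n / sigma**2      # grows with the number of observations
        prior_precision = 1.0 / tau**2     # strength of the pull toward mu
        pooled.append(
            (data_precision * m + prior_precision * mu)
            / (data_precision + prior_precision)
        )
    return mu, pooled


# A large dataset centered at 2.0 and a single observation at 5.0.
mu, pooled = partial_pooling([[2.0] * 8, [5.0]])
print(mu, pooled)
```

    In this toy run the single-observation dataset moves much further toward the shared mean than the eight-observation dataset does, which is the "borrowing of strength" that hierarchical models provide.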

    Though she now works with complex equations and computations as a PhD candidate in neuroscience and statistics, Correa Menéndez was mostly interested in music and art as a high school student at Academia María Reina in San Juan and then architecture in college at the University of Puerto Rico at Río Piedras. It was discussions at the intersection of epistemology and art during an art theory class that inspired Correa Menéndez to switch her major to biology and to take computer science classes, too.

    When Sassanfar visited Puerto Rico in 2017, a computer science professor (Patricia Ordóñez) suggested that Correa Menéndez apply for a chance to attend the QMW. She did, and that led her to also participate in MSRP-Bio in the lab of Sherman Fairchild Professor Matt Wilson (a faculty member in BCS, CBMM, and the Picower Institute). She joined in the lab’s studies of how spatial memories are represented in the hippocampus and how the brain makes use of those memories to help understand the world around it. With mentoring from then-postdoc Carmen Varela (now a faculty member at Florida State University), the experience not only exposed her to neuroscience, but also helped her gain skills and experience with lab experiments, building research tools, and conducting statistical analyses. She ended up working in the Wilson lab as a research scholar for a year and began her graduate studies in September 2018.  

    Classes she took with Brown as a research scholar inspired her to join his lab as a graduate student. 

    “Taking the classes with Emery and also doing experiments made me aware of the role of statistics in the scientific process: from the interpretation of results to the analysis and the design of experiments,” she says. “More often than not, in science, statistics becomes this sort of afterthought — this ‘annoying’ thing that people need to do to get their paper published. But statistics as a field is actually a lot more than that. It’s a way of thinking about data. Particularly, Bayesian modeling provides a principled inference framework for combining prior knowledge into a hypothesis that you can test with data.” 

    To be sure, no one starts out with such inspiration about scientific scholarship, but MSRP-Bio helps students find that passion for research and the paths that opens up.

  • The curse of variety in transportation systems

    Cathy Wu has always delighted in systems that run smoothly. In high school, she designed a project to optimize the best route for getting to class on time. Her research interests and career track are evidence of a propensity for organizing and optimizing, coupled with a strong sense of responsibility to contribute to society instilled by her parents at a young age.

    As an undergraduate at MIT, Wu explored domains like agriculture, energy, and education, eventually homing in on transportation. “Transportation touches each of our lives,” she says. “Every day, we experience the inefficiencies and safety issues as well as the environmental harms associated with our transportation systems. I believe we can and should do better.”

    But doing so is complicated. Consider the long-standing issue of traffic systems control. Wu explains that it is not one problem, but more accurately a family of control problems impacted by variables like time of day, weather, and vehicle type — not to mention the types of sensing and communication technologies used to measure roadway information. Every differentiating factor introduces an exponentially larger set of control problems. There are thousands of control-problem variations and hundreds, if not thousands, of studies and papers dedicated to each problem. Wu refers to the sheer number of variations as the curse of variety — and it is hindering innovation.

    “To prove that a new control strategy can be safely deployed on our streets can take years. As time lags, we lose opportunities to improve safety and equity while mitigating environmental impacts. Accelerating this process has huge potential,” says Wu.  

    Which is why she and her group in the MIT Laboratory for Information and Decision Systems are devising machine learning-based methods to solve not just a single control problem or a single optimization problem, but families of control and optimization problems at scale. “In our case, we’re examining emerging transportation problems that people have spent decades trying to solve with classical approaches. It seems to me that we need a different approach.”

    Optimizing intersections

    Currently, Wu’s largest research endeavor is called Project Greenwave. There are many sectors that directly contribute to climate change, but transportation is responsible for the largest share of greenhouse gas emissions — 29 percent, of which 81 percent is due to land transportation. And while much of the conversation around mitigating environmental impacts related to mobility is focused on electric vehicles (EVs), electrification has its drawbacks. EV fleet turnover is time-consuming (“on the order of decades,” says Wu), and limited global access to the technology presents a significant barrier to widespread adoption.

    Wu’s research, on the other hand, addresses traffic control problems by leveraging deep reinforcement learning. Specifically, she is looking at traffic intersections — and for good reason. In the United States alone, there are more than 300,000 signalized intersections where vehicles must stop or slow down before re-accelerating. And every re-acceleration burns fossil fuels and contributes to greenhouse gas emissions.

    Highlighting the magnitude of the issue, Wu says, “We have done preliminary analysis indicating that up to 15 percent of land transportation CO2 is wasted through energy spent idling and re-accelerating at intersections.”
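
    To put that estimate in context, the shares cited in this section can be multiplied together. The short Python sketch below is only back-of-the-envelope arithmetic: it assumes the 15 percent figure for land transportation CO2 can be applied directly to land transportation's share of greenhouse gas emissions, which the article does not state, and the variable names are made up for illustration.

```python
# Shares quoted in the article.
transport_share = 0.29             # transportation's share of greenhouse gas emissions
land_share_of_transport = 0.81     # portion of that share due to land transportation
intersection_share_of_land = 0.15  # upper bound attributed to intersections

# Land transportation as a fraction of all emissions.
land_share = transport_share * land_share_of_transport

# Assumption: the CO2 fraction carries over to the greenhouse gas share.
intersection_share = land_share * intersection_share_of_land

print(f"Land transportation: {land_share:.1%}")            # Land transportation: 23.5%
print(f"Intersections, at most: {intersection_share:.1%}")  # Intersections, at most: 3.5%
```

    Under that assumption, idling and re-accelerating at intersections would account for up to roughly 3.5 percent of total emissions.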

    To date, she and her group have modeled 30,000 different intersections across 10 major metropolitan areas in the United States. That is 30,000 different configurations, roadway topologies (e.g., grade of road or elevation), different weather conditions, and variations in travel demand and fuel mix. Each intersection and its corresponding scenarios represent a unique multi-agent control problem.

    Wu and her team are devising techniques that can solve not just one, but a whole family of problems comprising tens of thousands of scenarios. Put simply, the idea is to coordinate the timing of vehicles so they arrive at intersections when traffic lights are green, thereby eliminating the start, stop, re-accelerate conundrum. Along the way, they are building an ecosystem of tools, datasets, and methods to enable roadway interventions and impact assessments of strategies to significantly reduce carbon-intense urban driving.
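
    The timing idea can be made concrete with a single vehicle and a single fixed-cycle signal. The Python sketch below is a hand-written rule, not the deep reinforcement learning approach Wu's group uses. The function name `green_arrival_speed`, the signal model (green for the first `green_s` seconds of every `cycle_s`-second cycle), and the numbers in the usage lines are all illustrative assumptions.

```python
import math


def green_arrival_speed(distance_m, now_s, cycle_s, green_s, v_min, v_max):
    """Pick a constant speed (m/s) so the vehicle reaches the light while it is green.

    Assumed signal model: the light is green during the first `green_s` seconds
    of each `cycle_s`-second cycle, with cycles starting at time 0.
    Returns None if no speed in [v_min, v_max] reaches the next green window,
    meaning the vehicle would have to stop.
    """
    earliest = now_s + distance_m / v_max
    if earliest % cycle_s < green_s:
        # Driving at the maximum speed already arrives on green.
        return v_max

    # Otherwise the light is red at the earliest arrival; aim for the next green start.
    next_green = (math.floor(earliest / cycle_s) + 1) * cycle_s
    speed = distance_m / (next_green - now_s)
    return speed if speed >= v_min else None


# 60-second cycle with 25 seconds of green; speeds limited to 5-15 m/s.
print(green_arrival_speed(300, 0, 60, 25, 5, 15))  # 15
print(green_arrival_speed(600, 0, 60, 25, 5, 15))  # 10.0
```

    In the second call the vehicle slows from 15 to 10 m/s and rolls through as the light turns green instead of stopping and re-accelerating. Coordinating many vehicles across many intersections is what turns this into the multi-agent control problem described above.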

    Their collaborator on the project is the Utah Department of Transportation, which Wu says has played an essential role, in part by sharing data and practical knowledge that she and her group otherwise would not have been able to access publicly.

    “I appreciate industry and public sector collaborations,” says Wu. “When it comes to important societal problems, one really needs grounding with practitioners. One needs to be able to hear the perspectives in the field. My interactions with practitioners expand my horizons and help ground my research. You never know when you’ll hear the perspective that is the key to the solution, or perhaps the key to understanding the problem.”

    Finding the best routes

    In a similar vein, she and her research group are tackling large coordination problems. For example, vehicle routing. “Every day, delivery trucks route more than a hundred thousand packages for the city of Boston alone,” says Wu. Accomplishing the task requires, among other things, figuring out which trucks to use, which packages to deliver, and the order in which to deliver them as efficiently as possible. If and when the trucks are electrified, they will need to be charged, adding another wrinkle to the process and further complicating route optimization.

    The vehicle routing problem, and therefore the scope of Wu’s work, extends beyond truck routing for package delivery. Ride-hailing cars may need to pick up objects as well as drop them off; and what if delivery is done by bicycle or drone? In partnership with Amazon, for example, Wu and her team addressed routing and path planning for hundreds of robots (up to 800) in their warehouses.

    Every variation requires custom heuristics that are expensive and time-consuming to develop. Again, this is really a family of problems — each one complicated, time-consuming, and currently unsolved by classical techniques — and they are all variations of a central routing problem. The curse of variety meets operations and logistics.
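
    For readers unfamiliar with routing heuristics, the Python sketch below shows one of the simplest classical examples: a nearest-neighbor rule for a single vehicle that starts and ends at a depot. It is offered only as an illustration of what a hand-crafted heuristic looks like, not as a method used by Wu's group, and the coordinates and function names are hypothetical.

```python
import math


def nearest_neighbor_route(depot, stops):
    """Build a route by always driving to the closest unvisited stop."""
    route = [depot]
    remaining = list(stops)
    current = depot
    while remaining:
        nearest = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nearest)
        route.append(nearest)
        current = nearest
    route.append(depot)  # return to the depot at the end
    return route


def route_length(route):
    """Total straight-line distance along consecutive points of the route."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))


route = nearest_neighbor_route((0, 0), [(5, 0), (1, 0), (2, 0)])
print(route)                # [(0, 0), (1, 0), (2, 0), (5, 0), (0, 0)]
print(route_length(route))  # 10.0
```

    A rule like this is fast but can be far from optimal, and it ignores truck capacities, time windows, charging, and pickups. Each of those variations typically calls for its own tailored heuristic, which is the cost the curse of variety imposes.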

    By combining classical approaches with modern deep-learning methods, Wu is looking for a way to automatically identify heuristics that can effectively solve all of these vehicle routing problems. So far, her approach has proved successful.

    “We’ve contributed hybrid learning approaches that take existing solution methods for small problems and incorporate them into our learning framework to scale and accelerate that existing solver for large problems. And we’re able to do this in a way that can automatically identify heuristics for specialized variations of the vehicle routing problem.” The next step, says Wu, is applying a similar approach to multi-agent robotics problems in automated warehouses.

    Wu and her group are making big strides, in part due to their dedication to use-inspired basic research. Rather than applying known methods or science to a problem, they develop new methods, new science, to address problems. The methods she and her team employ are necessitated by societal problems with practical implications. The inspiration for the approach? None other than Louis Pasteur, whose research style is described in Donald Stokes’ now-famous book “Pasteur’s Quadrant.” Anthrax was decimating the sheep population, and Pasteur wanted to better understand why and what could be done about it. The tools of the time could not solve the problem, so he invented a new field, microbiology, not out of curiosity but out of necessity.

  • Making sense of cell fate

    Despite the proliferation of novel therapies such as immunotherapy or targeted therapies, radiation and chemotherapy remain the frontline treatment for cancer patients. About half of all patients still receive radiation and 60-80 percent receive chemotherapy.

    Both radiation and chemotherapy work by damaging DNA, taking advantage of a vulnerability specific to cancer cells. Healthy cells are more likely to survive radiation and chemotherapy since their mechanisms for identifying and repairing DNA damage are intact. In cancer cells, these repair mechanisms are compromised by mutations. When cancer cells cannot adequately respond to the DNA damage caused by radiation and chemotherapy, ideally, they undergo apoptosis or die by other means.

    However, there is another fate for cells after DNA damage: senescence, a state in which cells survive but stop dividing. Senescent cells’ DNA has not been damaged enough to induce apoptosis but is too damaged to support cell division. While senescent cancer cells themselves are unable to proliferate and spread, they are bad actors in the fight against cancer because they seem to enable other cancer cells to develop more aggressively. Although a cancer cell’s fate is not apparent until a few days after treatment, the decision to survive, die, or enter senescence is made much earlier. But precisely when and how that decision is made has not been well understood.

    In an open-access study of ovarian and osteosarcoma cancer cells appearing July 19 in Cell Systems, MIT researchers show that cell signaling proteins commonly associated with cell proliferation and apoptosis instead commit cancer cells to senescence within 12 hours of treatment with low doses of certain kinds of chemotherapy.

    “When it comes to treating cancer, this study underscores that it’s important not to think too linearly about cell signaling,” says Michael Yaffe, who is a David H. Koch Professor of Science at MIT, the director of the MIT Center for Precision Cancer Medicine, a member of MIT’s Koch Institute for Integrative Cancer Research, and the senior author of the study. “If you assume that a particular treatment will always affect cancer cell signaling in the same way — you may be setting yourself up for many surprises, and treating cancers with the wrong combination of drugs.”

    Using a combination of experiments with cancer cells and computational modeling, the team investigated the cell signaling mechanisms that prompt cancer cells to enter senescence after treatment with a commonly used anti-cancer agent. Their efforts singled out two protein kinases and a component of the AP-1 transcription factor complex as highly associated with the induction of senescence after DNA damage, despite the well-established roles for all of these molecules in promoting cell proliferation in cancer.

    The researchers treated cancer cells with low and high doses of doxorubicin, a chemotherapy that interferes with the function of topoisomerase II, an enzyme that breaks and then repairs DNA strands during replication to fix tangles and other topological problems.

    By measuring the effects of DNA damage on single cells at several time points ranging from six hours to four days after the initial exposure, the team created two datasets. In one dataset, the researchers tracked cell fate over time. For the second set, researchers measured relative cell signaling activity levels across a variety of proteins associated with responses to DNA damage or cellular stress, determination of cell fate, and progress through cell growth and division.

    The two datasets were used to build a computational model that identifies correlations between time, dosage, signal, and cell fate. The model identified the activities of the MAP kinases Erk and JNK, and of the transcription factor c-Jun, a key component of the AP-1 protein, as involved in the induction of senescence. The researchers then validated these computational findings by showing that inhibition of JNK and Erk after DNA damage successfully prevented cells from entering senescence.

    The researchers leveraged JNK and Erk inhibition to pinpoint exactly when cells made the decision to enter senescence. Surprisingly, they found that the decision to enter senescence was made within 12 hours of DNA damage, even though it took days to actually see the senescent cells accumulate. The team also found that with the passage of more time, these MAP kinases took on a different function: promoting the secretion of proinflammatory proteins called cytokines that are responsible for making other cancer cells proliferate and develop resistance to chemotherapy.

    “Proteins like cytokines encourage ‘bad behavior’ in neighboring tumor cells that leads to more aggressive cancer progression,” says Tatiana Netterfield, a graduate student in the Yaffe lab and the lead author of the study. “Because of this, it is thought that senescent cells that stay near the tumor for long periods of time are detrimental to treating cancer.”

    This study’s findings apply to cancer cells treated with a commonly used type of chemotherapy that stalls DNA replication after repair. But more broadly, the study emphasizes that “when treating cancer, it’s extremely important to understand the molecular characteristics of cancer cells and the contextual factors such as time and dosing that determine cell fate,” explains Netterfield.

    The study, however, has more immediate implications for treatments that are already in use. One class of Erk inhibitors, MEK inhibitors, is used in the clinic with the expectation that they will curb cancer growth.

    “We must be cautious about administering MEK inhibitors together with chemotherapies,” says Yaffe. “The combination may have the unintended effect of driving cells into proliferation, rather than senescence.”

    In future work, the team will perform studies to understand how and why individual cells choose to proliferate instead of entering senescence. Additionally, the team is employing next-generation sequencing to understand which genes c-Jun is regulating in order to push cells toward senescence.

    This study was funded, in part, by the Charles and Marjorie Holloway Foundation and the MIT Center for Precision Cancer Medicine.

  • A simpler method for learning to control a robot

    Researchers from MIT and Stanford University have devised a new machine-learning approach that could be used to control a robot, such as a drone or autonomous vehicle, more effectively and efficiently in dynamic environments where conditions can change rapidly.

    This technique could help an autonomous vehicle learn to compensate for slippery road conditions to avoid going into a skid, allow a robotic free-flyer to tow different objects in space, or enable a drone to closely follow a downhill skier despite being buffeted by strong winds.

    The researchers’ approach incorporates certain structure from control theory into the process of learning a model, in a way that leads to an effective method of controlling complex dynamics, such as those caused by the effect of wind on the trajectory of a flying vehicle. One way to think about this structure is as a hint that can help guide how to control a system.

    “The focus of our work is to learn intrinsic structure in the dynamics of the system that can be leveraged to design more effective, stabilizing controllers,” says Navid Azizan, the Esther and Harold E. Edgerton Assistant Professor in the MIT Department of Mechanical Engineering and the Institute for Data, Systems, and Society (IDSS), and a member of the Laboratory for Information and Decision Systems (LIDS). “By jointly learning the system’s dynamics and these unique control-oriented structures from data, we’re able to naturally create controllers that function much more effectively in the real world.”

    Using this structure in a learned model, the researchers’ technique immediately extracts an effective controller from the model, as opposed to other machine-learning methods that require a controller to be derived or learned separately with additional steps. With this structure, their approach is also able to learn an effective controller using fewer data than other approaches. This could help their learning-based control system achieve better performance faster in rapidly changing environments.

    “This work tries to strike a balance between identifying structure in your system and just learning a model from data,” says lead author Spencer M. Richards, a graduate student at Stanford University. “Our approach is inspired by how roboticists use physics to derive simpler models for robots. Physical analysis of these models often yields a useful structure for the purposes of control — one that you might miss if you just tried to naively fit a model to data. Instead, we try to identify similarly useful structure from data that indicates how to implement your control logic.”

    Additional authors of the paper are Jean-Jacques Slotine, professor of mechanical engineering and of brain and cognitive sciences at MIT, and Marco Pavone, associate professor of aeronautics and astronautics at Stanford. The research will be presented at the International Conference on Machine Learning (ICML).

    Learning a controller

    Determining the best way to control a robot to accomplish a given task can be a difficult problem, even when researchers know how to model everything about the system.

    A controller is the logic that enables a drone to follow a desired trajectory, for example. This controller would tell the drone how to adjust its rotor forces to compensate for the effect of winds that can knock it off a stable path to reach its goal.

    This drone is a dynamical system — a physical system that evolves over time. In this case, its position and velocity change as it flies through the environment. If such a system is simple enough, engineers can derive a controller by hand. 

    Modeling a system by hand intrinsically captures a certain structure based on the physics of the system. For instance, if a robot were modeled manually using differential equations, these would capture the relationship between velocity, acceleration, and force. Acceleration is the rate of change in velocity over time, which is determined by the mass of and forces applied to the robot.
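
    The relationship described above can be written as a few lines of simulation. The sketch below is illustrative only: it assumes a one-dimensional point mass with a constant applied force, and the function name, mass, force, and time step are all made-up values rather than anything from the research.

```python
def simulate_point_mass(mass, force, dt, steps):
    """Integrate a = F / m for a 1-D point mass that starts at rest."""
    position, velocity = 0.0, 0.0
    for _ in range(steps):
        acceleration = force / mass      # Newton's second law
        velocity += acceleration * dt    # velocity accumulates acceleration
        position += velocity * dt        # position accumulates velocity
    return position, velocity


# Assumed example values: a 2 kg mass pushed with 4 N for one second.
final_position, final_velocity = simulate_point_mass(
    mass=2.0, force=4.0, dt=0.001, steps=1000
)
```

    Because the physics is written in by hand, the model already says how force changes velocity, which is the kind of structure a controller can exploit.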

    But often the system is too complex to be exactly modeled by hand. Aerodynamic effects, like the way swirling wind pushes a flying vehicle, are notoriously difficult to derive manually, Richards explains. Researchers would instead take measurements of the drone’s position, velocity, and rotor speeds over time, and use machine learning to fit a model of this dynamical system to the data. But these approaches typically don’t learn a control-based structure. This structure is useful in determining how to best set the rotor speeds to direct the motion of the drone over time.
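
    As a rough picture of what fitting a model to logged measurements means, the sketch below fits a linear model to 100 recorded transitions by least squares. It is a deliberately simplified stand-in: real aerodynamic effects are nonlinear, and the matrices `A_true` and `B_true` are hypothetical values used only to generate example data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" system, used only to produce example measurements.
A_true = np.array([[1.0, 0.1],
                   [0.0, 0.9]])
B_true = np.array([[0.0],
                   [0.1]])

# Log 100 transitions: state, control input, and the state that followed.
states = rng.normal(size=(100, 2))
inputs = rng.normal(size=(100, 1))
next_states = states @ A_true.T + inputs @ B_true.T

# Stack [state, input] as regressors and solve for [A B] by least squares.
regressors = np.hstack([states, inputs])
theta, *_ = np.linalg.lstsq(regressors, next_states, rcond=None)
A_fit, B_fit = theta[:2].T, theta[2:].T
```

    A fit like this predicts the next state well, but nothing in it indicates how to choose the inputs, which is the gap the control-oriented structure is meant to fill.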

    Once they have modeled the dynamical system, many existing approaches also use data to learn a separate controller for the system.

    “Other approaches that try to learn dynamics and a controller from data as separate entities are a bit detached philosophically from the way we normally do it for simpler systems. Our approach is more reminiscent of deriving models by hand from physics and linking that to control,” Richards says.

    Identifying structure

    The team from MIT and Stanford developed a technique that uses machine learning to learn the dynamics model, but in such a way that the model has some prescribed structure that is useful for controlling the system.

    With this structure, they can extract a controller directly from the dynamics model, rather than using data to learn an entirely separate model for the controller.

    “We found that beyond learning the dynamics, it’s also essential to learn the control-oriented structure that supports effective controller design. Our approach of learning state-dependent coefficient factorizations of the dynamics has outperformed the baselines in terms of data efficiency and tracking capability, proving to be successful in efficiently and effectively controlling the system’s trajectory,” Azizan says. 
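
    A state-dependent coefficient factorization rewrites nonlinear dynamics f(x) as A(x) x, so the model looks linear at every state. The sketch below shows the idea under loud assumptions: the factorization is written by hand for a hypothetical inverted pendulum instead of being learned from data, and the controller is read off with a state-dependent Riccati equation, a common companion to such factorizations that is assumed here for illustration and not confirmed as the paper's exact procedure. The cost weights `Q` and `R` are also made up.

```python
import numpy as np


def sdc_matrix(x):
    """State-dependent coefficient A(x) with f(x) = A(x) @ x for the pendulum."""
    theta = x[0]
    # np.sinc(t) is sin(pi t) / (pi t), so this entry is sin(theta) / theta
    # and stays well defined (equal to 1) at theta = 0.
    return np.array([[0.0, 1.0],
                     [np.sinc(theta / np.pi), 0.0]])


B = np.array([[0.0], [1.0]])   # torque acts on the angular acceleration
Q = np.eye(2)                  # assumed state cost
R = np.array([[1.0]])          # assumed input cost


def riccati_gain(A):
    """LQR gain for (A, B, Q, R) from the stable subspace of the Hamiltonian."""
    n = A.shape[0]
    R_inv = np.linalg.inv(R)
    hamiltonian = np.block([[A, -B @ R_inv @ B.T],
                            [-Q, -A.T]])
    eigvals, eigvecs = np.linalg.eig(hamiltonian)
    stable = eigvecs[:, eigvals.real < 0]
    P = np.real(stable[n:] @ np.linalg.inv(stable[:n]))
    return R_inv @ B.T @ P


def controller(x):
    """Read the feedback law directly off the factorized model: u = -K(x) x."""
    return -riccati_gain(sdc_matrix(x)) @ x


def simulate(x0, dt=0.01, steps=1000):
    """Euler-integrate the true pendulum dynamics under the extracted controller."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        u = controller(x)
        x_dot = np.array([x[1], np.sin(x[0]) + u[0]])
        x = x + dt * x_dot
    return x


final_state = simulate([2.0, 0.0])
```

    No separate controller is trained here: once A(x) is available, the feedback gain comes from the model itself, which is the property the researchers describe.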

    When they tested this approach, their controller closely followed desired trajectories, outpacing all the baseline methods. The controller extracted from their learned model nearly matched the performance of a ground-truth controller, which is built using the exact dynamics of the system.

    “By making simpler assumptions, we got something that actually worked better than other complicated baseline approaches,” Richards adds.

    The researchers also found that their method was data-efficient, which means it achieved high performance even with few data. For instance, it could effectively model a highly dynamic rotor-driven vehicle using only 100 data points. Methods that used multiple learned components saw their performance drop much faster with smaller datasets.

    This efficiency could make their technique especially useful in situations where a drone or robot needs to learn quickly in rapidly changing conditions.

    Plus, their approach is general and could be applied to many types of dynamical systems, from robotic arms to free-flying spacecraft operating in low-gravity environments.

    In the future, the researchers are interested in developing models that are more physically interpretable, and that would be able to identify very specific information about a dynamical system, Richards says. This could lead to better-performing controllers.

    “Despite its ubiquity and importance, nonlinear feedback control remains an art, making it especially suitable for data-driven and learning-based methods. This paper makes a significant contribution to this area by proposing a method that jointly learns system dynamics, a controller, and control-oriented structure,” says Nikolai Matni, an assistant professor in the Department of Electrical and Systems Engineering at the University of Pennsylvania, who was not involved with this work. “What I found particularly exciting and compelling was the integration of these components into a joint learning algorithm, such that control-oriented structure acts as an inductive bias in the learning process. The result is a data-efficient learning process that outputs dynamic models that enjoy intrinsic structure that enables effective, stable, and robust control. While the technical contributions of the paper are excellent themselves, it is this conceptual contribution that I view as most exciting and significant.”

    This research is supported, in part, by the NASA University Leadership Initiative and the Natural Sciences and Engineering Research Council of Canada.

  • in

    A faster way to teach a robot

    Imagine purchasing a robot to perform household tasks. This robot was built and trained in a factory on a certain set of tasks and has never seen the items in your home. When you ask it to pick up a mug from your kitchen table, it might not recognize your mug (perhaps because this mug is painted with an unusual image, say, of MIT’s mascot, Tim the Beaver). So, the robot fails.

    “Right now, the way we train these robots, when they fail, we don’t really know why. So you would just throw up your hands and say, ‘OK, I guess we have to start over.’ A critical component that is missing from this system is enabling the robot to demonstrate why it is failing so the user can give it feedback,” says Andi Peng, an electrical engineering and computer science (EECS) graduate student at MIT.

    Peng and her collaborators at MIT, New York University, and the University of California at Berkeley created a framework that enables humans to quickly teach a robot what they want it to do, with a minimal amount of effort.

    When a robot fails, the system uses an algorithm to generate counterfactual explanations that describe what needed to change for the robot to succeed. For instance, maybe the robot would have been able to pick up the mug if the mug were a certain color. It shows these counterfactuals to the human and asks for feedback on why the robot failed. Then the system utilizes this feedback and the counterfactual explanations to generate new data it uses to fine-tune the robot.

    Fine-tuning involves tweaking a machine-learning model that has already been trained to perform one task, so it can perform a second, similar task.

    The researchers tested this technique in simulations and found that it could teach a robot more efficiently than other methods. The robots trained with this framework performed better, while the training process consumed less of a human’s time.

    This framework could help robots learn faster in new environments without requiring a user to have technical knowledge. In the long run, this could be a step toward enabling general-purpose robots to efficiently perform daily tasks for the elderly or individuals with disabilities in a variety of settings.

    Peng, the lead author, is joined by co-authors Aviv Netanyahu, an EECS graduate student; Mark Ho, an assistant professor at the Stevens Institute of Technology; Tianmin Shu, an MIT postdoc; Andreea Bobu, a graduate student at UC Berkeley; and senior authors Julie Shah, an MIT professor of aeronautics and astronautics and the director of the Interactive Robotics Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL), and Pulkit Agrawal, a professor in CSAIL. The research will be presented at the International Conference on Machine Learning.

    On-the-job training

    Robots often fail due to distribution shift — the robot is presented with objects and spaces it did not see during training, and it doesn’t understand what to do in this new environment.

    One way to retrain a robot for a specific task is imitation learning. The user could demonstrate the correct task to teach the robot what to do. If a user tries to teach a robot to pick up a mug, but demonstrates with a white mug, the robot could learn that all mugs are white. It may then fail to pick up a red, blue, or “Tim-the-Beaver-brown” mug.

    Training a robot to recognize that a mug is a mug, regardless of its color, could take thousands of demonstrations.

    “I don’t want to have to demonstrate with 30,000 mugs. I want to demonstrate with just one mug. But then I need to teach the robot so it recognizes that it can pick up a mug of any color,” Peng says.

    To accomplish this, the researchers’ system determines what specific object the user cares about (a mug) and what elements aren’t important for the task (perhaps the color of the mug doesn’t matter). It uses this information to generate new, synthetic data by changing these “unimportant” visual concepts. This process is known as data augmentation.

    The framework has three steps. First, it shows the task that caused the robot to fail. Then it collects a demonstration from the user of the desired actions and generates counterfactuals by searching over all features in the space that show what needed to change for the robot to succeed.

    The system shows these counterfactuals to the user and asks for feedback to determine which visual concepts do not impact the desired action. Then it uses this human feedback to generate many new augmented demonstrations.

    In this way, the user could demonstrate picking up one mug, but the system would produce demonstrations showing the desired action with thousands of different mugs by altering the color. It uses these data to fine-tune the robot.
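
    The loop described above (find counterfactuals, ask the user which features do not matter, then multiply one demonstration across those features) can be sketched with a toy example. Everything below is hypothetical: the feature names, the stand-in policy that only succeeds on white mugs, and the hard-coded user answer are assumptions for illustration, not the researchers' implementation.

```python
import itertools

# Hypothetical feature space for a toy scene.
FEATURE_VALUES = {
    "object": ["mug", "plate"],
    "color": ["white", "red", "blue", "brown"],
    "position": ["table", "counter"],
}


def toy_policy_succeeds(scene):
    """Stand-in for a robot that overfit to white mugs during training."""
    return scene["object"] == "mug" and scene["color"] == "white"


def counterfactuals(failed_scene, policy):
    """Single-feature changes that would have made the policy succeed."""
    found = []
    for feature, values in FEATURE_VALUES.items():
        for value in values:
            if value == failed_scene[feature]:
                continue
            candidate = {**failed_scene, feature: value}
            if policy(candidate):
                found.append((feature, value))
    return found


def augment(demo_scene, demo_action, irrelevant_features):
    """Copy one demonstration across every value of the user-flagged features."""
    names = sorted(irrelevant_features)
    combos = itertools.product(*(FEATURE_VALUES[name] for name in names))
    return [({**demo_scene, **dict(zip(names, combo))}, demo_action)
            for combo in combos]


failed_scene = {"object": "mug", "color": "brown", "position": "table"}
explanations = counterfactuals(failed_scene, toy_policy_succeeds)

# In the described workflow a person reviews the explanations and says which
# features should not matter; here that answer is hard-coded.
irrelevant = {"color"}
augmented = augment(failed_scene, "grasp_handle", irrelevant)
```

    Here the only counterfactual is a change of color, and flagging color as irrelevant turns one demonstration into one per available color, all paired with the same action.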

    Creating counterfactual explanations and soliciting feedback from the user are critical for the technique to succeed, Peng says.

    From human reasoning to robot reasoning

    Because their work seeks to put the human in the training loop, the researchers tested their technique with human users. They first conducted a study in which they asked people if counterfactual explanations helped them identify elements that could be changed without affecting the task.

    “It was so clear right off the bat. Humans are so good at this type of counterfactual reasoning. And this counterfactual step is what allows human reasoning to be translated into robot reasoning in a way that makes sense,” she says.

    Then they applied their framework to three simulations where robots were tasked with: navigating to a goal object, picking up a key and unlocking a door, and picking up a desired object then placing it on a tabletop. In each instance, their method enabled the robot to learn faster than with other techniques, while requiring fewer demonstrations from users.

    Moving forward, the researchers hope to test this framework on real robots. They also want to focus on reducing the time it takes the system to create new data using generative machine-learning models.

    “We want robots to do what humans do, and we want them to do it in a semantically meaningful way. Humans tend to operate in this abstract space, where they don’t think about every single property in an image. At the end of the day, this is really about enabling a robot to learn a good, human-like representation at an abstract level,” Peng says.

    This research is supported, in part, by a National Science Foundation Graduate Research Fellowship, Open Philanthropy, an Apple AI/ML Fellowship, Hyundai Motor Corporation, the MIT-IBM Watson AI Lab, and the National Science Foundation Institute for Artificial Intelligence and Fundamental Interactions.

  • in

    A new way to look at data privacy

    Imagine that a team of scientists has developed a machine-learning model that can predict whether a patient has cancer from lung scan images. They want to share this model with hospitals around the world so clinicians can start using it in diagnosis.

    But there’s a problem. To teach their model how to predict cancer, they showed it millions of real lung scan images, a process called training. Those sensitive data, which are now encoded into the inner workings of the model, could potentially be extracted by a malicious agent. The scientists can prevent this by adding noise, or more generic randomness, to the model that makes it harder for an adversary to guess the original data. However, perturbation reduces a model’s accuracy, so the less noise one can add, the better.

    MIT researchers have developed a technique that enables the user to potentially add the smallest amount of noise possible, while still ensuring the sensitive data are protected.

    The researchers created a new privacy metric, which they call Probably Approximately Correct (PAC) Privacy, and built a framework based on this metric that can automatically determine the minimal amount of noise that needs to be added. Moreover, this framework does not need knowledge of the inner workings of a model or its training process, which makes it easier to use for different types of models and applications.

    In several cases, the researchers show that the amount of noise required to protect sensitive data from adversaries is far less with PAC Privacy than with other approaches. This could help engineers create machine-learning models that provably hide training data, while maintaining accuracy in real-world settings.

    “PAC Privacy exploits the uncertainty or entropy of the sensitive data in a meaningful way, and this allows us to add, in many cases, an order of magnitude less noise. This framework allows us to understand the characteristics of arbitrary data processing and privatize it automatically without artificial modifications. While we are in the early days and we are doing simple examples, we are excited about the promise of this technique,” says Srini Devadas, the Edwin Sibley Webster Professor of Electrical Engineering and co-author of a new paper on PAC Privacy.

    Devadas wrote the paper with lead author Hanshen Xiao, an electrical engineering and computer science graduate student. The research will be presented at the International Cryptography Conference (Crypto 2023).

    Defining privacy

    A fundamental question in data privacy is: How much sensitive data could an adversary recover from a machine-learning model with noise added to it?

    Differential Privacy, one popular privacy definition, says privacy is achieved if an adversary who observes the released model cannot infer whether an arbitrary individual’s data was used in the training process. But provably preventing an adversary from distinguishing data usage often requires large amounts of noise to obscure it. This noise reduces the model’s accuracy.

    PAC Privacy looks at the problem a bit differently. It characterizes how hard it would be for an adversary to reconstruct any part of randomly sampled or generated sensitive data after noise has been added, rather than only focusing on the distinguishability problem.

    For instance, if the sensitive data are images of human faces, differential privacy would focus on whether the adversary can tell if someone’s face was in the dataset. PAC Privacy, on the other hand, could look at whether an adversary could extract a silhouette — an approximation — that someone could recognize as a particular individual’s face.

    Once they established the definition of PAC Privacy, the researchers created an algorithm that automatically tells the user how much noise to add to a model to prevent an adversary from confidently reconstructing a close approximation of the sensitive data. This algorithm guarantees privacy even if the adversary has infinite computing power, Xiao says.

    To find the optimal amount of noise, the PAC Privacy algorithm relies on the uncertainty, or entropy, in the original data from the viewpoint of the adversary.

    This automatic technique takes samples randomly from a data distribution or a large data pool and runs the user’s machine-learning training algorithm on that subsampled data to produce an output learned model. It does this many times on different subsamplings and compares the variance across all outputs. This variance determines how much noise one must add — a smaller variance means less noise is needed.
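
    The subsample-and-compare procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the PAC Privacy algorithm itself: the "training algorithm" is just a mean, and `scale_factor` is a hypothetical placeholder for the calibration that would turn a desired confidence level into a noise level, which the article does not spell out.

```python
import numpy as np


def estimate_spread(data, train, subsample_size, trials, rng):
    """Run `train` on random subsamples and measure how much its output varies."""
    outputs = []
    for _ in range(trials):
        idx = rng.choice(len(data), size=subsample_size, replace=False)
        outputs.append(train(data[idx]))
    return np.std(np.stack(outputs), axis=0)


def release(output, spread, scale_factor, rng):
    """Add Gaussian noise whose per-coordinate size tracks the measured spread."""
    return output + rng.normal(scale=scale_factor * spread)


def train(batch):
    """Stand-in training algorithm: the 'model' is the per-column mean."""
    return batch.mean(axis=0)


rng = np.random.default_rng(0)
data = rng.normal(loc=[1.0, -2.0], scale=1.0, size=(5000, 2))

# A less stable setup (small subsamples) versus a more stable one (large subsamples).
spread_small = estimate_spread(data, train, subsample_size=50, trials=200, rng=rng)
spread_large = estimate_spread(data, train, subsample_size=2000, trials=200, rng=rng)

# scale_factor=1.0 is an arbitrary assumed value, not a privacy guarantee.
noisy_model = release(train(data), spread_large, scale_factor=1.0, rng=rng)
```

    The procedure only ever calls `train` as a black box, and the more stable configuration produces a smaller spread and therefore calls for less added noise.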

    Algorithm advantages

    Unlike other privacy approaches, the PAC Privacy algorithm does not need knowledge of the inner workings of a model or of its training process.

    When implementing PAC Privacy, a user can specify their desired level of confidence at the outset. For instance, perhaps the user wants a guarantee that an adversary will not be more than 1 percent confident that they have successfully reconstructed the sensitive data to within 5 percent of its actual value. The PAC Privacy algorithm automatically tells the user the optimal amount of noise that needs to be added to the output model before it is shared publicly, in order to achieve those goals.

    “The noise is optimal, in the sense that if you add less than we tell you, all bets could be off. But the effect of adding noise to neural network parameters is complicated, and we are making no promises on the utility drop the model may experience with the added noise,” Xiao says.

    This points to one limitation of PAC Privacy — the technique does not tell the user how much accuracy the model will lose once the noise is added. PAC Privacy also involves repeatedly training a machine-learning model on many subsamplings of data, so it can be computationally expensive.  

    To improve PAC Privacy, one approach is to modify a user’s machine-learning training process so it is more stable, meaning that the output model it produces does not change very much when the input data is subsampled from a data pool. This stability would create smaller variances between subsample outputs, so not only would the PAC Privacy algorithm need to be run fewer times to identify the optimal amount of noise, but it would also need to add less noise.

    An added benefit of stabler models is that they often have less generalization error, which means they can make more accurate predictions on previously unseen data, a win-win situation between machine learning and privacy, Devadas adds.

    “In the next few years, we would love to look a little deeper into this relationship between stability and privacy, and the relationship between privacy and generalization error. We are knocking on a door here, but it is not clear yet where the door leads,” he says.

    “Obfuscating the usage of an individual’s data in a model is paramount to protecting their privacy. However, to do so can come at the cost of the data’s and therefore model’s utility,” says Jeremy Goodsitt, senior machine learning engineer at Capital One, who was not involved with this research. “PAC provides an empirical, black-box solution, which can reduce the added noise compared to current practices while maintaining equivalent privacy guarantees. In addition, its empirical approach broadens its reach to more data-consuming applications.”

    This research is funded, in part, by DSTA Singapore, Cisco Systems, Capital One, and a MathWorks Fellowship.

  • in

    3 Questions: Honing robot perception and mapping

    Walking to a friend’s house or browsing the aisles of a grocery store might feel like simple tasks, but they in fact require sophisticated capabilities. That’s because humans are able to effortlessly understand their surroundings and detect complex information about patterns, objects, and their own location in the environment.

    What if robots could perceive their environment in a similar way? That question is on the minds of MIT Laboratory for Information and Decision Systems (LIDS) researchers Luca Carlone and Jonathan How. In 2020, a team led by Carlone released the first iteration of Kimera, an open-source library that enables a single robot to construct a three-dimensional map of its environment in real time, while labeling different objects in view. Last year, Carlone’s and How’s research groups (SPARK Lab and Aerospace Controls Lab) introduced Kimera-Multi, an updated system in which multiple robots communicate among themselves in order to create a unified map. A 2022 paper associated with the project recently received this year’s IEEE Transactions on Robotics King-Sun Fu Memorial Best Paper Award, given to the best paper published in the journal in 2022.

    Carlone, who is the Leonardo Career Development Associate Professor of Aeronautics and Astronautics, and How, the Richard Cockburn Maclaurin Professor in Aeronautics and Astronautics, spoke to LIDS about Kimera-Multi and the future of how robots might perceive and interact with their environment.

    Q: Currently your labs are focused on increasing the number of robots that can work together in order to generate 3D maps of the environment. What are some potential advantages to scaling this system?

    How: The key benefit hinges on consistency, in the sense that a robot can create an independent map, and that map is self-consistent but not globally consistent. We’re aiming for the team to have a consistent map of the world; that’s the key difference in trying to form a consensus between robots as opposed to mapping independently.

    Carlone: In many scenarios it’s also good to have a bit of redundancy. For example, if we deploy a single robot in a search-and-rescue mission, and something happens to that robot, it would fail to find the survivors. If multiple robots are doing the exploring, there’s a much better chance of success. Scaling up the team of robots also means that any given task may be completed in a shorter amount of time.

    Q: What are some of the lessons you’ve learned from recent experiments, and challenges you’ve had to overcome while designing these systems?

    Carlone: Recently we did a big mapping experiment on the MIT campus, in which eight robots traversed up to 8 kilometers in total. The robots have no prior knowledge of the campus, and no GPS. Their main tasks are to estimate their own trajectory and build a map around it. You want the robots to understand the environment as humans do; humans not only understand the shape of obstacles, to get around them without hitting them, but also understand that an object is a chair, a desk, and so on. There’s the semantics part.

    The interesting thing is that when the robots meet each other, they exchange information to improve their map of the environment. For instance, if robots connect, they can leverage information to correct their own trajectory. The challenge is that if you want to reach a consensus between robots, you don’t have the bandwidth to exchange too much data. One of the key contributions of our 2022 paper is to deploy a distributed protocol, in which robots exchange limited information but can still agree on how the map looks. They don’t send camera images back and forth but only exchange specific 3D coordinates and clues extracted from the sensor data. As they continue to exchange such data, they can form a consensus.
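
    One generic way to picture robots agreeing while exchanging very little data is pairwise averaging: whenever two robots meet, they swap a single 3D coordinate and each keeps the midpoint. The sketch below is a simplified stand-in and not the Kimera-Multi protocol, whose details are not given here; the landmark position, noise level, and meeting schedule are all assumed values.

```python
import numpy as np


def gossip_round(estimates, pairs):
    """Each meeting pair swaps its 3D landmark estimate and keeps the average."""
    estimates = estimates.copy()
    for i, j in pairs:
        midpoint = (estimates[i] + estimates[j]) / 2.0
        estimates[i] = midpoint
        estimates[j] = midpoint
    return estimates


rng = np.random.default_rng(0)
true_landmark = np.array([10.0, -4.0, 2.0])   # hypothetical landmark position

# Eight robots, each holding its own noisy estimate of the same landmark.
initial = true_landmark + rng.normal(scale=0.5, size=(8, 3))

estimates = initial
for _ in range(200):
    # Assume two randomly chosen robots come within range in each round.
    i, j = rng.choice(8, size=2, replace=False)
    estimates = gossip_round(estimates, [(i, j)])
```

    Each exchange carries three numbers instead of camera images, yet repeated meetings pull every robot toward the same shared value, the average of the initial estimates.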

    Right now we are building color-coded 3D meshes or maps, in which the color contains some semantic information, like “green” corresponds to grass, and “magenta” to a building. But as humans, we have a much more sophisticated understanding of reality, and we have a lot of prior knowledge about relationships between objects. For instance, if I was looking for a bed, I would go to the bedroom instead of exploring the entire house. If you start to understand the complex relationships between things, you can be much smarter about what the robot can do in the environment. We’re trying to move from capturing just one layer of semantics, to a more hierarchical representation in which the robots understand rooms, buildings, and other concepts.

    Q: What kinds of applications might Kimera and similar technologies lead to in the future?

    How: Autonomous vehicle companies are doing a lot of mapping of the world and learning from the environments they’re in. The holy grail would be if these vehicles could communicate with each other and share information, then they could improve models and maps that much quicker. The current solutions out there are individualized. If a truck pulls up next to you, you can’t see in a certain direction. Could another vehicle provide a field of view that your vehicle otherwise doesn’t have? This is a futuristic idea because it requires vehicles to communicate in new ways, and there are privacy issues to overcome. But if we could resolve those issues, you could imagine a significantly improved safety situation, where you have access to data from multiple perspectives, not only your field of view.

    Carlone: These technologies will have a lot of applications. Earlier I mentioned search and rescue. Imagine that you want to explore a forest and look for survivors, or map buildings after an earthquake in a way that can help first responders access people who are trapped. Another setting where these technologies could be applied is in factories. Currently, robots that are deployed in factories are very rigid. They follow patterns on the floor, and are not really able to understand their surroundings. But if you’re thinking about much more flexible factories in the future, robots will have to cooperate with humans and exist in a much less structured environment.