More stories

  • How quickly do algorithms improve?

    Algorithms are sort of like a parent to a computer. They tell the computer how to make sense of information so that it can, in turn, make something useful out of it.

    The more efficient the algorithm, the less work the computer has to do. For all of the technological progress in computing hardware, and the much-debated lifespan of Moore’s Law, computer performance is only one side of the picture.

    Behind the scenes, a second trend is happening: Algorithms are being improved, so less computing power is needed. While algorithmic efficiency may have less of a spotlight, you’d definitely notice if your trusty search engine suddenly became one-tenth as fast, or if moving through big datasets felt like wading through sludge.

    This led scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) to ask: How quickly do algorithms improve?  

    Existing data on this question were largely anecdotal, consisting of case studies of particular algorithms that were assumed to be representative of the broader scope. Faced with this dearth of evidence, the team set off to crunch data from 57 textbooks and more than 1,110 research papers to trace the history of when algorithms got better. Some of the research papers directly reported how good new algorithms were, while for others the performance had to be reconstructed by the authors from “pseudocode,” shorthand versions of an algorithm that describe its basic details.

    In total, the team looked at 113 “algorithm families,” sets of algorithms solving the same problem that had been highlighted as most important by computer science textbooks. For each of the 113, the team reconstructed its history, tracking each time a new algorithm was proposed for the problem and making special note of those that were more efficient. Spanning the 1940s to the present and ranging widely in performance, the families contained an average of eight algorithms each, of which a couple improved the family’s efficiency. To share this assembled database of knowledge, the team also created Algorithm-Wiki.org.

    The scientists charted how quickly these families had improved, focusing on the most-analyzed feature of the algorithms — how fast they could guarantee to solve the problem (in computer speak: “worst-case time complexity”). What emerged was enormous variability, but also important insights on how transformative algorithmic improvement has been for computer science.

    For large computing problems, 43 percent of algorithm families had year-on-year improvements that were equal to or larger than the much-touted gains from Moore’s Law. In 14 percent of problems, the performance improvements from algorithms vastly outpaced those from improved hardware. The gains from algorithm improvement were particularly large for big-data problems, so the importance of those advancements has grown in recent decades.

    The single biggest change that the authors observed came when an algorithm family transitioned from exponential to polynomial complexity. The amount of effort it takes to solve an exponential problem is like a person trying to guess a combination on a lock. If you only have a single 10-digit dial, the task is easy. With four dials like a bicycle lock, it’s hard enough that no one steals your bike, but still conceivable that you could try every combination. With 50, it’s almost impossible — it would take too many steps. Problems that have exponential complexity are like that for computers: As they get bigger they quickly outpace the ability of the computer to handle them. Finding a polynomial algorithm often solves that, making it possible to tackle problems in a way that no amount of hardware improvement can.
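    To make the lock analogy concrete, here is a minimal, purely illustrative sketch (not from the paper) comparing how the number of steps grows for an exponential-time brute-force search versus a hypothetical polynomial-time algorithm as the problem size increases:

```python
# Illustrative only: step counts for exponential vs. polynomial algorithms.
def exponential_steps(n, base=10):
    """Brute-force search over n dials with `base` positions each."""
    return base ** n

def polynomial_steps(n, degree=3):
    """A hypothetical cubic-time algorithm on an input of size n."""
    return n ** degree

for n in (1, 4, 50):
    print(f"n={n:>2}  exponential: {exponential_steps(n):.2e}  polynomial: {polynomial_steps(n):.2e}")
```

    The exponential count is manageable for one or four dials but astronomically large at 50, which is why a switch from exponential to polynomial complexity matters more than any hardware speedup.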

    As rumblings of Moore’s Law coming to an end rapidly permeate global conversations, the researchers say that computing users will increasingly need to turn to areas like algorithms for performance improvements. The team says the findings confirm that historically, the gains from algorithms have been enormous, so the potential is there. But if gains come from algorithms instead of hardware, they’ll look different. Hardware improvement from Moore’s Law happens smoothly over time, and for algorithms the gains come in steps that are usually large but infrequent. 

    “This is the first paper to show how fast algorithms are improving across a broad range of examples,” says Neil Thompson, an MIT research scientist at CSAIL and the Sloan School of Management and senior author on the new paper. “Through our analysis, we were able to say how many more tasks could be done using the same amount of computing power after an algorithm improved. As problems increase to billions or trillions of data points, algorithmic improvement becomes substantially more important than hardware improvement. In an era where the environmental footprint of computing is increasingly worrisome, this is a way to improve businesses and other organizations without the downside.”

    Thompson wrote the paper alongside MIT visiting student Yash Sherry. The paper is published in the Proceedings of the IEEE. The work was funded by the Tides Foundation and the MIT Initiative on the Digital Economy.

  • Research collaboration puts climate-resilient crops in sight

    Any houseplant owner knows that changes in the amount of water or sunlight a plant receives can put it under immense stress. A dying plant brings certain disappointment to anyone with a green thumb. 

    But for farmers who make their living by successfully growing plants, and whose crops may nourish hundreds or thousands of people, the devastation of failing flora is that much greater. As climate change is poised to cause increasingly unpredictable weather patterns globally, crops may be subject to more extreme environmental conditions like droughts, fluctuating temperatures, floods, and wildfire. 

    Climate scientists and food systems researchers worry about the stress climate change may put on crops, and on global food security. In an ambitious interdisciplinary project funded by the Abdul Latif Jameel Water and Food Systems Lab (J-WAFS), David Des Marais, the Gale Assistant Professor in the Department of Civil and Environmental Engineering at MIT, and Caroline Uhler, an associate professor in the MIT Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society, are investigating how plant genes communicate with one another under stress. Their research results can be used to breed plants more resilient to climate change.

    Crops in trouble

    Governing plants’ responses to environmental stress are gene regulatory networks, or GRNs, which guide the development and behaviors of living things. A GRN may comprise thousands of genes and proteins that all communicate with one another. GRNs help a particular cell, tissue, or organism respond to environmental changes by signaling certain genes to turn their expression on or off.

    Even seemingly minor or short-term changes in weather patterns can have large effects on crop yield and food security. An environmental trigger, like a lack of water during a crucial phase of plant development, can turn a gene on or off, and is likely to affect many others in the GRN. For example, without water, a gene enabling photosynthesis may switch off. This can create a domino effect, where the genes that rely on those regulating photosynthesis are silenced, and the cycle continues. As a result, when photosynthesis is halted, the plant may experience other detrimental side effects, like no longer being able to reproduce or defend against pathogens. The chain reaction could even kill a plant before it has the chance to be revived by a big rain.

    Des Marais says he wishes there was a way to stop those genes from completely shutting off in such a situation. To do that, scientists would need to better understand how exactly gene networks respond to different environmental triggers. Bringing light to this molecular process is exactly what he aims to do in this collaborative research effort.

    Solving complex problems across disciplines

    Despite their crucial importance, GRNs are difficult to study because of how complex and interconnected they are. Usually, to understand how a particular gene is affecting others, biologists must silence one gene and see how the others in the network respond. 

    For years, scientists have aspired to an algorithm that could synthesize the massive amount of information contained in GRNs to “identify correct regulatory relationships among genes,” according to a 2019 article in the Encyclopedia of Bioinformatics and Computational Biology. 

    “A GRN can be seen as a large causal network, and understanding the effects that silencing one gene has on all other genes requires understanding the causal relationships among the genes,” says Uhler. “These are exactly the kinds of algorithms my group develops.”
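    As a rough illustration of that causal-graph view (not the team’s actual algorithm, and with made-up gene names), a GRN can be modeled as a directed graph in which silencing one gene potentially affects everything downstream of it:

```python
# Illustrative only: a toy gene regulatory network as a directed graph.
# Gene names are hypothetical; real GRNs involve thousands of genes.
import networkx as nx

grn = nx.DiGraph()
grn.add_edges_from([
    ("drought_sensor", "photosynthesis_regulator"),
    ("photosynthesis_regulator", "chlorophyll_synthesis"),
    ("photosynthesis_regulator", "sugar_transport"),
    ("sugar_transport", "reproduction"),
    ("sugar_transport", "pathogen_defense"),
])

silenced = "photosynthesis_regulator"
downstream = nx.descendants(grn, silenced)  # genes that could be affected
print(f"Silencing {silenced} may affect: {sorted(downstream)}")
```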

    Des Marais and Uhler’s project aims to unravel these complex communication networks and discover how to breed crops that are more resilient to the increased droughts, flooding, and erratic weather patterns that climate change is already causing globally.

    Beyond the pressures of climate change, the world will demand 70 percent more food by 2050 to feed a booming population. “Food systems challenges cannot be addressed individually in disciplinary or topic area silos,” says Greg Sixt, J-WAFS’ research manager for climate and food systems. “They must be addressed in a systems context that reflects the interconnected nature of the food system.”

    Des Marais’ background is in biology, and Uhler’s in statistics. “Dave’s project with Caroline was essentially experimental,” says Renee J. Robins, J-WAFS’ executive director. “This kind of exploratory research is exactly what the J-WAFS seed grant program is for.”

    Getting inside gene regulatory networks

    Des Marais and Uhler’s work begins in a windowless basement on MIT’s campus, where 300 genetically identical Brachypodium distachyon plants grow in large, temperature-controlled chambers. The plant, which contains more than 30,000 genes, is a good model for studying important cereal crops like wheat, barley, maize, and millet. For three weeks, all plants receive the same temperature, humidity, light, and water. Then, half are slowly tapered off water, simulating drought-like conditions.

    Six days into the forced drought, the plants are clearly suffering. Des Marais’ PhD student Jie Yun takes tissues from 50 hydrated and 50 dry plants, freezes them in liquid nitrogen to immediately halt metabolic activity, grinds them up into a fine powder, and chemically separates the genetic material. The genes from all 100 samples are then sequenced at a lab across the street.

    The team is left with a spreadsheet listing the 30,000 genes found in each of the 100 plants at the moment they were frozen, and how many copies there were. Uhler’s PhD student Anastasiya Belyaeva inputs the massive spreadsheet into the computer program she developed and runs her novel algorithm. Within a few hours, the group can see which genes were most active in one condition over another, how the genes were communicating, and which were causing changes in others. 
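    A minimal sketch of the kind of first-pass comparison such a pipeline might make (illustrative only, with random numbers standing in for the real expression counts; the group’s causal-inference algorithm goes well beyond this):

```python
# Illustrative only: compare average gene expression between watered and drought plants.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
genes = [f"gene_{i}" for i in range(30_000)]                    # stand-in gene IDs
watered = pd.DataFrame(rng.poisson(50, (50, 30_000)), columns=genes)
drought = pd.DataFrame(rng.poisson(45, (50, 30_000)), columns=genes)

# Log2 fold change of mean expression, drought vs. watered
log2_fc = np.log2((drought.mean() + 1) / (watered.mean() + 1))
print(log2_fc.sort_values().head())                             # genes most reduced under drought
```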

    The methodology captures important subtleties that could allow researchers to eventually alter gene pathways and breed more resilient crops. “When you expose a plant to drought stress, it’s not like there’s some canonical response,” Des Marais says. “There’s lots of things going on. It’s turning this physiologic process up, this one down, this one didn’t exist before, and now suddenly is turned on.” 

    In addition to Des Marais and Uhler’s research, J-WAFS has funded projects in food and water from researchers in 29 departments across all five MIT schools as well as the MIT Schwarzman College of Computing. J-WAFS seed grants typically fund seven to eight new projects every year.

    “The grants are really aimed at catalyzing new ideas, providing the sort of support [for MIT researchers] to be pushing boundaries, and also bringing in faculty who may have some interesting ideas that they haven’t yet applied to water or food concerns,” Robins says. “It’s an avenue for researchers all over the Institute to apply their ideas to water and food.”

    Alison Gold is a student in MIT’s Graduate Program in Science Writing.

  • MIT appoints members of new faculty committee to drive climate action plan

    In May, responding to the world’s accelerating climate crisis, MIT issued an ambitious new plan, “Fast Forward: MIT’s Climate Action Plan for the Decade.” The plan outlines a broad array of new and expanded initiatives across campus to build on the Institute’s longstanding climate work.

    Now, to unite these varied climate efforts, maximize their impact, and identify new ways for MIT to contribute climate solutions, the Institute has appointed more than a dozen faculty members to a new committee established by the Fast Forward plan, named the Climate Nucleus.

    The committee includes leaders of a number of climate- and energy-focused departments, labs, and centers that have significant responsibilities under the plan. Its membership spans all five schools and the MIT Schwarzman College of Computing. Professors Noelle Selin and Anne White have agreed to co-chair the Climate Nucleus for a term of three years.

    “I am thrilled and grateful that Noelle and Anne have agreed to step up to this important task,” says Maria T. Zuber, MIT’s vice president for research. “Under their leadership, I’m confident that the Climate Nucleus will bring new ideas and new energy to making the strategy laid out in the climate action plan a reality.”

    The Climate Nucleus has broad responsibility for the management and implementation of the Fast Forward plan across its five areas of action: sparking innovation, educating future generations, informing and leveraging government action, reducing MIT’s own climate impact, and uniting and coordinating all of MIT’s climate efforts.

    Over the next few years, the nucleus will aim to advance MIT’s contribution to a two-track approach to decarbonizing the global economy, an approach described in the Fast Forward plan. First, humanity must go as far and as fast as it can to reduce greenhouse gas emissions using existing tools and methods. Second, societies need to invest in, invent, and deploy new tools — and promote new institutions and policies — to get the global economy to net-zero emissions by mid-century.

    The co-chairs of the nucleus bring significant climate and energy expertise, along with deep knowledge of the MIT community, to their task.

    Selin is a professor with joint appointments in the Institute for Data, Systems, and Society and the Department of Earth, Atmospheric and Planetary Sciences. She is also the director of the Technology and Policy Program. She began at MIT in 2007 as a postdoc with the Center for Global Change Science and the Joint Program on the Science and Policy of Global Change. Her research uses modeling to inform decision-making on air pollution, climate change, and hazardous substances.

    “Climate change affects everything we do at MIT. For the new climate action plan to be effective, the Climate Nucleus will need to engage the entire MIT community and beyond, including policymakers as well as people and communities most affected by climate change,” says Selin. “I look forward to helping to guide this effort.”

    White is the School of Engineering’s Distinguished Professor of Engineering and the head of the Department of Nuclear Science and Engineering. She joined the MIT faculty in 2009 and has also served as the associate director of MIT’s Plasma Science and Fusion Center. Her research focuses on assessing and refining the mathematical models used in the design of fusion energy devices, such as tokamaks, which hold promise for delivering limitless zero-carbon energy.

    “The latest IPCC report underscores the fact that we have no time to lose in decarbonizing the global economy quickly. This is a problem that demands we use every tool in our toolbox — and develop new ones — and we’re committed to doing that,” says White, referring to an August 2021 report from the Intergovernmental Panel on Climate Change, a UN climate science body, that found that climate change has already affected every region on Earth and is intensifying. “We must train future technical and policy leaders, expand opportunities for students to work on climate problems, and weave sustainability into every one of MIT’s activities. I am honored to be a part of helping foster this Institute-wide collaboration.”

    A first order of business for the Climate Nucleus will be standing up three working groups to address specific aspects of climate action at MIT: climate education, climate policy, and MIT’s own carbon footprint. The working groups will be responsible for making progress on their particular areas of focus under the plan and will make recommendations to the nucleus on ways of increasing MIT’s effectiveness and impact. The working groups will also include student, staff, and alumni members, so that the entire MIT community has the opportunity to contribute to the plan’s implementation.  

    The nucleus, in turn, will report and make regular recommendations to the Climate Steering Committee, a senior-level team consisting of Zuber; Richard Lester, the associate provost for international activities; Glen Shor, the executive vice president and treasurer; and the deans of the five schools and the MIT Schwarzman College of Computing. The new plan created the Climate Steering Committee to ensure that climate efforts will receive both the high-level attention and the resources needed to succeed.

    Together the new committees and working groups are meant to form a robust new infrastructure for uniting and coordinating MIT’s climate action efforts in order to maximize their impact. They replace the Climate Action Advisory Committee, which was created in 2016 following the release of MIT’s first climate action plan.

    In addition to Selin and White, the members of the Climate Nucleus are:

    Bob Armstrong, professor in the Department of Chemical Engineering and director of the MIT Energy Initiative;
    Dara Entekhabi, professor in the departments of Civil and Environmental Engineering and Earth, Atmospheric and Planetary Sciences;
    John Fernández, professor in the Department of Architecture and director of the Environmental Solutions Initiative;
    Stefan Helmreich, professor in the Department of Anthropology;
    Christopher Knittel, professor in the MIT Sloan School of Management and director of the Center for Energy and Environmental Policy Research;
    John Lienhard, professor in the Department of Mechanical Engineering and director of the Abdul Latif Jameel Water and Food Systems Lab;
    Julie Newman, director of the Office of Sustainability and lecturer in the Department of Urban Studies and Planning;
    Elsa Olivetti, professor in the Department of Materials Science and Engineering and co-director of the Climate and Sustainability Consortium;
    Christoph Reinhart, professor in the Department of Architecture and director of the Building Technology Program;
    John Sterman, professor in the MIT Sloan School of Management and director of the Sloan Sustainability Initiative;
    Rob van der Hilst, professor and head of the Department of Earth, Atmospheric and Planetary Sciences; and
    Chris Zegras, professor and head of the Department of Urban Studies and Planning.

  • End-to-end supply chain transparency

    For years, companies have managed their extended supply chains with intermittent audits and certifications while attempting to persuade their suppliers to adhere to certain standards and codes of conduct. But they’ve lacked the concrete data necessary to prove their supply chains were working as they should. They most likely had baseline data about their suppliers — what they bought and who they bought it from — but knew little else about the rest of the supply chain.

    With Sourcemap, companies can now trace their supply chains from raw material to finished good with certainty, keeping track of the mines and farms that produce the commodities they rely on to take their goods to market. This unprecedented level of transparency provides Sourcemap’s customers with the assurance that the entire end-to-end supply chain operates within their standards while living up to social and environmental targets.

    And they’re doing it at scale for large multinationals across the food, agricultural, automotive, tech, and apparel industries. Thanks to Sourcemap founder and CEO Leonardo Bonanni MA ’03, SM ’05, PhD ’10, companies like VF Corporation, owner of brands like Timberland and The North Face, as well as Mars, Hershey, and Ferrero, now have enough data to confidently tell the story of how they’re sourcing their raw materials.

    “Coming from the Media Lab, we recognized early on the power of the cloud, the power of social networking-type databases and smartphone diffusion around the world,” says Bonanni of his company’s MIT roots. Rather than providing intermittent glances at the supply chain via an auditor, Sourcemap collects data continuously, in real time, every step of the way, flagging anything that could indicate counterfeiting, adulteration, fraud, waste, or abuse.

    “We’ve taken our customers from a situation where they had very little control to a world where they have direct visibility over their entire global operations, even allowing them to see ahead of time — before a container reaches the port — whether there is any indication that there might be something wrong with it,” says Bonanni.

    The key problem Sourcemap addresses is a lack of data in companies’ supply chain management databases. According to Bonanni, most Sourcemap customers have invested millions of dollars in enterprise resource planning (ERP) databases, which provide information about internal operations and direct suppliers, but fall short when it comes to global operations, where their secondary and tertiary suppliers operate. Built on relational databases, ERP systems have been around for more than 40 years and work well for simple, static data structures. But they aren’t agile enough to handle big data and rapidly evolving, complex data structures.

    Sourcemap, on the other hand, uses NoSQL (non-relational) database technology, which is more flexible, cost-efficient, and scalable. “Our platform is like a LinkedIn for the supply chain,” explains Bonanni. Customers provide information about where they buy their raw materials, the suppliers get invited to the network and provide information to validate those relationships, right down to the farms and the mines where the raw materials are extracted — which is often where the biggest risks lie.
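    As a rough sketch of that “LinkedIn for the supply chain” idea (purely illustrative; Sourcemap’s actual platform and data model are far richer, and the supplier names here are hypothetical), a multi-tier supply chain can be treated as a graph and traced from finished good back to its raw-material sources:

```python
# Illustrative only: trace a finished good back to its raw-material sources.
supply_graph = {
    "chocolate_bar": ["factory_eu"],
    "factory_eu": ["trader_a", "trader_b"],
    "trader_a": ["coop_ghana"],
    "trader_b": ["coop_ivory_coast"],
    "coop_ghana": ["farm_1", "farm_2"],
    "coop_ivory_coast": ["farm_3"],
}

def trace_sources(node, graph):
    """Depth-first walk from a product down to suppliers with no upstream links."""
    upstream = graph.get(node, [])
    if not upstream:
        return {node}
    sources = set()
    for supplier in upstream:
        sources |= trace_sources(supplier, graph)
    return sources

print(trace_sources("chocolate_bar", supply_graph))   # {'farm_1', 'farm_2', 'farm_3'}
```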

    Initially, the entire supply chain database of a Sourcemap customer might amount to a few megabytes of spreadsheets listing their purchase orders and the names of their suppliers. Sourcemap delivers terabytes of data that paint a detailed picture of the supply chain, capturing everything, right down to the moment a farmer in West Africa delivers cocoa beans to a warehouse, onto a truck heading to a port, to a factory, all the way to the finished goods.

    “We’ve seen the amount of data collected grow by a factor of 1 million, which tells us that the world is finally ready for full visibility of supply chains,” says Bonanni. “The fact is that we’ve seen supply chain transparency go from a fringe concern to a broad-based requirement as a license to operate in most of Europe and North America.”

    These days, disruptions in supply chains, combined with price volatility and new laws requiring companies to prove that the goods they import were not made illegally (such as by causing deforestation or involving forced or child labor), mean that companies are often required to know where they source their raw materials from, even if they only import the materials through an intermediary.

    Sourcemap uses its full suite of tools to walk customers through a step-by-step process that maps their suppliers while measuring performance, ultimately verifying the entire supply chain and providing them with the confidence to import goods while being customs-compliant. At the end of the day, Sourcemap customers can communicate to their stakeholders and the end consumer exactly where their commodities come from while ensuring that social, environmental, and compliance standards are met.

    The company was recently named to the newest cohort of firms honored by the MIT Startup Exchange (STEX) as STEX25 startups. Bonanni is quick to point out the benefits of STEX and of MIT’s Industrial Liaison Program (ILP): “Our best feedback and our most constructive relationships have been with companies that sponsored our research early on at the Media Lab and ILP,” he says. “The innovative exchange of ideas inherent in the MIT startup ecosystem has helped to build up Sourcemap as a company and to grow supply chain transparency as a future-facing technology that more and more companies are now scrambling to adopt.”

  • A universal system for decoding any type of data sent across a network

    Every piece of data that travels over the internet — from paragraphs in an email to 3D graphics in a virtual reality environment — can be altered by the noise it encounters along the way, such as electromagnetic interference from a microwave or Bluetooth device. The data are coded so that when they arrive at their destination, a decoding algorithm can undo the negative effects of that noise and retrieve the original data.

    Since the 1950s, most error-correcting codes and decoding algorithms have been designed together. Each code had a structure that corresponded with a particular, highly complex decoding algorithm, which often required the use of dedicated hardware.

    Researchers at MIT, Boston University, and Maynooth University in Ireland have now created the first silicon chip that is able to decode any code, regardless of its structure, with maximum accuracy, using a universal decoding algorithm called Guessing Random Additive Noise Decoding (GRAND). By eliminating the need for multiple, computationally complex decoders, GRAND enables increased efficiency that could have applications in augmented and virtual reality, gaming, 5G networks, and connected devices that rely on processing a high volume of data with minimal delay.

    The research at MIT is led by Muriel Médard, the Cecil H. and Ida Green Professor in the Department of Electrical Engineering and Computer Science, and was co-authored by Amit Solomon and Wei Ann, both graduate students at MIT; Rabia Tugce Yazicigil, assistant professor of electrical and computer engineering at Boston University; Arslan Riaz and Vaibhav Bansal, both graduate students at Boston University; Ken R. Duffy, director of the Hamilton Institute at the National University of Ireland at Maynooth; and Kevin Galligan, a Maynooth graduate student. The research will be presented at the European Solid-State Device Research and Circuits Conference next week.

    Focus on noise

    One way to think of these codes is as redundant hashes (in this case, a series of 1s and 0s) added to the end of the original data. The rules for the creation of that hash are stored in a specific codebook.

    As the encoded data travel over a network, they are affected by noise, or energy that disrupts the signal, which is often generated by other electronic devices. When that coded data and the noise that affected them arrive at their destination, the decoding algorithm consults its codebook and uses the structure of the hash to guess what the stored information is.

    Instead, GRAND works by guessing the noise that affected the message, and uses the noise pattern to deduce the original information. GRAND generates a series of noise sequences in the order they are likely to occur, subtracts them from the received data, and checks to see if the resulting codeword is in a codebook.

    While the noise appears random in nature, it has a probabilistic structure that allows the algorithm to guess what it might be.
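    A minimal sketch of that guessing loop for a hard-decision binary channel (illustrative only; the tiny codebook is made up, and ordering noise patterns by Hamming weight stands in for the real noise statistics):

```python
# Illustrative only: hard-decision GRAND over a toy binary codebook.
from itertools import combinations

import numpy as np

# Toy codebook: in practice this is defined by whatever code the sender used.
codebook = {(0, 0, 0, 0, 0), (1, 1, 1, 0, 0), (0, 0, 1, 1, 1), (1, 1, 0, 1, 1)}

def grand_decode(received, codebook, n):
    """Guess noise patterns from most to least likely (lowest Hamming weight first),
    remove each from the received word, and stop at the first codebook member."""
    for weight in range(n + 1):
        for flipped in combinations(range(n), weight):
            noise = np.zeros(n, dtype=int)
            noise[list(flipped)] = 1
            candidate = tuple((np.array(received) ^ noise).tolist())
            if candidate in codebook:
                return candidate, tuple(noise.tolist())
    return None, None

received = (1, 1, 1, 0, 1)    # transmitted (1, 1, 1, 0, 0) with its last bit flipped
print(grand_decode(received, codebook, n=5))
```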

    “In a way, it is similar to troubleshooting. If someone brings their car into the shop, the mechanic doesn’t start by mapping the entire car to blueprints. Instead, they start by asking, ‘What is the most likely thing to go wrong?’ Maybe it just needs gas. If that doesn’t work, what’s next? Maybe the battery is dead?” Médard says.

    Novel hardware

    The GRAND chip uses a three-tiered structure, starting with the simplest possible solutions in the first stage and working up to longer and more complex noise patterns in the two subsequent stages. Each stage operates independently, which increases the throughput of the system and saves power.

    The device is also designed to switch seamlessly between two codebooks. It contains two static random-access memory chips, one that cracks codewords while the other loads a new codebook, allowing the chip to switch to decoding without any downtime.

    The researchers tested the GRAND chip and found it could effectively decode any moderate redundancy code up to 128 bits in length, with only about a microsecond of latency.

    Médard and her collaborators had previously demonstrated the success of the algorithm, but this new work showcases the effectiveness and efficiency of GRAND in hardware for the first time.

    Developing hardware for the novel decoding algorithm required the researchers to first toss aside their preconceived notions, Médard says.

    “We couldn’t go out and reuse things that had already been done. This was like a complete whiteboard. We had to really think about every single component from scratch. It was a journey of reconsideration. And I think when we do our next chip, there will be things with this first chip that we’ll realize we did out of habit or assumption that we can do better,” she says.

    A chip for the future

    Since GRAND only uses codebooks for verification, the chip not only works with legacy codes but could also be used with codes that haven’t even been introduced yet.

    In the lead-up to 5G implementation, regulators and communications companies struggled to find consensus as to which codes should be used in the new network. Regulators ultimately chose to use two types of traditional codes for 5G infrastructure in different situations. Using GRAND could eliminate the need for that rigid standardization in the future, Médard says.

    The GRAND chip could even open the field of coding to a wave of innovation.

    “For reasons I’m not quite sure of, people approach coding with awe, like it is black magic. The process is mathematically nasty, so people just use codes that already exist. I’m hoping this will recast the discussion so it is not so standards-oriented, enabling people to use codes that already exist and create new codes,” she says.

    Moving forward, Médard and her collaborators plan to tackle the problem of soft detection with a retooled version of the GRAND chip. In soft detection, the received data are less precise.

    They also plan to test the ability of GRAND to crack longer, more complex codes and adjust the structure of the silicon chip to improve its energy efficiency.

    The research was funded by the Battelle Memorial Institute and Science Foundation Ireland.

  • MIT welcomes nine MLK Visiting Professors and Scholars for 2021-22

    In its 31st year, the Martin Luther King Jr. (MLK) Visiting Professors and Scholars Program will host nine outstanding scholars from across the Americas. The flagship program honors the life and legacy of Martin Luther King Jr. by increasing the presence and recognizing the contributions of underrepresented minority scholars at MIT. Throughout the year, the cohort will enhance their scholarship through intellectual engagement with the MIT community and enrich the cultural, academic, and professional experience of students.

    The 2021-22 scholars

    Sanford Biggers is an interdisciplinary artist hosted by the Department of Architecture. His work is an interplay of narrative, perspective, and history that speaks to current social, political, and economic happenings while examining their contexts. His diverse practice positions him as a collaborator with the past through explorations of often-overlooked cultural and political narratives from American history. Through collaboration with his faculty host, Brandon Clifford, he will spend the year contributing to projects with Architecture; Art, Culture and Technology; the Transmedia Storytelling initiatives; and community workshops and engagement with local K-12 education.

    Kristen Dorsey is an assistant professor of engineering at Smith College. She will be hosted by the Program in Media Arts and Sciences at the MIT Media Lab. Her research focuses on the fabrication and characterization of microscale sensors and microelectromechanical systems. Dorsey tries to understand “why things go wrong” by investigating device reliability and stability. At MIT, Dorsey is interested in forging collaborations to consider issues of access and equity as they apply to wearable health care devices.

    Omolola “Lola” Eniola-Adefeso is the associate dean for graduate and professional education and associate professor of chemical engineering at the University of Michigan. She will join MIT’s Department of Chemical Engineering (ChemE). Eniola-Adefeso will work with Professor Paula Hammond on developing electrostatically assembled nanoparticle coatings that enable targeting of specific immune cell types. A co-founder and chief scientific officer of Asalyxa Bio, she is interested in the interactions between blood leukocytes and endothelial cells in vessel lumen lining, and how they change during inflammation response. Eniola-Adefeso will also work with the Diversity in Chemical Engineering (DICE) graduate student group in ChemE and the National Organization of Black Chemists and Chemical Engineers.

    Robert Gilliard Jr. is an assistant professor of chemistry at the University of Virginia and will join the MIT chemistry department, working closely with faculty host Christopher Cummins. His research focuses on various aspects of group 15 element chemistry. He was a founding member of the National Organization of Black Chemists and Chemical Engineers UGA section, and he has served as an American Chemical Society (ACS) Bridge Program mentor as well as an ACS Project Seed mentor. Gilliard has also collaborated with the Cleveland Public Library to expose diverse young scholars to STEM fields.

    Valencia Joyner Koomson ’98, MNG ’99 will return for the second semester of her appointment this fall in MIT’s Department of Electrical Engineering and Computer Science. Based at Tufts University, where she is an associate professor in the Department of Electrical and Computer Engineering, Koomson has focused her research on microelectronic systems for cell analysis and biomedical applications. In the past semester, she has served as a judge for the Black Alumni/ae of MIT Research Slam and worked closely with faculty host Professor Akintunde Akinwande.

    Luis Gilberto Murillo-Urrutia will continue his appointment in MIT’s Environmental Solutions Initiative. He has 30 years of experience in public policy design, implementation, and advocacy, most notably in the areas of sustainable regional development, environmental protection and management of natural resources, social inclusion, and peace building. At MIT, he has continued his research on environmental justice, with a focus on carbon policy and its impacts on Afro-descendant communities in Colombia.

    Sonya T. Smith was the first female professor of mechanical engineering at Howard University. She will join the Department of Aeronautics and Astronautics at MIT. Her research involves computational fluid dynamics and thermal management of electronics for air and space vehicles. She is looking forward to serving as a mentor to underrepresented students across MIT and fostering new research collaborations with her home lab at Howard.

    Lawrence Udeigwe is an associate professor of mathematics at Manhattan College and will join MIT’s Department of Brain and Cognitive Sciences. He plans to co-teach a graduate seminar course with Professor James DiCarlo to explore practical and philosophical questions regarding the use of simulations to build theories in neuroscience. Udeigwe also leads the Lorens Chuno group; as a singer-songwriter, his work tackles intersectionality issues faced by contemporary Africans.

    S. Craig Watkins is an internationally recognized expert in media and a professor at the University of Texas at Austin. He will join MIT’s Institute for Data, Systems, and Society to assist in researching the role of big data in enabling deep structural changes with regard to systemic racism. He will continue to expand on his work as founding director of the Institute for Media Innovation at the University of Texas at Austin, exploring the intersections of critical AI studies, critical race studies, and design. He will also work with MIT’s Center for Advanced Virtuality to develop computational systems that support social perspective-taking.

    Community engagement

    Throughout the 2021-22 academic year, MLK professors and scholars will be presenting their research at a monthly speaker series. Events will be held in an in-person/Zoom hybrid environment. All members of the MIT community are encouraged to attend and hear directly from this year’s cohort of outstanding scholars. To hear more about upcoming events, subscribe to their mailing list.

    On Sept. 15, all are invited to join the Institute Community and Equity Office in welcoming the scholars to campus by attending a welcome luncheon.

  • Using adversarial attacks to refine molecular energy predictions

    Neural networks (NNs) are increasingly being used to predict new materials, the rate and yield of chemical reactions, and drug-target interactions, among others. For these applications, they are orders of magnitude faster than traditional methods such as quantum mechanical simulations. 

    The price for this agility, however, is reliability. Because machine learning models only interpolate, they may fail when used outside the domain of training data.

    But the part that worried Rafael Gómez-Bombarelli, the Jeffrey Cheah Career Development Professor in the MIT Department of Materials Science and Engineering, and graduate students Daniel Schwalbe-Koda and Aik Rui Tan was that establishing the limits of these machine learning (ML) models is tedious and labor-intensive. 

    This is particularly true for predicting “potential energy surfaces” (PES), or the map of a molecule’s energy in all its configurations. These surfaces encode the complexities of a molecule into flatlands, valleys, peaks, troughs, and ravines. The most stable configurations of a system are usually in the deep pits — quantum mechanical chasms from which atoms and molecules typically do not escape.

    In a recent Nature Communications paper, the research team presented a way to demarcate the “safe zone” of a neural network by using “adversarial attacks.” Adversarial attacks have been studied for other classes of problems, such as image classification, but this is the first time that they are being used to sample molecular geometries in a PES. 

    “People have been using uncertainty for active learning for years in ML potentials. The key difference is that they need to run the full ML simulation and evaluate if the NN was reliable, and if it wasn’t, acquire more data, retrain and re-simulate. Meaning that it takes a long time to nail down the right model, and one has to run the ML simulation many times,” explains Gómez-Bombarelli.

    The Gómez-Bombarelli lab at MIT works on a synergistic synthesis of first-principles simulation and machine learning that greatly speeds up this process. The actual simulations are run only for a small fraction of the molecules of interest, and all those data are fed into a neural network that learns how to predict the same properties for the rest of the molecules. They have successfully demonstrated these methods for a growing class of novel materials that includes catalysts for producing hydrogen from water, cheaper polymer electrolytes for electric vehicles, zeolites for molecular sieving, magnetic materials, and more.

    The challenge, however, is that these neural networks are only as smart as the data they are trained on. On a PES map, 99 percent of the data may fall into one pit, totally missing the valleys that are of more interest.

    Such wrong predictions can have disastrous consequences — think of a self-driving car that fails to identify a person crossing the street.

    One way to find out the uncertainty of a model is to run the same data through multiple versions of it. 

    For this project, the researchers had multiple neural networks predict the potential energy surface from the same data. Where the network is fairly sure of the prediction, the variation between the outputs of different networks is minimal and the surfaces largely converge. When the network is uncertain, the predictions of different models vary widely, producing a range of outputs, any of which could be the correct surface. 

    The spread in the predictions of a “committee of neural networks” is the “uncertainty” at that point. A good model should not just indicate the best prediction, but also indicate the uncertainty about each of these predictions. It’s like the neural network says “this property for material A will have a value of X and I’m highly confident about it.”
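    A minimal sketch of the committee idea (illustrative only, with tiny polynomial regressors standing in for trained interatomic potentials): fit several models to the same data and use the spread of their predictions as the uncertainty.

```python
# Illustrative only: committee spread as an uncertainty estimate.
import numpy as np

rng = np.random.default_rng(1)
x_train = rng.uniform(-1, 1, 30)
y_train = x_train ** 2 + 0.05 * rng.normal(size=30)      # stand-in "energy" data

# A committee of polynomial fits, each trained on a bootstrap resample of the data
committee = []
for _ in range(10):
    idx = rng.integers(0, len(x_train), len(x_train))
    committee.append(np.polynomial.Polynomial.fit(x_train[idx], y_train[idx], deg=4))

x_query = np.array([0.0, 0.9, 2.5])                      # 2.5 lies far outside the training range
predictions = np.stack([model(x_query) for model in committee])
print("mean prediction:", predictions.mean(axis=0))
print("uncertainty    :", predictions.std(axis=0))       # largest where training data are absent
```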

    This could have been an elegant solution but for the sheer scale of the combinatorial space. “Each simulation (which is ground feed for the neural network) may take from tens to thousands of CPU hours,” explains Schwalbe-Koda. For the results to be meaningful, multiple models must be run over a sufficient number of points in the PES, an extremely time-consuming process. 

    Instead, the new approach only samples data points from regions of low prediction confidence, corresponding to specific geometries of a molecule. These molecules are then stretched or deformed slightly so that the uncertainty of the neural network committee is maximized. Additional data are computed for these molecules through simulations and then added to the initial training pool. 

    The neural networks are trained again, and a new set of uncertainties is calculated. This process is repeated until the uncertainty associated with various points on the surface becomes well-defined and cannot be decreased any further.
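    A rough sketch of a single adversarial step under this scheme (illustrative only; the real inputs are molecular geometries and the committee members are trained neural network potentials, not the untrained toy networks used here): perturb an input by gradient ascent so that the committee’s disagreement grows, then send the resulting geometry off for a new reference simulation.

```python
# Illustrative only: nudge an input to maximize the variance of a model committee.
import torch

torch.manual_seed(0)

def make_model():
    return torch.nn.Sequential(torch.nn.Linear(3, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

committee = [make_model() for _ in range(5)]                # stand-ins for trained NN potentials

x = torch.tensor([[0.5, -0.2, 1.0]], requires_grad=True)    # stand-in "geometry"
optimizer = torch.optim.Adam([x], lr=0.05)

for step in range(50):
    optimizer.zero_grad()
    preds = torch.stack([model(x) for model in committee])  # committee predictions
    disagreement = preds.var(dim=0).mean()                  # spread across the committee
    (-disagreement).backward()                              # ascend by minimizing the negative
    optimizer.step()

print("adversarial input:", x.detach().numpy())
print("final committee variance:", float(disagreement))
```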

    Gómez-Bombarelli explains, “We aspire to have a model that is perfect in the regions we care about (i.e., the ones that the simulation will visit) without having had to run the full ML simulation, by making sure that we make it very good in high-likelihood regions where it isn’t.”

    The paper presents several examples of this approach, including predicting complex supramolecular interactions in zeolites. These materials are cavernous crystals that act as molecular sieves with high shape selectivity. They find applications in catalysis, gas separation, and ion exchange, among others.

    Because performing simulations of large zeolite structures is very costly, the researchers show how their method can provide significant savings in computational simulations. They used more than 15,000 examples to train a neural network to predict the potential energy surfaces for these systems. Despite the large cost required to generate the dataset, the final results are mediocre, with only around 80 percent of the neural network-based simulations being successful. To improve the performance of the model using traditional active learning methods, the researchers calculated an additional 5,000 data points, which improved the performance of the neural network potentials to 92 percent.

    However, when the adversarial approach is used to retrain the neural networks, the authors saw a performance jump to 97 percent using only 500 extra points. That’s a remarkable result, the researchers say, especially considering that each of these extra points takes hundreds of CPU hours. 

    This could be the most realistic method to probe the limits of models that researchers use to predict the behavior of materials and the progress of chemical reactions.

  • Making the case for hydrogen in a zero-carbon economy

    As the United States races to achieve its goal of zero-carbon electricity generation by 2035, energy providers are swiftly ramping up renewable resources such as solar and wind. But because these technologies churn out electrons only when the sun shines and the wind blows, they need backup from other energy sources, especially during seasons of high electric demand. Currently, plants burning fossil fuels, primarily natural gas, fill in the gaps.

    “As we move to more and more renewable penetration, this intermittency will make a greater impact on the electric power system,” says Emre Gençer, a research scientist at the MIT Energy Initiative (MITEI). That’s because grid operators will increasingly resort to fossil-fuel-based “peaker” plants that compensate for the intermittency of the variable renewable energy (VRE) sources of sun and wind. “If we’re to achieve zero-carbon electricity, we must replace all greenhouse gas-emitting sources,” Gençer says.

    Low- and zero-carbon alternatives to greenhouse-gas emitting peaker plants are in development, such as arrays of lithium-ion batteries and hydrogen power generation. But each of these evolving technologies comes with its own set of advantages and constraints, and it has proven difficult to frame the debate about these options in a way that’s useful for policymakers, investors, and utilities engaged in the clean energy transition.

    Now, Gençer and Drake D. Hernandez SM ’21 have come up with a model that makes it possible to pin down the pros and cons of these peaker-plant alternatives with greater precision. Their hybrid technological and economic analysis, based on a detailed inventory of California’s power system, was published online last month in Applied Energy. While their work focuses on the most cost-effective solutions for replacing peaker power plants, it also contains insights intended to contribute to the larger conversation about transforming energy systems.

    “Our study’s essential takeaway is that hydrogen-fired power generation can be the more economical option when compared to lithium-ion batteries — even today, when the costs of hydrogen production, transmission, and storage are very high,” says Hernandez, who worked on the study while a graduate research assistant for MITEI. Adds Gençer, “If there is a place for hydrogen in the cases we analyzed, that suggests there is a promising role for hydrogen to play in the energy transition.”

    Adding up the costs

    California serves as a stellar paradigm for a swiftly shifting power system. The state draws more than 20 percent of its electricity from solar and approximately 7 percent from wind, with more VRE coming online rapidly. This means its peaker plants already play a pivotal role, coming online each evening when the sun goes down or when events such as heat waves drive up electricity use for days at a time.

    “We looked at all the peaker plants in California,” recounts Gençer. “We wanted to know the cost of electricity if we replaced them with hydrogen-fired turbines or with lithium-ion batteries.” The researchers used a core metric called the levelized cost of electricity (LCOE) as a way of comparing the costs of different technologies to each other. LCOE measures the average total cost of building and operating a particular energy-generating asset per unit of total electricity generated over the hypothetical lifetime of that asset.
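    As a rough sketch of how LCOE is computed (with made-up numbers; the study’s actual cost inputs, financing assumptions, and operating profiles are far more detailed): discount every year’s costs and every year’s generation back to the present, then divide.

```python
# Illustrative only: levelized cost of electricity (LCOE) with hypothetical inputs.
def lcoe(capex, annual_opex, annual_mwh, lifetime_years, discount_rate):
    """LCOE = present value of all costs / present value of all electricity generated."""
    pv_costs = capex
    pv_energy = 0.0
    for year in range(1, lifetime_years + 1):
        discount = (1 + discount_rate) ** year
        pv_costs += annual_opex / discount
        pv_energy += annual_mwh / discount
    return pv_costs / pv_energy                       # dollars per MWh

# Hypothetical 100 MW peaker running 15 percent of the year
print(round(lcoe(capex=100e6, annual_opex=5e6, annual_mwh=100 * 8760 * 0.15,
                 lifetime_years=20, discount_rate=0.07), 2))
```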

    Selecting 2019 as their base study year, the team looked at the costs of running natural gas-fired peaker plants, which they defined as plants operating 15 percent of the year in response to gaps in intermittent renewable electricity. In addition, they determined the amount of carbon dioxide released by these plants and the expense of abating these emissions. Much of this information was publicly available.

    Coming up with prices for replacing peaker plants with massive arrays of lithium-ion batteries was also relatively straightforward: “There are no technical limitations to lithium-ion, so you can build as many as you want; but they are super expensive in terms of their footprint for energy storage and the mining required to manufacture them,” says Gençer.

    But then came the hard part: nailing down the costs of hydrogen-fired electricity generation. “The most difficult thing is finding cost assumptions for new technologies,” says Hernandez. “You can’t do this through a literature review, so we had many conversations with equipment manufacturers and plant operators.”

    The team considered two different forms of hydrogen fuel to replace natural gas, one produced through electrolyzer facilities that convert water and electricity into hydrogen, and another that reforms natural gas, yielding hydrogen and carbon waste that can be captured to reduce emissions. They also ran the numbers on retrofitting natural gas plants to burn hydrogen as opposed to building entirely new facilities. Their model includes identification of likely locations throughout the state and expenses involved in constructing these facilities.

    The researchers spent months compiling a giant dataset before setting out on the task of analysis. The results from their modeling were clear: “Hydrogen can be a more cost-effective alternative to lithium-ion batteries for peaking operations on a power grid,” says Hernandez. In addition, notes Gençer, “While certain technologies worked better in particular locations, we found that on average, reforming hydrogen rather than electrolytic hydrogen turned out to be the cheapest option for replacing peaker plants.”

    A tool for energy investors

    When he began this project, Gençer admits he “wasn’t hopeful” about hydrogen replacing natural gas in peaker plants. “It was kind of shocking to see in our different scenarios that there was a place for hydrogen.” That’s because the overall price tag for converting a fossil-fuel based plant to one based on hydrogen is very high, and such conversions likely won’t take place until more sectors of the economy embrace hydrogen, whether as a fuel for transportation or for varied manufacturing and industrial purposes.

    A nascent hydrogen production infrastructure does exist, mainly in the production of ammonia for fertilizer. But enormous investments will be necessary to expand this framework to meet grid-scale needs, driven by purposeful incentives. “With any of the climate solutions proposed today, we will need a carbon tax or carbon pricing; otherwise nobody will switch to new technologies,” says Gençer.

    The researchers believe studies like theirs could help key energy stakeholders make better-informed decisions. To that end, they have integrated their analysis into SESAME, a life cycle and techno-economic assessment tool for a range of energy systems that was developed by MIT researchers. Users can leverage this sophisticated modeling environment to compare costs of energy storage and emissions from different technologies, for instance, or to determine whether it is cost-efficient to replace a natural gas-powered plant with one powered by hydrogen.

    “As utilities, industry, and investors look to decarbonize and achieve zero-emissions targets, they have to weigh the costs of investing in low-carbon technologies today against the potential impacts of climate change moving forward,” says Hernandez, who is currently a senior associate in the energy practice at Charles River Associates. Hydrogen, he believes, will become increasingly cost-competitive as its production costs decline and markets expand.

    A study group member of MITEI’s soon-to-be published Future of Storage study, Gençer knows that hydrogen alone will not usher in a zero-carbon future. But, he says, “Our research shows we need to seriously consider hydrogen in the energy transition, start thinking about key areas where hydrogen should be used, and start making the massive investments necessary.”

    Funding for this research was provided by MITEI’s Low-Carbon Energy Centers and Future of Storage study.