More stories

  •

    When it comes to AI, can we ditch the datasets?

    Huge amounts of data are needed to train machine-learning models to perform image classification tasks, such as identifying damage in satellite photos following a natural disaster. However, these data are not always easy to come by. Datasets may cost millions of dollars to generate, if usable data exist in the first place, and even the best datasets often contain biases that negatively impact a model’s performance.

    To circumvent some of the problems presented by datasets, MIT researchers developed a method for training a machine learning model that, rather than using a dataset, uses a special type of machine-learning model to generate extremely realistic synthetic data that can train another model for downstream vision tasks.

    Their results show that a contrastive representation learning model trained using only these synthetic data is able to learn visual representations that rival or even outperform those learned from real data.

    This special machine-learning model, known as a generative model, requires far less memory to store or share than a dataset. Using synthetic data also has the potential to sidestep some concerns around privacy and usage rights that limit how some real data can be distributed. A generative model could also be edited to remove certain attributes, like race or gender, which could address some biases that exist in traditional datasets.

    “We knew that this method should eventually work; we just needed to wait for these generative models to get better and better. But we were especially pleased when we showed that this method sometimes does even better than the real thing,” says Ali Jahanian, a research scientist in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author of the paper.

    Jahanian wrote the paper with CSAIL grad students Xavier Puig and Yonglong Tian, and senior author Phillip Isola, an assistant professor in the Department of Electrical Engineering and Computer Science. The research will be presented at the International Conference on Learning Representations.

    Generating synthetic data

    Once a generative model has been trained on real data, it can generate synthetic data that are so realistic they are nearly indistinguishable from the real thing. The training process involves showing the generative model millions of images that contain objects in a particular class (like cars or cats), and then it learns what a car or cat looks like so it can generate similar objects.

    Essentially by flipping a switch, researchers can use a pretrained generative model to output a steady stream of unique, realistic images that are based on those in the model’s training dataset, Jahanian says.

    But generative models are even more useful because they learn how to transform the underlying data on which they are trained, he says. If the model is trained on images of cars, it can “imagine” how a car would look in different situations — situations it did not see during training — and then output images that show the car in unique poses, colors, or sizes.

    Having multiple views of the same image is important for a technique called contrastive learning, where a machine-learning model is shown many unlabeled images to learn which pairs are similar or different.

    The researchers connected a pretrained generative model to a contrastive learning model in a way that allowed the two models to work together automatically. The contrastive learner could tell the generative model to produce different views of an object, and then learn to identify that object from multiple angles, Jahanian explains.

    “This was like connecting two building blocks. Because the generative model can give us different views of the same thing, it can help the contrastive method to learn better representations,” he says.
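
    The pairing described above can be illustrated with a minimal sketch. It is not the researchers' implementation: `toy_generator` is a hypothetical stand-in for a pretrained generative model, the idea of making a second view by slightly perturbing the latent vector is an assumption made for illustration, and the encoder is omitted so the generator output is used directly as the embedding. The loss shown is the standard InfoNCE contrastive loss.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i] and no other."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        top = max(logits)
        # Numerically stable log-sum-exp over all candidates.
        log_sum = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum - logits[i]
    return total / len(anchors)


def toy_generator(z):
    # Hypothetical stand-in for a pretrained generative model:
    # maps a latent vector to a fake "image" (here just a vector).
    return [math.tanh(x) for x in z]


def nearby_latent(z, rng, scale=0.05):
    # Assumption for illustration: a second "view" of the same content
    # comes from a small random step in latent space.
    return [x + rng.gauss(0.0, scale) for x in z]


rng = random.Random(0)
latents = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
view_a = [toy_generator(z) for z in latents]
view_b = [toy_generator(nearby_latent(z, rng)) for z in latents]

# Correctly paired views should give a lower loss than mismatched pairs.
aligned_loss = info_nce(view_a, view_b)
shuffled_loss = info_nce(view_a, view_b[1:] + view_b[:1])
print(aligned_loss, shuffled_loss)
```

    In a real setup an encoder network would sit between the generated images and the loss, and training would adjust the encoder to push the aligned loss down.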

    Even better than the real thing

    The researchers compared their method to several other image classification models that were trained using real data and found that their method performed as well as, and sometimes better than, the other models.

    One advantage of using a generative model is that it can, in theory, create an infinite number of samples. So, the researchers also studied how the number of samples influenced the model’s performance. They found that, in some instances, generating larger numbers of unique samples led to additional improvements.

    “The cool thing about these generative models is that someone else trained them for you. You can find them in online repositories, so everyone can use them. And you don’t need to intervene in the model to get good representations,” Jahanian says.

    But he cautions that there are some limitations to using generative models. In some cases, these models can reveal source data, which can pose privacy risks, and they could amplify biases in the datasets they are trained on if they aren’t properly audited.

    He and his collaborators plan to address those limitations in future work. Another area they want to explore is using this technique to generate corner cases that could improve machine learning models. Corner cases often can’t be learned from real data. For instance, if researchers are training a computer vision model for a self-driving car, real data wouldn’t contain examples of a dog and his owner running down a highway, so the model would never learn what to do in this situation. Generating that corner case data synthetically could improve the performance of machine learning models in some high-stakes situations.

    The researchers also want to continue improving generative models so they can compose images that are even more sophisticated, he says.

    This research was supported, in part, by the MIT-IBM Watson AI Lab, the United States Air Force Research Laboratory, and the United States Air Force Artificial Intelligence Accelerator.

  •

    Transforming the travel experience for the Hong Kong airport

    MIT Hong Kong Innovation Node welcomed 33 students to its flagship program, MIT Entrepreneurship and Maker Skills Integrator (MEMSI). In this two-week hybrid bootcamp, designed to develop entrepreneurial prowess through exposure to industry-driven challenges, MIT students joined forces with Hong Kong peers to develop unique proposals for the Airport Authority of Hong Kong.

    Many airports across the world continue to be affected by the broader impact of Covid-19 with reduced air travel, prompting airlines to cut capacity. The result is a need for new business opportunities to propel economic development. For Hong Kong, the expansion toward non-aeronautical activities to boost regional consumption is therefore crucial, and included as part of the blueprint to transform the city’s airport into an airport city — characterized by capacity expansion, commercial developments, air cargo leadership, an autonomous transport system, connectivity to neighboring cities in mainland China, and evolution into a smart airport guided by sustainable practices. To enhance the customer experience, a key focus is capturing business opportunities at the nexus of digital and physical interactions. 

    These challenges “bring ideas and talent together to tackle real-world problems in the areas of digital service creation for the airport and engaging regional customers to experience the new airport city,” says Charles Sodini, the LeBel Professor of Electrical Engineering at MIT and faculty director at the Node. 

    The new travel standard

    Businesses are exploring new digital technologies, both to drive bookings and to facilitate safe travel. Developments such as Hong Kong airport’s Flight Token, a biometric technology using facial recognition to enable contactless check-ins and boarding at airports, unlock enormous potential that speeds up the departure journey of passengers. Seamless virtual experiences are not going to disappear.

    “What we may see could be a strong rebounce especially for travelers after the travel ban lifts … an opportunity to make travel easier, flying as simple as riding the bus,” says Chris Au Young, general manager of smart airport and general manager of data analytics at the Airport Authority of Hong Kong. 

    The passenger experience of the future will be “enabled by mobile technology, internet of things, and digital platforms,” he explains, adding that in the aviation community, “international organizations have already stipulated that biometric technology will be the new standard for the future … the next question is how this can be connected across airports.”  

    This extends beyond travel. As Au Young illustrates, “If you go to a concert at Asia World Expo, which is the airport’s new arena in the future, you might just simply show your face rather than queue up in a long line waiting to show your tickets.”

    Accelerating the learning curve with industry support

    Working closely with industry mentors involved in the airport city’s development, students dived deep into discussions on the future of adapted travel, interviewed and surveyed travelers, and plowed through a range of airport data to uncover business insights.

    “With the large amount of data provided, my teammates and I worked hard to identify modeling opportunities that were both theoretically feasible and valuable in a business sense,” says Sean Mann, a junior at MIT studying computer science.

    Mann and his team applied geolocation data to inform machine learning predictions on a passenger’s journey once they enter the airside area. Coupled with biometric technology, passengers can receive personalized recommendations with improved accuracy via the airport’s bespoke passenger app, powered by data collected through thousands of iBeacons dispersed across the vicinity. Armed with these insights, the aim is to enhance the user experience by driving meaningful footfall to retail shops, restaurants, and other airport amenities.

    The support of industry partners inspired his team “with their deep understanding of the aviation industry,” he added. “In a short period of two weeks, we built a proof-of-concept and a rudimentary business plan — the latter of which was very new to me.”

    Collaborating across time zones, Rumen Dangovski, a PhD candidate in electrical engineering and computer science at MIT, joined MEMSI from his home in Bulgaria. For him, learning “how to continually revisit ideas to discover important problems and meaningful solutions for a large and complex real-world system” was a key takeaway. The iterative process helped his team overcome the obstacle of narrowing down the scope of their proposal, with the help of industry mentors and advisors. 

    “Without the feedback from industry partners, we would not have been able to formulate a concrete solution that is actually helpful to the airport,” says Dangovski.  

    Beyond valuable mentorship, he adds, “there was incredible energy in our team, consisting of diverse talent, grit, discipline and organization. I was positively surprised how MEMSI can form quickly and give continual support to our team. The overall experience was very fun.”

    A sustainable future

    Mrigi Munjal, a PhD candidate studying materials science and engineering at MIT, had just taken a long-haul flight from Boston to Delhi prior to the program, and “was beginning to fully appreciate the scale of carbon emissions from aviation.” For her, “that one journey basically overshadowed all of my conscious pro-sustainability lifestyle changes,” she says.

    Knowing that international flights constitute the largest part of an individual’s carbon footprint, Munjal and her team wanted “to make flying more sustainable with an idea that is economically viable for all of the stakeholders involved.” 

    They proposed a carbon offset API that integrates into an airline’s ticket payment system, empowering individuals to take action to offset their carbon footprint, track their personal carbon history, and pick and monitor green projects. The advocacy extends to a digital display of interactive art featured in physical installations across the airport city. The intent is to raise community awareness about one’s impact on the environment and to make carbon offsetting accessible.

    Shaping the travel narrative

    Six teams of students created innovative solutions for the Hong Kong airport, which they presented in hybrid format to a panel of judges on Showcase Day. The diverse ideas included app-based airport retail recommendations supported by iBeacons; a platform that empowers customers to offset their carbon footprint; an app that connects fellow travelers for social and incentive-driven retail experiences; a travel membership exchange platform offering added flexibility to earn and redeem loyalty rewards; an interactive and gamified location-based retail experience using augmented reality; and a digital companion avatar to increase adoption of the airport’s Flight Token and improve airside passenger experience.

    Among the judges was Julian Lee ’97, former president of the MIT Club of Hong Kong and current executive director of finance at the Airport Authority of Hong Kong, who commended the students for demonstrably having “worked very thoroughly and thinking through the specific challenges,” addressing the real pain points that the airport is experiencing.

    “The ideas were very thoughtful and very unique to us. Some of you defined transit passengers as a sub-segment of the market that works. It only happens at the airport and you’ve been able to leverage this transit time in between,” remarked Lee. 

    Strong solutions include an implementation plan to see a path for execution and a viable future. Among the solutions proposed, Au Young was impressed by teams for “paying a lot of attention to the business model … a very important aspect in all the ideas generated.”  

    Addressing the students, Au Young says, “What we love is the way you reinvent the airport business and partnerships, presenting a new way of attracting people to engage more in new services and experiences — not just returning for a flight or just shopping with us, but innovating beyond the airport and using emerging technologies, using location data, using the retailer’s capability and adding some social activities in your solutions.”

    Despite today’s rapidly evolving travel industry, what remains unchanged is a focus on the customer. In the end, “it’s still about the passengers,” added Au Young.

  •

    An “oracle” for predicting the evolution of gene regulation

    Despite the sheer number of genes that each human cell contains, these so-called “coding” DNA sequences comprise just 1 percent of our entire genome. The remaining 99 percent is made up of “non-coding” DNA — which, unlike coding DNA, does not carry the instructions to build proteins.

    One vital function of this non-coding DNA, also called “regulatory” DNA, is to help turn genes on and off, controlling how much (if any) of a protein is made. Over time, as cells replicate their DNA to grow and divide, mutations often crop up in these non-coding regions — sometimes tweaking their function and changing the way they control gene expression. Many of these mutations are trivial, and some are even beneficial. Occasionally, though, they can be associated with increased risk of common diseases, such as Type 2 diabetes, or more life-threatening ones, including cancer.

    To better understand the repercussions of such mutations, researchers have been hard at work on mathematical maps that allow them to look at an organism’s genome, predict which genes will be expressed, and determine how that expression will affect the organism’s observable traits. These maps, called fitness landscapes, were conceptualized roughly a century ago to understand how genetic makeup influences one common measure of organismal fitness in particular: reproductive success. Early fitness landscapes were very simple, often focusing on a limited number of mutations. Much richer datasets are now available, but researchers still require additional tools to characterize and visualize such complex data. This ability would not only facilitate a better understanding of how individual genes have evolved over time, but would also help to predict what sequence and expression changes might occur in the future.

    In a new study published on March 9 in Nature, a team of scientists has developed a framework for studying the fitness landscapes of regulatory DNA. They created a neural network model that, when trained on hundreds of millions of experimental measurements, was capable of predicting how changes to these non-coding sequences in yeast affected gene expression. They also devised a unique way of representing the landscapes in two dimensions, making it easy to understand the past and forecast the future evolution of non-coding sequences in organisms beyond yeast — and even design custom gene expression patterns for gene therapies and industrial applications.

    “We now have an ‘oracle’ that can be queried to ask: What if we tried all possible mutations of this sequence? Or, what new sequence should we design to give us a desired expression?” says Aviv Regev, a professor of biology at MIT (on leave), core member of the Broad Institute of Harvard and MIT (on leave), head of Genentech Research and Early Development, and the study’s senior author. “Scientists can now use the model for their own evolutionary question or scenario, and for other problems like making sequences that control gene expression in desired ways. I am also excited about the possibilities for machine learning researchers interested in interpretability; they can ask their questions in reverse, to better understand the underlying biology.”

    Prior to this study, many researchers had simply trained their models on known mutations (or slight variations thereof) that exist in nature. However, Regev’s team wanted to go a step further by creating their own unbiased models capable of predicting an organism’s fitness and gene expression based on any possible DNA sequence — even sequences they’d never seen before. This would also enable researchers to use such models to engineer cells for pharmaceutical purposes, including new treatments for cancer and autoimmune disorders.

    To accomplish this goal, Eeshit Dhaval Vaishnav, a graduate student at MIT and co-first author; Carl de Boer, now an assistant professor at the University of British Columbia; and their colleagues created a neural network model to predict gene expression. They trained it on a dataset generated by inserting millions of totally random non-coding DNA sequences into yeast, and observing how each random sequence affected gene expression. They focused on a particular subset of non-coding DNA sequences called promoters, which serve as binding sites for proteins that can switch nearby genes on or off.
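
    To make the idea of querying a sequence-to-expression model concrete, here is a minimal sketch of asking "what if we tried all possible single-base mutations of this sequence?" It is an illustration only: `toy_expression_model` is a hypothetical placeholder that scores a sequence by its GC fraction, not the published neural network, and the short `promoter` string is invented for the example.

```python
BASES = "ACGT"


def single_mutants(seq):
    """Yield every sequence that differs from seq by exactly one base."""
    for i, original in enumerate(seq):
        for base in BASES:
            if base != original:
                yield seq[:i] + base + seq[i + 1:]


def toy_expression_model(seq):
    # Placeholder, NOT the published model: scores a sequence by GC fraction.
    return sum(base in "GC" for base in seq) / len(seq)


def rank_mutants(seq, predict):
    """Score each single-base mutant by predicted change relative to seq."""
    baseline = predict(seq)
    scored = [(predict(m) - baseline, m) for m in single_mutants(seq)]
    return sorted(scored, reverse=True)


promoter = "ATGCATTA"  # hypothetical 8-base sequence for illustration
ranked = rank_mutants(promoter, toy_expression_model)
for delta, mutant in ranked[:3]:
    print(mutant, delta)
```

    A sequence of length n has 3n single-base substitutions, so this kind of exhaustive query stays cheap when predictions come from a model rather than from lab experiments.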

    “This work highlights what possibilities open up when we design new kinds of experiments to generate the right data to train models,” Regev says. “In the broader sense, I believe these kinds of approaches will be important for many problems — like understanding genetic variants in regulatory regions that confer disease risk in the human genome, but also for predicting the impact of combinations of mutations, or designing new molecules.”

    Regev, Vaishnav, de Boer, and their coauthors went on to test their model’s predictive abilities in a variety of ways, in order to show how it could help demystify the evolutionary past — and possible future — of certain promoters. “Creating an accurate model was certainly an accomplishment, but, to me, it was really just a starting point,” Vaishnav explains.

    First, to determine whether their model could help with synthetic biology applications like producing antibiotics, enzymes, and food, the researchers practiced using it to design promoters that could generate desired expression levels for any gene of interest. They then scoured other scientific papers to identify fundamental evolutionary questions, in order to see if their model could help answer them. The team even went so far as to feed their model a real-world population dataset from one existing study, which contained genetic information from yeast strains around the world. In doing so, they were able to delineate thousands of years of past selection pressures that sculpted the genomes of today’s yeast.

    But, in order to create a powerful tool that could probe any genome, the researchers knew they’d need to find a way to forecast the evolution of non-coding sequences even without such a comprehensive population dataset. To address this goal, Vaishnav and his colleagues devised a computational technique that allowed them to plot the predictions from their framework onto a two-dimensional graph. This helped them show, in a remarkably simple manner, how any non-coding DNA sequence would affect gene expression and fitness, without needing to conduct any time-consuming experiments at the lab bench.

    “One of the unsolved problems in fitness landscapes was that we didn’t have an approach for visualizing them in a way that meaningfully captured the evolutionary properties of sequences,” Vaishnav explains. “I really wanted to find a way to fill that gap, and contribute to the long-standing vision of creating a complete fitness landscape.”

    Martin Taylor, a professor of genetics at the University of Edinburgh’s Medical Research Council Human Genetics Unit who was not involved in the research, says the study shows that artificial intelligence can not only predict the effect of regulatory DNA changes, but also reveal the underlying principles that govern millions of years of evolution.

    Despite the fact that the model was trained on just a fraction of yeast regulatory DNA in a few growth conditions, he’s impressed that it’s capable of making such useful predictions about the evolution of gene regulation in mammals.

    “There are obvious near-term applications, such as the custom design of regulatory DNA for yeast in brewing, baking, and biotechnology,” he explains. “But extensions of this work could also help identify disease mutations in human regulatory DNA that are currently difficult to find and largely overlooked in the clinic. This work suggests there is a bright future for AI models of gene regulation trained on richer, more complex, and more diverse datasets.”

    Even before the study was formally published, Vaishnav began receiving queries from other researchers hoping to use the model to devise non-coding DNA sequences for use in gene therapies.

    “People have been studying regulatory evolution and fitness landscapes for decades now,” Vaishnav says. “I think our framework will go a long way in answering fundamental, open questions about the evolution and evolvability of gene regulatory DNA — and even help us design biological sequences for exciting new applications.”

  •

    Computational modeling guides development of new materials

    Metal-organic frameworks, a class of materials with porous molecular structures, have a variety of possible applications, such as capturing harmful gases and catalyzing chemical reactions. Made of metal atoms linked by organic molecules, they can be configured in hundreds of thousands of different ways.

    To help researchers sift through all of the possible metal-organic framework (MOF) structures and help identify the ones that would be most practical for a particular application, a team of MIT computational chemists has developed a model that can analyze the features of a MOF structure and predict if it will be stable enough to be useful.

    The researchers hope that these computational predictions will help cut the development time of new MOFs.

    “This will allow researchers to test the promise of specific materials before they go through the trouble of synthesizing them,” says Heather Kulik, an associate professor of chemical engineering at MIT.

    The MIT team is now working to develop MOFs that could be used to capture methane gas and convert it to useful compounds such as fuels.

    The researchers described their new model in two papers, one in the Journal of the American Chemical Society and one in Scientific Data. Graduate students Aditya Nandy and Gianmarco Terrones are the lead authors of the Scientific Data paper, and Nandy is also the lead author of the JACS paper. Kulik is the senior author of both papers.

    Modeling structure

    MOFs consist of metal atoms joined by organic molecules called linkers to create a rigid, cage-like structure. The materials also have many pores, which makes them useful for catalyzing reactions involving gases but can also make them less structurally stable.

    “The limitation in seeing MOFs realized at industrial scale is that although we can control their properties by controlling where each atom is in the structure, they’re not necessarily that stable, as far as materials go,” Kulik says. “They’re very porous and they can degrade under realistic conditions that we need for catalysis.”

    Scientists have been working on designing MOFs for more than 20 years, and thousands of possible structures have been published. A centralized repository contains about 10,000 of these structures but is not linked to any of the published findings on the properties of those structures.

    Kulik, who specializes in using computational modeling to discover structure-property relationships of materials, wanted to take a more systematic approach to analyzing and classifying the properties of MOFs.

    “When people make these now, it’s mostly trial and error. The MOF dataset is really promising because there are so many people excited about MOFs, so there’s so much to learn from what everyone’s been working on, but at the same time, it’s very noisy and it’s not systematic the way it’s reported,” she says.

    Kulik and her colleagues set out to analyze published reports of MOF structures and properties using a natural-language-processing algorithm. Using this algorithm, they scoured nearly 4,000 published papers, extracting information on the temperature at which a given MOF would break down. They also pulled out data on whether particular MOFs can withstand the conditions needed to remove solvents used to synthesize them and make sure they become porous.

    Once the researchers had this information, they used it to train two neural networks to predict MOFs’ thermal stability and stability during solvent removal, based on the molecules’ structure.
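
    As a rough illustration of the text-mining step, the sketch below pulls temperatures out of sentences that mention stability or decomposition. It swaps in a plain regular expression for the natural-language-processing algorithm the researchers used, whose details are not given here; the pattern, the function name `extract_temperatures`, and the sample sentences are all assumptions made up for the example.

```python
import re

# Assumed pattern: a stability-related keyword followed, within the same
# sentence, by a number and a degrees-Celsius unit.
TEMP_PATTERN = re.compile(
    r"(?:decompos\w*|stable|collapse\w*)[^.]*?(\d{2,4})\s*°\s*C",
    re.IGNORECASE,
)


def extract_temperatures(text):
    """Return the temperatures (in °C) mentioned next to stability keywords."""
    return [int(match.group(1)) for match in TEMP_PATTERN.finditer(text)]


# Invented sample text, not taken from any real paper.
sample = (
    "The framework is thermally stable up to 350 °C under nitrogen. "
    "Decomposition of the second sample begins at 410 °C."
)
temps = extract_temperatures(sample)
print(temps)
```

    Values extracted this way would still need to be matched to specific structures and cleaned before they could serve as training labels for a stability model.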

    “Before you start working with a material and thinking about scaling it up for different applications, you want to know will it hold up, or is it going to degrade in the conditions I would want to use it in?” Kulik says. “Our goal was to get better at predicting what makes a stable MOF.”

    Better stability

    Using the model, the researchers were able to identify certain features that influence stability. In general, simpler linkers with fewer chemical groups attached to them are more stable. Pore size is also important: Before the researchers did their analysis, it had been thought that MOFs with larger pores might be too unstable. However, the MIT team found that large-pore MOFs can be stable if other aspects of their structure counteract the large pore size.

    “Since MOFs have so many things that can vary at the same time, such as the metal, the linkers, the connectivity, and the pore size, it is difficult to nail down what governs stability across different families of MOFs,” Nandy says. “Our models enable researchers to make predictions on existing or new materials, many of which have yet to be made.”

    The researchers have made their data and models available online. Scientists interested in using the models can get recommendations for strategies to make an existing MOF more stable, and they can also add their own data and feedback on the predictions of the models.

    The MIT team is now using the model to try to identify MOFs that could be used to catalyze the conversion of methane gas to methanol, which could be used as fuel. Kulik also plans to use the model to create a new dataset of hypothetical MOFs that haven’t been built before but are predicted to have high stability. Researchers could then screen this dataset for a variety of properties.

    “People are interested in MOFs for things like quantum sensing and quantum computing, all sorts of different applications where you need metals distributed in this atomically precise way,” Kulik says.

    The research was funded by DARPA, the U.S. Office of Naval Research, the U.S. Department of Energy, a National Science Foundation Graduate Research Fellowship, a Career Award at the Scientific Interface from the Burroughs Wellcome Fund, and an AAAS Marion Milligan Mason Award.

  •

    MIT ReACT welcomes first Afghan cohort to its largest-yet certificate program

    Through the championing support of the faculty and leadership of the MIT Afghan Working Group convened last September by Provost Martin Schmidt and chaired by Associate Provost for International Activities Richard Lester, MIT has come together to support displaced Afghan learners and scholars in a time of crisis. The MIT Refugee Action Hub (ReACT) has opened opportunities for 25 talented Afghan learners to participate in the hub’s certificate program in computer and data science (CDS), now in its fourth year, welcoming its largest and most diverse cohort to date — 136 learners from 29 countries.

    “Even in the face of extreme disruption, education and scholarship must continue, and MIT is committed to providing resources and safe forums for displaced scholars,” says Lester. “We greatly appreciate MIT ReACT’s work to create learning opportunities for Afghan students whose lives have been upended by the crisis in their homeland.”

    Currently, more than 3.5 million Afghans are internally displaced, while 2.5 million are registered refugees residing in other parts of the world. With millions in Afghanistan facing famine, poverty, and civil unrest in what has become the world’s largest humanitarian crisis, the United Nations predicts the number of Afghans forced to flee their homes will continue to rise. 

    “Forced displacement is on the rise, fueled not only by constant political, economical, and social turmoil worldwide, but also by the ongoing climate change crisis, which threatens costly disruptions to society and has potential to create unprecedented displacement internationally,” says associate professor of civil and environmental engineering and ReACT’s faculty founder Admir Masic. During the orientation for the new CDS cohort in January, Masic emphasized the great need for educational programs like ReACT’s that address the specific challenges refugees and displaced learners face.

    A former Bosnian refugee, Masic spent his teenage years in Croatia, where educational opportunities were limited for young people with refugee status. His experience motivated him to found ReACT, which launched in 2017. Housed within Open Learning, ReACT is an MIT-wide effort to deliver global education and professional development programs to underserved communities, including refugees and migrants. ReACT’s signature program, CDS, is a year-long, online program that combines MITx courses in programming and data science, personal and professional development workshops including MIT Bootcamps, and opportunities for practical experience.

    ReACT’s group of 25 learners from Afghanistan, 52 percent of whom are women, joins the larger CDS cohort in the program. They will receive support from their new colleagues as well as members of ReACT’s mentor and alumni network. While the majority of the group are residing around the world, including in Europe, North America, and neighboring countries, several still remain in Afghanistan. With the support of the Afghan Working Group, ReACT is working to connect with communities from the region to provide safe and inclusive learning environments for the cohort.

    Building community and confidence

    Selected from more than 1,000 applicants, the new CDS cohort reflected on their personal and professional goals during a weeklong orientation.

    “I am here because I want to change my career and learn basics in this field to then obtain networks that I wouldn’t have got if it weren’t for this program,” said Samiullah Ajmal, who is joining the program from Afghanistan.

    Interactive workshops on topics such as leadership development and virtual networking rounded out the week’s events. Members of ReACT’s greater community — which has grown in recent years to include a network of external collaborators including nonprofits, philanthropic supporters, universities, and alumni — helped facilitate these workshops and other orientation activities.

    For instance, Na’amal, a social enterprise that connects refugees to remote work opportunities, introduced the CDS learners to strategies for making career connections remotely. “We build confidence while doing,” says Susan Mulholland, a leadership and development coach with Na’amal who led the networking workshop.

    Along with the CDS program’s cohort-based model, ReACT also uses platforms that encourage regular communication between participants and with the larger ReACT network — making connections a critical component of the program.

    “I not only want to meet new people and make connections for my professional career, but I also want to test my communication and social skills,” says Pablo Andrés Uribe, a learner who lives in Colombia, describing ReACT’s emphasis on community-building. 

    Over the last two years, ReACT has expanded its geographic presence, growing from a hub in Jordan into a robust global community of many hubs, including in Colombia and Uganda. These regional sites connect talented refugees and displaced learners to internships and employment, startup networks and accelerators, and pathways to formal undergraduate and graduate education.

    This expansion is thanks to generous internal support from the MIT Office of the Provost and Associate Provost Richard Lester, and from external organizations including the Western Union Foundation. ReACT will build new hubs this year in Greece, Uruguay, and Afghanistan, as a result of gifts from the Hatsopoulos family and the Pfeffer family.

    Holding space to learn from each other

    In addition to establishing new global hubs, ReACT plans to expand its network of internship and experiential learning opportunities, increasing outreach to new collaborators such as nongovernmental organizations (NGOs), companies, and universities. Jointly with Na’amal and Paper Airplanes, a nonprofit that connects conflict-affected individuals with personal language tutors, ReACT will host the first Migration Summit. Scheduled for April 2022, the month-long global convening invites a broad range of participants, including displaced learners, universities, companies, nonprofits and NGOs, social enterprises, foundations, philanthropists, researchers, policymakers, employers, and governments, to address the key challenges and opportunities for refugee and migrant communities. The theme of the summit is “Education and Workforce Development in Displacement.”

    “The MIT Migration Summit offers a platform to discuss how new educational models, such as those employed in ReACT, can help solve emerging challenges in providing quality education and career opportunities to forcibly displaced and marginalized people around the world,” says Masic. 

    A key goal of the convening is to center the voices of those most directly impacted by displacement, such as ReACT’s learners from Afghanistan and elsewhere, in solution-making.

  • in

    MIT Center for Real Estate launches the Asia Real Estate Initiative

    To appreciate the explosive urbanization taking place in Asia, consider this analogy: Every 40 days, a city equivalent in size to Boston is built in Asia. Of the $24.7 trillion in real estate investment opportunities predicted by 2030 in emerging cities, $17.8 trillion (72 percent) will be in Asia. While this growth is exciting to the real estate industry, it brings with it attendant social and environmental issues.

    To promote a sustainable and innovative approach to this growth, leadership at the MIT Center for Real Estate (MIT CRE) recently established the Asia Real Estate Initiative (AREI), which aims to become a platform for industry leaders, entrepreneurs, and the academic community to find solutions to the practical concerns of real estate development across these countries.

    “Behind the creation of this initiative is the understanding that Asia is a living lab for the study of future global urban development,” says Hashim Sarkis, dean of the MIT School of Architecture and Planning.

    An investment in cities of the future

    One of the areas in AREI’s scope of focus is connecting sustainability and technology in real estate.

    “We believe the real estate sector should work cooperatively with the energy, science, and technology sectors to solve the climate challenges,” says Richard Lester, the Institute’s associate provost for international activities. “AREI will engage academics and industry leaders, nongovernment organizations, and civic leaders globally and in Asia, to advance sharing knowledge and research.”

    In its effort to understand how trends and new technologies will impact the future of real estate, AREI has received initial support from a prominent alumnus of MIT CRE who wishes to remain anonymous. The gift will support a cohort of researchers working on innovative technologies applicable to advancing real estate sustainability goals, with a special focus on the global and Asia markets. The call for applications is already under way, with AREI seeking to collaborate with scholars who have backgrounds in economics, finance, urban planning, technology, engineering, and other disciplines.

    “The research on real estate sustainability and technology could transform this industry and help invent global real estate of the future,” says Professor Siqi Zheng, faculty director of MIT CRE and AREI faculty chair. “The pairing of real estate and technology often leads to innovative and differential real estate development strategies such as buildings that are green, smart, and healthy.”

    The initiative arrives at a key time to make a significant impact and cement a leadership role in real estate development across Asia. MIT CRE is positioned to help the industry increase its efficiency and social responsibility, with nearly 40 years of pioneering research in the field. Zheng, an established scholar with expertise on urban growth in fast-urbanizing regions, is the former president of the Asia Real Estate Society and sits on the board of the American Real Estate and Urban Economics Association. Her research has been supported by international institutions including the World Bank, the Asian Development Bank, and the Lincoln Institute of Land Policy.

    “The researchers in AREI are now working on three interrelated themes: the future of real estate and live-work-play dynamics; connecting sustainability and technology in real estate; and innovations in real estate finance and business,” says Zheng.

    The first theme has already yielded a book — “Toward Urban Economic Vibrancy: Patterns and Practices in Asia’s New Cities” — recently published by SA+P Press.

    Engaging thought leaders and global stakeholders

    AREI also plans to collaborate with counterparts in Asia to contribute to research, education, and industry dialogue to meet the challenges of sustainable city-making across the continent and identify areas for innovation. Traditionally, real estate has been a very local business with a lengthy value chain, according to Zhengzhen Tan, director of AREI. Most developers have focused their careers on one particular product type in one particular regional market. AREI is working to change that dynamic.

    “We want to create a cross-border dialogue within Asia and among Asia, North America, and European leaders to exchange knowledge and practices,” says Tan. “The real estate industry’s learning costs are very high compared to other sectors. Collective learning will reduce the cost of failure and have a significant impact on these global issues.”

    The 2021 United Nations Climate Change Conference in Glasgow shed additional light on environmental commitments being made by governments in Asia. With real estate representing 40 percent of global greenhouse gas emissions, the Asian real estate market is undergoing an urgent transformation to deliver on this commitment.

    “One of the most pressing calls is to get to net-zero emissions for real estate development and operation,” says Tan. “Real estate investors and developers are making short- and long-term choices that are locking in environmental footprints for the ‘decisive decade.’ We hope to inspire developers and investors to think differently and get out of their comfort zone.”

  • in

    Unlocking new doors to artificial intelligence

    Artificial intelligence research is constantly developing new hypotheses that have the potential to benefit society and industry; however, sometimes these benefits are not fully realized due to a lack of engineering tools. To help bridge this gap, graduate students in the MIT Department of Electrical Engineering and Computer Science’s 6-A Master of Engineering (MEng) Thesis Program work with some of the most innovative companies in the world and collaborate on cutting-edge projects, while contributing to and completing their MEng thesis.

    During a portion of the last year, four 6-A MEng students teamed up and completed an internship with IBM Research’s advanced prototyping team through the MIT-IBM Watson AI Lab on AI projects, often developing web applications to solve real-world issues or business use cases. Here, the students worked alongside AI engineers, user experience engineers, full-stack researchers, and generalists to accommodate project requests and receive thesis advice, says Lee Martie, IBM research staff member and 6-A manager. The students’ projects ranged from generating synthetic data to allow for privacy-sensitive data analysis to using computer vision to identify actions in video, which allows for monitoring human safety and tracking build progress on a construction site.

    “I appreciated all of the expertise from the team and the feedback,” says 6-A graduate Violetta Jusiega ’21, who participated in the program. “I think that working in industry gives the lens of making sure that the project’s needs are satisfied and [provides the opportunity] to ground research and make sure that it is helpful for some use case in the future.”

    Jusiega’s research intersected the fields of computer vision and design to focus on data visualization and user interfaces for the medical field. Working with IBM, she built an application programming interface (API) that let clinicians interact with a medical treatment strategy AI model, which was deployed in the cloud. Her interface provided a medical decision tree, as well as some prescribed treatment plans. After receiving feedback on her design from physicians at a local hospital, Jusiega developed iterations of the API and how the results were displayed, visually, so that it would be user-friendly and understandable for clinicians, who don’t usually code. She says that, “these tools are often not acquired into the field because they lack some of these API principles which become more important in an industry where everything is already very fast paced, so there’s little time to incorporate a new technology.” But this project might eventually allow for industry deployment. “I think this application has a bunch of potential, whether it does get picked up by clinicians or whether it’s simply used in research. It’s very promising and very exciting to see how technology can help us modify, or I can improve, the health-care field to be even more custom-tailored towards patients and giving them the best care possible,” she says.

    Another 6-A graduate student, Spencer Compton, was also considering aiding professionals to make more informed decisions, for use in settings including health care, but he was tackling it from a causal perspective. When given a set of related variables, Compton was investigating if there was a way to determine not just correlation, but the cause-and-effect relationship between them (the direction of the interaction) from the data alone. For this, he and his collaborators from IBM Research and Purdue University turned to a field of math called information theory. With the goal of designing an algorithm to learn complex networks of causal relationships, Compton used ideas relating to entropy, the randomness in a system, to help determine if a causal relationship is present and how variables might be interacting. “When judging an explanation, people often default to Occam’s razor,” says Compton. “We’re more inclined to believe a simpler explanation than a more complex one.” In many cases, he says, the approach seemed to perform well. For instance, they were able to consider variables such as lung cancer, pollution, and X-ray findings. He was pleased that his research allowed him to help create a framework of “entropic causal inference” that could aid in safe and smart decisions in the future, in a satisfying way. “The math is really surprisingly deep, interesting, and complex,” says Compton. “We’re basically asking, ‘when is the simplest explanation correct?’ but as a math question.”
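
    To make the notion of entropy concrete, here is a minimal sketch in Python that computes the Shannon entropy of a discrete distribution. It illustrates only the general quantity and is not the researchers' algorithm; the function name and the example distributions are assumptions chosen for illustration.

```python
import math


def shannon_entropy(probs):
    """Return the Shannon entropy, in bits, of a discrete distribution.

    Illustrative helper only (hypothetical name, not from the research).
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with zero probability contribute nothing to the sum.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


# Made-up distributions: the more predictable the outcome, the lower the entropy.
fair_coin = [0.5, 0.5]     # maximally random for two outcomes: 1 bit
biased_coin = [0.9, 0.1]   # more predictable: less than 1 bit
certain = [1.0, 0.0]       # no randomness at all: 0 bits

for name, dist in [("fair", fair_coin), ("biased", biased_coin), ("certain", certain)]:
    print(name, shannon_entropy(dist))
```

    In this simplified picture, a "simpler" explanation in the Occam's razor sense corresponds to one that needs less added randomness, that is, lower entropy.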

    Determining relationships within data can sometimes require large volumes of it to suss out patterns, but for data that may contain sensitive information, this may not be available. For her master’s work, Ivy Huang worked with IBM Research to generate synthetic tabular data using a natural language processing tool called a transformer model, which can learn and predict future values from past values. Trained on real data, the model can produce new data with similar patterns, properties, and relationships without restrictions like privacy, availability, and access that might come with real data in financial transactions and electronic medical records. Further, she created an API and deployed the model in an IBM cluster, which gave users greater access to the model and the ability to query it without compromising the original data.

    Working with the advanced prototyping team, MEng candidate Brandon Perez also considered how to gather and investigate data with restrictions, but in his case it was to use computer vision frameworks, centered on an action recognition model, to identify construction site happenings. The team based their work on the Moments in Time dataset, which contains over a million three-second video clips with about 300 attached classification labels, and has performed well during AI training. However, the group needed more construction-based video data. For this, they used YouTube-8M. Perez built a framework for testing and fine-tuning existing object detection models and action recognition models that could plug into an automatic spatial and temporal localization tool — how they would identify and label particular actions in a video timeline. “I was satisfied that I was able to explore what made me curious, and I was grateful for the autonomy that I was given with this project,” says Perez. “I felt like I was always supported, and my mentor was a great support to the project.”

    “The kind of collaborations that we have seen between our MEng students and IBM researchers are exactly what the 6-A MEng Thesis program at MIT is all about,” says Tomas Palacios, professor of electrical engineering and faculty director of the MIT 6-A MEng Thesis program. “For more than 100 years, 6-A has been connecting MIT students with industry to solve together some of the most important problems in the world.”

  • in

    3 Questions: Fotini Christia on racial equity and data science

    Fotini Christia is the Ford International Professor in the Social Sciences in the Department of Political Science, associate director of the Institute for Data, Systems, and Society (IDSS), and director of the Sociotechnical Systems Research Center (SSRC). Her research interests include issues of conflict and cooperation in the Muslim world, and she has conducted fieldwork in Afghanistan, Bosnia, Iran, the Palestinian Territories, Syria, and Yemen. She has co-organized the IDSS Research Initiative on Combatting Systemic Racism (ICSR), which works to bridge the social sciences, data science, and computation by bringing researchers from these disciplines together to address systemic racism across housing, health care, policing, education, employment, and other sectors of society.

    Q: What is the IDSS/ICSR approach to systemic racism research?

    A: The Research Initiative on Combatting Systemic Racism (ICSR) aims to seed and coordinate cross-disciplinary research to identify and overcome racially discriminatory processes and outcomes across a range of U.S. institutions and policy domains.

    Building off the extensive social science literature on systemic racism, the focus of this research initiative is to use big data to develop and harness computational tools that can help effect structural and normative change toward racial equity.

    The initiative aims to create a visible presence at MIT for cutting-edge computational research with a racial equity lens, across societal domains that will attract and train students and scholars.

    The steering committee for this research initiative is composed of underrepresented minority faculty members from across MIT’s five schools and the MIT Schwarzman College of Computing. Members will serve as close advisors to the initiative as well as share the findings of our work beyond MIT’s campus. MIT Chancellor Melissa Nobles heads this committee.

    Q: What role can data science play in helping to effect change toward racial equity?

    A: Existing work has shown racial discrimination in the job market, in the criminal justice system, as well as in education, health care, and access to housing, among other places. It has also underlined how algorithms could further entrench such bias — be it in training data or in the people who build them. Data science tools can not only help identify, but also contribute to, proposing fixes on racially inequitable outcomes that result from implicit or explicit biases in governing institutional practices in the public and private sector, and more recently from the use of AI and algorithmic methods in decision-making.

    To that effect, this initiative will produce research that explores and collects the relevant big data across domains, while paying attention to the ways such data are collected, and focus on improving and developing data-driven computational tools to address racial disparities in structures and institutions that have reproduced racially discriminatory outcomes in American society.

    The strong correlation between race, class, educational attainment, and various attitudes and behaviors in the American context can make it extremely difficult to rule out the influence of confounding factors. Thus, a key motivation for our research initiative is to highlight the importance of causal analysis using computational methods, and focus on understanding the opportunities of big data and algorithmic decision-making to address racial inequities and promote racial justice — beyond de-biasing algorithms. The intent is to also codify methodologies on equity-informed research practices and produce tools that are clear on the quantifiable expected social costs and benefits, as well as on the downstream effects on systemic racism more broadly.

    Q: What are some ways that the ICSR might conduct or follow up on research seeking real-world impact or policy change?

    A: This type of research has ethical and societal considerations at its core, especially as they pertain to historically disadvantaged groups in the U.S., and will be coordinated with and communicated to local stakeholders to drive relevant policy decisions. This initiative intends to establish connections to URM [underrepresented minority] researchers and students at underrepresented universities and to directly collaborate with them on these research efforts. To that effect, we are leveraging existing programs such as the MIT Summer Research Program (MSRP).

    To ensure that our research targets the right problems bringing a racial equity lens with an interest to effect policy change, we will also connect with community organizations in minority neighborhoods who often bear the brunt of the direct and indirect effects of systemic racism, as well as with local government offices that work to address inequity in service provision in these communities. Our intent is to directly engage IDSS students with these organizations to help develop and test algorithmic tools for racial equity.