More stories

  • in

    “We offer another place for knowledge”

    In the Dzaleka Refugee Camp in Malawi, Jospin Hassan didn’t have access to the education opportunities he sought. So, he decided to create his own. 

    Hassan knew the booming fields of data science and artificial intelligence could bring job opportunities to his community and help solve local challenges. After earning a spot in the 2020-21 cohort of the Certificate Program in Computer and Data Science from MIT Refugee Action Hub (ReACT), Hassan started sharing MIT knowledge and skills with other motivated learners in Dzaleka.

    MIT ReACT is now Emerging Talent, part of the Jameel World Education Lab (J-WEL) at MIT Open Learning. Currently serving its fifth cohort of global learners, Emerging Talent’s year-long certificate program incorporates high-quality computer science and data analysis coursework from MITx, professional skill building, experiential learning, apprenticeship work, and opportunities for networking with MIT’s global community of innovators. Hassan’s cohort honed their leadership skills through interactive online workshops with J-WEL and the 10-week online MIT Innovation Leadership Bootcamp. 

    “My biggest takeaway was networking, collaboration, and learning from each other,” Hassan says.

    Today, Hassan’s organization ADAI Circle offers mentorship and education programs for youth and other job seekers in the Dzaleka Refugee Camp. The curriculum encourages hands-on learning and collaboration.

    Launched in 2020, ADAI Circle aims to foster job creation and reduce poverty in Malawi through technology and innovation. In addition to their classes in data science, AI, software development, and hardware design, their Innovation Hub offers internet access to anyone in need. 

    Doing something different in the community

    Hassan first had the idea for his organization in 2018 when he reached a barrier in his own education journey. There were several programs in the Dzaleka Refugee Camp teaching learners how to code websites and mobile apps, but Hassan felt that they were limited in scope. 

    “We had good devices and internet access,” he says, “but I wanted to learn something new.” 

    Hassan teamed up with co-founder Patrick Byamasu, and the two set their sights on the longevity of AI and how it might create more jobs for people in their community. “The world is changing every day, and data scientists are in a higher demand today in various companies,” Hassan says. “For this reason, I decided to expand and share the knowledge that I acquired with my fellow refugees and the surrounding villages.”

    ADAI Circle draws inspiration from Hassan’s own experience with MIT Emerging Talent coursework, community, and training opportunities. For example, the MIT Bootcamps model is now standard practice for ADAI Circle’s annual hackathon. Hassan first introduced the hackathon to ADAI Circle students as part of his final experiential learning project of the Emerging Talent certificate program. 

    ADAI Circle’s annual hackathon is now an interactive — and effective — way to select students who will most benefit from its programs. The local schools’ curricula, Hassan says, might not provide enough of an academic challenge. “We can’t teach everyone and accommodate everyone because there are a lot of schools,” Hassan says, “but we offer another place for knowledge.” 

    The hackathon helps students develop data science and robotics skills. Before they start coding, students have to convince ADAI Circle teachers that their designs are viable, answering questions like, “What problem are you solving?” and “How will this help the community?” A community-oriented mindset is just as important to the curriculum as technical skill.

    In addition to the practical skills Hassan gained from Emerging Talent, he leveraged the program’s network to help his community. Thanks to a social media connection Hassan made with the nongovernmental organization Give Internet after one of Emerging Talent’s virtual events, Give Internet brought internet access to ADAI Circle.

    Bridging the AI gap to unmet communities

    In 2023, ADAI Circle connected with another MIT Open Learning program, Responsible AI for Social Empowerment and Education (RAISE), which led to a pilot test of a project-based AI curriculum for middle school students. The Responsible AI for Computational Action (RAICA) curriculum equipped ADAI Circle students with AI skills for chatbots and natural language processing. 

    “I liked that program because it was based on what we’re teaching at the center,” Hassan says, speaking of his organization’s mission of bridging the AI gap to reach unmet communities.

    The RAICA curriculum was designed by education experts at MIT Scheller Teacher Education Program (STEP Lab) and AI experts from MIT Personal Robots group and MIT App Inventor. ADAI Circle teachers gave detailed feedback about the pilot to the RAICA team. During weekly meetings with Glenda Stump, education research scientist for RAICA and J-WEL, and Angela Daniel, teacher development specialist for RAICA, the teachers discussed their experiences, prepared for upcoming lessons, and translated the learning materials in real time. 

    “We are trying to create a curriculum that’s accessible worldwide and to students who typically have little or no access to technology,” says Mary Cate Gustafson-Quiett, curriculum design manager at STEP Lab and project manager for RAICA. “Working with ADAI and students in a refugee camp challenged us to design in more culturally and technologically inclusive ways.”

    Gustafson-Quiett says the curriculum feedback from ADAI Circle helped inform how RAICA delivers teacher development resources to accommodate learning environments with limited internet access. “They also exposed places where our team’s western ideals, specifically around individualism, crept into activities in the lesson and contrasted with their more communal cultural beliefs,” she says.

    Eager to introduce more MIT-developed AI resources, Hassan also shared MIT RAISE’s Day of AI curricula with ADAI Circle teachers. The new ChatGPT module gave students the chance to level up the chatbot programming skills they gained from the RAICA module. Some of the advanced students are taking the initiative to use the ChatGPT API to create their own projects in education.

    “We don’t want to tell them what to do, we want them to come up with their own ideas,” Hassan says.

    Although ADAI Circle faces many challenges, Hassan says his team is addressing them one by one. Last year, they didn’t have electricity in their Innovation Hub, but they solved that. This year, they achieved a stable internet connection that’s one of the fastest in Malawi. Next up, they are hoping to secure more devices for their students, create more jobs, and add additional hubs throughout the community. The work is never done, but Hassan is starting to see the impact that ADAI Circle is making. 

    “For those who want to learn data science, let’s let them learn,” Hassan says.

  • in

    Day of AI curriculum meets the moment

    MIT Responsible AI for Social Empowerment and Education (RAISE) recently celebrated the second annual Day of AI with two flagship local events. The Edward M. Kennedy Institute for the U.S. Senate in Boston hosted a human rights and data policy-focused event that was streamed worldwide. Dearborn STEM Academy in Roxbury, Massachusetts, hosted a student workshop in collaboration with Amazon Future Engineer. With over 8,000 registrations across all 50 U.S. states and 108 countries in 2023, participation in Day of AI has more than doubled since its inaugural year.

    Day of AI is a free curriculum of lessons and hands-on activities, created by researchers at MIT RAISE, that teaches kids of all ages and backgrounds the basics and responsible use of artificial intelligence. This year, resources were available for educators to run at any time and in any increments they chose. The curriculum included five new modules to address timely topics like ChatGPT in School, Teachable Machines, AI and Social Media, Data Science and Me, and more. A collaboration with the International Society for Technology in Education also introduced modules for early elementary students. Educators across the world shared photos, videos, and stories of their students’ engagement, expressing excitement and even relief over the accessible lessons.

    Professor Cynthia Breazeal, director of RAISE, dean for digital learning at MIT, and head of the MIT Media Lab’s Personal Robots research group, said, “It’s been a year of extraordinary advancements in AI, and with that comes necessary conversations and concerns about who and what this technology is for. With our Day of AI events, we want to celebrate the teachers and students who are putting in the work to make sure that AI is for everyone.”

    Reflecting community values and protecting digital citizens

    On May 18, 2023, MIT RAISE hosted a global Day of AI celebration featuring a flagship local event focused on human rights and data policy at the Edward M. Kennedy Institute for the U.S. Senate. Students from the Warren Prescott Middle School and New Mission High School heard from speakers from the City of Boston, Liberty Mutual, and MIT to discuss the many benefits and challenges of artificial intelligence education. Video: MIT Open Learning

    MIT President Sally Kornbluth welcomed students from Warren Prescott Middle School and New Mission High School to the Day of AI program at the Edward M. Kennedy Institute. Kornbluth reflected on the exciting potential of AI, along with the ethical considerations society needs to be responsible for.

    “AI has the potential to do all kinds of fantastic things, including driving a car, helping us with the climate crisis, improving health care, and designing apps that we can’t even imagine yet. But what we have to make sure it doesn’t do is cause harm to individuals, to communities, to us — society as a whole,” she said.

    This theme resonated with each of the event speakers, whose jobs spanned the sectors of education, government, and business. Yo Deshpande, technologist for the public realm, and Michael Lawrence Evans, program director of new urban mechanics from the Boston Mayor’s Office, shared how Boston thinks about using AI to improve city life in ways that are “equitable, accessible, and delightful.” Deshpande said, “We have the opportunity to explore not only how AI works, but how using AI can line up with our values, the way we want to be in the world, and the way we want to be in our community.”

    Adam L’Italien, chief innovation officer at Liberty Mutual Insurance (one of Day of AI’s founding sponsors), compared our present moment with AI technologies to the early days of personal computers and internet connection. “Exposure to emerging technologies can accelerate progress in the world and in your own lives,” L’Italien said, while recognizing that the AI development process needs to be inclusive and mitigate biases.

    Human policies for artificial intelligence

    So how does society address these human rights concerns about AI? Marc Aidinoff ’21, former White House Office of Science and Technology Policy chief of staff, led a discussion on how government policy can influence the parameters of how technology is developed and used, like the Blueprint for an AI Bill of Rights. Aidinoff said, “The work of building the world you want to see is far harder than building the technical AI system … How do you work with other people and create a collective vision for what we want to do?” Warren Prescott Middle School students described how AI could be used to solve problems that humans couldn’t. But they also shared their concerns that AI could affect data privacy, learning deficits, social media addiction, job displacement, and propaganda.

    In a mock U.S. Senate trial activity designed by Daniella DiPaola, PhD student at the MIT Media Lab, the middle schoolers investigated what rights might be undermined by AI in schools, hospitals, law enforcement, and corporations. Meanwhile, New Mission High School students workshopped the ideas behind bill S.2314, the Social Media Addiction Reduction Technology (SMART) Act, in an activity designed by Raechel Walker, graduate research assistant in the Personal Robots Group, and Matt Taylor, research assistant at the Media Lab. They discussed what level of control could or should be introduced at the parental, educational, and governmental levels to reduce the risks of internet addiction.

    “Alexa, how do I program AI?”

    The 2023 Day of AI celebration featured a flagship local event at the Dearborn STEM Academy in Roxbury in collaboration with Amazon Future Engineer. Students participated in a hands-on activity using MIT App Inventor as part of Day of AI’s Alexa lesson. Video: MIT Open Learning

    At Dearborn STEM Academy, Amazon Future Engineer helped students work through the Intro to Voice AI curriculum module in real time. Students used MIT App Inventor to code basic commands for Alexa. In an interview with WCVB, Principal Darlene Marcano said, “It’s important that we expose our students to as many different experiences as possible. The students that are participating are on track to be future computer scientists and engineers.”

    Breazeal told Dearborn students, “We want you to have an informed voice about how you want AI to be used in society. We want you to feel empowered that you can shape the world. You can make things with AI to help make a better world and a better community.”

    Rohit Prasad ’08, senior vice president and head scientist for Alexa at Amazon, and Victor Reinoso ’97, global director of philanthropic education initiatives at Amazon, also joined the event. “Amazon and MIT share a commitment to helping students discover a world of possibilities through STEM and AI education,” said Reinoso. “There’s a lot of current excitement around the technological revolution with generative AI and large language models, so we’re excited to help students explore careers of the future and navigate the pathways available to them.” To highlight their continued investment in the local community and the school program, Amazon donated a $25,000 Innovation and Early College Pathways Program Grant to the Boston Public School system.

    Day of AI down under

    Not only was the Day of AI program widely adopted across the globe, but Australian educators were also inspired to adapt it into their own regionally specific curriculum. An estimated 161,000 AI professionals will be needed in Australia by 2030, according to the National Artificial Intelligence Center in the Commonwealth Scientific and Industrial Research Organization (CSIRO), an Australian government agency and Day of AI Australia project partner. CSIRO worked with the University of New South Wales to develop supplementary educational resources on AI ethics and machine learning. Day of AI Australia reached 85,000 students at 400-plus secondary schools this year, sparking curiosity in the next generation of AI experts.

    The interest in AI is accelerating as fast as the technology is being developed. Day of AI offers a unique opportunity for K-12 students to shape our world’s digital future and their own.

    “I hope that some of you will decide to be part of this bigger effort to help us figure out the best possible answers to questions that are raised by AI,” Kornbluth told students at the Edward M. Kennedy Institute. “We’re counting on you, the next generation, to learn how AI works and help make sure it’s for everyone.”

  • in

    Researchers create a tool for accurately simulating complex systems

    Researchers often use simulations when designing new algorithms, since testing ideas in the real world can be both costly and risky. But since it’s impossible to capture every detail of a complex system in a simulation, they typically collect a small amount of real data that they replay while simulating the components they want to study.

    Known as trace-driven simulation (the small pieces of real data are called traces), this method sometimes results in biased outcomes. This means researchers might unknowingly choose an algorithm that is not the best one they evaluated, and which will perform worse on real data than the simulation predicted.

    MIT researchers have developed a new method that eliminates this source of bias in trace-driven simulation. By enabling unbiased trace-driven simulations, the new technique could help researchers design better algorithms for a variety of applications, including improving video quality on the internet and increasing the performance of data processing systems.

    The researchers’ machine-learning algorithm draws on the principles of causality to learn how the data traces were affected by the behavior of the system. In this way, they can replay the correct, unbiased version of the trace during the simulation.

    When compared to a previously developed trace-driven simulator, the researchers’ simulation method correctly predicted which newly designed algorithm would be best for video streaming — meaning the one that led to less rebuffering and higher visual quality. Existing simulators that do not account for bias would have pointed researchers to a worse-performing algorithm.

    “Data are not the only thing that matter. The story behind how the data are generated and collected is also important. If you want to answer a counterfactual question, you need to know the underlying data generation story so you only intervene on those things that you really want to simulate,” says Arash Nasr-Esfahany, an electrical engineering and computer science (EECS) graduate student and co-lead author of a paper on this new technique.

    He is joined on the paper by co-lead authors and fellow EECS graduate students Abdullah Alomar and Pouya Hamadanian; recent graduate student Anish Agarwal PhD ’21; and senior authors Mohammad Alizadeh, an associate professor of electrical engineering and computer science; and Devavrat Shah, the Andrew and Erna Viterbi Professor in EECS and a member of the Institute for Data, Systems, and Society and of the Laboratory for Information and Decision Systems. The research was recently presented at the USENIX Symposium on Networked Systems Design and Implementation.

    Specious simulations

    The MIT researchers studied trace-driven simulation in the context of video streaming applications.

    In video streaming, an adaptive bitrate algorithm continually decides the video quality, or bitrate, to transfer to a device based on real-time data on the user’s bandwidth. To test how different adaptive bitrate algorithms impact network performance, researchers can collect real data from users during a video stream for a trace-driven simulation.

    They use these traces to simulate what would have happened to network performance had the platform used a different adaptive bitrate algorithm in the same underlying conditions.
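
    A trace-driven replay of this kind can be sketched in a few lines. The example below is a minimal illustration, not the researchers' simulator: the bitrate ladder, chunk length, buffer model, and both policies are hypothetical. Note that it takes each recorded throughput value as given no matter which bitrate the policy picks, which is exactly the exogeneity assumption discussed next.

```python
from typing import Callable, Dict, List

BITRATES_MBPS = [0.5, 1.0, 2.5, 5.0]  # hypothetical bitrate ladder
CHUNK_SECONDS = 4.0                   # hypothetical playback length of one chunk


def rate_based_policy(last_throughput: float, safety: float = 0.8) -> float:
    """Pick the highest bitrate that fits within a fraction of the last throughput."""
    budget = last_throughput * safety
    candidates = [b for b in BITRATES_MBPS if b <= budget]
    return max(candidates) if candidates else BITRATES_MBPS[0]


def always_highest_policy(last_throughput: float) -> float:
    """Ignore the network and always request the top bitrate."""
    return BITRATES_MBPS[-1]


def replay(trace_mbps: List[float], policy: Callable[[float], float]) -> Dict[str, float]:
    """Replay a recorded throughput trace (one value per chunk) under a policy.

    Assumes the trace is exogenous: the recorded throughput is reused
    unchanged regardless of the bitrate the policy chooses.
    """
    buffer_s = 0.0
    stall_s = 0.0
    chosen = []
    last_tp = trace_mbps[0]
    for tp in trace_mbps:
        bitrate = policy(last_tp)
        download_s = bitrate * CHUNK_SECONDS / tp  # megabits / (megabits per second)
        if download_s > buffer_s:
            stall_s += download_s - buffer_s       # playback waits for the chunk
            buffer_s = 0.0
        else:
            buffer_s -= download_s
        buffer_s += CHUNK_SECONDS                  # the new chunk joins the buffer
        chosen.append(bitrate)
        last_tp = tp
    return {"stall_s": stall_s, "mean_bitrate": sum(chosen) / len(chosen)}


trace = [4.0, 4.0, 1.0, 1.0, 4.0]  # made-up throughput samples in Mbps
adaptive = replay(trace, rate_based_policy)
greedy = replay(trace, always_highest_policy)
print(adaptive, greedy)
```

    On this made-up trace, the adaptive policy stalls far less than the greedy one, but that conclusion is only as trustworthy as the assumption that the trace would have looked the same under either policy.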

    Researchers have traditionally assumed that trace data are exogenous, meaning they aren’t affected by factors that are changed during the simulation. They would assume that, during the period when they collected the network performance data, the choices the bitrate adaptation algorithm made did not affect those data.

    But this is often a false assumption that results in biases about the behavior of new algorithms, making the simulation invalid, Alizadeh explains.

    “We recognized, and others have recognized, that this way of doing simulation can induce errors. But I don’t think people necessarily knew how significant those errors could be,” he says.

    To develop a solution, Alizadeh and his collaborators framed the issue as a causal inference problem. To collect an unbiased trace, one must understand the different causes that affect the observed data. Some causes are intrinsic to a system, while others are affected by the actions being taken.

    In the video streaming example, network performance is affected by the choices the bitrate adaptation algorithm made — but it’s also affected by intrinsic elements, like network capacity.

    “Our task is to disentangle these two effects, to try to understand what aspects of the behavior we are seeing are intrinsic to the system and how much of what we are observing is based on the actions that were taken. If we can disentangle these two effects, then we can do unbiased simulations,” he says.

    Learning from data

    But researchers often cannot directly observe intrinsic properties. This is where the new tool, called CausalSim, comes in. The algorithm can learn the underlying characteristics of a system using only the trace data.

    CausalSim takes trace data that were collected through a randomized control trial, and estimates the underlying functions that produced those data. The model tells the researchers, under the exact same underlying conditions that a user experienced, how a new algorithm would change the outcome.
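
    The role of randomization can be illustrated with a toy model. The sketch below is not CausalSim's estimator. It assumes a deliberately simple, hypothetical relationship in which observed throughput equals a hidden network capacity multiplied by a policy-dependent factor. Because policies are assigned at random, the hidden capacity has the same distribution in every arm, so the ratio of average observations recovers the policy factor, and each trace can then be rescaled to what it would have been under another policy.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical ground truth: how each policy scales the throughput a user observes.
TRUE_EFFECT = {"A": 1.0, "B": 0.6}


def collect_rct(n):
    """Simulate a randomized trial; capacity is latent and normally never observed."""
    rows = []
    for _ in range(n):
        capacity = random.uniform(2.0, 10.0)  # intrinsic to the network
        policy = random.choice(["A", "B"])    # randomized assignment
        observed = capacity * TRUE_EFFECT[policy]
        rows.append((policy, observed, capacity))
    return rows


def estimate_effects(rows, reference="A"):
    """Estimate each policy's factor relative to the reference arm."""
    by_policy = {}
    for policy, observed, _ in rows:
        by_policy.setdefault(policy, []).append(observed)
    ref_mean = mean(by_policy[reference])
    return {p: mean(values) / ref_mean for p, values in by_policy.items()}


def counterfactual(observed, old_policy, new_policy, effects):
    """Rescale one observed value to what another policy would have produced."""
    return observed * effects[new_policy] / effects[old_policy]


rows = collect_rct(20000)
effects = estimate_effects(rows)
print(effects)
```

    A naive replay would reuse a value observed under policy A unchanged when simulating policy B. The rescaled value accounts for the part of the observation that the policy itself caused.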

    Using a typical trace-driven simulator, bias might lead a researcher to select a worse-performing algorithm, even though the simulation indicates it should be better. CausalSim helps researchers select the best algorithm that was tested.

    The MIT researchers observed this in practice. When they used CausalSim to design an improved bitrate adaptation algorithm, it led them to select a new variant that had a stall rate that was nearly 1.4 times lower than a well-accepted competing algorithm, while achieving the same video quality. The stall rate is the amount of time a user spent rebuffering the video.

    By contrast, an expert-designed trace-driven simulator predicted the opposite. It indicated that this new variant should cause a stall rate that was nearly 1.3 times higher. The researchers tested the algorithm on real-world video streaming and confirmed that CausalSim was correct.

    “The gains we were getting in the new variant were very close to CausalSim’s prediction, while the expert simulator was way off. This is really exciting because this expert-designed simulator has been used in research for the past decade. If CausalSim can so clearly be better than this, who knows what we can do with it?” says Hamadanian.

    During a 10-month experiment, CausalSim consistently improved simulation accuracy, resulting in algorithms that made about half as many errors as those designed using baseline methods.

    In the future, the researchers want to apply CausalSim to situations where randomized control trial data are not available or where it is especially difficult to recover the causal dynamics of the system. They also want to explore how to design and monitor systems to make them more amenable to causal analysis.

  • in

    A new chip for decoding data transmissions demonstrates record-breaking energy efficiency

    Imagine using an online banking app to deposit money into your account. Like all information sent over the internet, those communications could be corrupted by noise that inserts errors into the data.

    To overcome this problem, senders encode data before they are transmitted, and then a receiver uses a decoding algorithm to correct errors and recover the original message. In some instances, data are received with reliability information that helps the decoder figure out which parts of a transmission are likely errors.

    Researchers at MIT and elsewhere have developed a decoder chip that employs a new statistical model to use this reliability information in a way that is much simpler and faster than conventional techniques.

    Their chip uses a universal decoding algorithm the team previously developed, which can unravel any error correcting code. Typically, decoding hardware can only process one particular type of code. This new, universal decoder chip has broken the record for energy-efficient decoding, performing between 10 and 100 times better than other hardware.

    This advance could enable mobile devices with fewer chips, since they would no longer need separate hardware for multiple codes. This would reduce the amount of material needed for fabrication, cutting costs and improving sustainability. By making the decoding process less energy intensive, the chip could also improve device performance and lengthen battery life. It could be especially useful for demanding applications like augmented and virtual reality and 5G networks.

    “This is the first time anyone has broken below the 1 picojoule-per-bit barrier for decoding. That is roughly the same amount of energy you need to transmit a bit inside the system. It had been a big symbolic threshold, but it also changes the balance in the receiver of what might be the most pressing part from an energy perspective — we can move that away from the decoder to other elements,” says Muriel Médard, the School of Science NEC Professor of Software Science and Engineering, a professor in the Department of Electrical Engineering and Computer Science, and a co-author of a paper presenting the new chip.

    Médard’s co-authors include lead author Arslan Riaz, a graduate student at Boston University (BU); Rabia Tugce Yazicigil, assistant professor of electrical and computer engineering at BU; and Ken R. Duffy, then director of the Hamilton Institute at Maynooth University and now a professor at Northeastern University, as well as others from MIT, BU, and Maynooth University. The work is being presented at the International Solid-State Circuits Conference.

    Smarter sorting

    Digital data are transmitted over a network in the form of bits (0s and 1s). A sender encodes data by adding an error-correcting code, which is a redundant string of 0s and 1s that can be viewed as a hash. Information about this hash is held in a specific code book. A decoding algorithm at the receiver, designed for this particular code, uses its code book and the hash structure to retrieve the original information, which may have been jumbled by noise. Since each algorithm is code-specific, and most require dedicated hardware, a device would need many chips to decode different codes.
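
    To make the idea of a code book concrete, the sketch below uses the small Hamming(7,4) code as a stand-in. The article does not say which codes the chip targets, so this choice is an assumption made purely for illustration. Four data bits gain three redundant parity bits, and the code book is the set of all valid 7-bit code words.

```python
from itertools import product


def encode(data):
    """Append three parity bits to four data bits (systematic Hamming(7,4))."""
    d1, d2, d3, d4 = data
    return (d1, d2, d3, d4, d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d2 ^ d3 ^ d4)


# The code book: 16 valid code words out of the 128 possible 7-bit strings.
CODE_BOOK = {encode(bits) for bits in product((0, 1), repeat=4)}

sent = encode((1, 0, 1, 1))
corrupted = list(sent)
corrupted[2] ^= 1            # noise flips one bit in transit
received = tuple(corrupted)

print(sent in CODE_BOOK)      # the transmitted word is valid
print(received in CODE_BOOK)  # the corrupted word is not
```

    Because any two code words in this code differ in at least three positions, a single flipped bit always lands outside the code book, which is what lets a receiver notice the error.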

    The researchers previously demonstrated GRAND (Guessing Random Additive Noise Decoding), a universal decoding algorithm that can crack any code. GRAND works by guessing the noise that affected the transmission, subtracting that noise pattern from the received data, and then checking what remains in a code book. It guesses a series of noise patterns in the order they are likely to occur.
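
    The guessing loop can be sketched as follows. This is a minimal software illustration under stated assumptions, not the chip's design: it assumes a channel where each bit flips independently with probability below one half, so patterns with fewer flips are tried first, and it uses a Hamming(7,4) parity check as an example membership test. The decoder itself only needs some way to ask "is this a code word?", which is what makes the approach code-agnostic.

```python
from itertools import combinations

# Parity-check matrix of a systematic Hamming(7,4) code (illustrative choice).
H = [
    (1, 1, 0, 1, 1, 0, 0),
    (1, 0, 1, 1, 0, 1, 0),
    (0, 1, 1, 1, 0, 0, 1),
]


def is_codeword(word):
    """A word is in the code book when every parity check sums to zero mod 2."""
    return all(sum(h * w for h, w in zip(row, word)) % 2 == 0 for row in H)


def noise_patterns(n):
    """Yield sets of flipped positions, fewest flips (most likely) first."""
    for weight in range(n + 1):
        for positions in combinations(range(n), weight):
            yield positions


def grand_decode(received, membership=is_codeword):
    """Guess noise, remove it, and stop at the first result found in the code book."""
    for guesses, positions in enumerate(noise_patterns(len(received)), start=1):
        candidate = list(received)
        for i in positions:
            candidate[i] ^= 1  # for binary data, subtracting noise is an XOR
        if membership(candidate):
            return tuple(candidate), guesses
    raise ValueError("no code word found")


decoded, guesses = grand_decode((1, 0, 0, 1, 0, 1, 0))
print(decoded, guesses)
```

    Swapping in a different `membership` function changes the code being decoded without touching the guessing logic.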

    Data are often received with reliability information, also called soft information, that helps a decoder figure out which pieces are errors. The new decoding chip, called ORBGRAND (Ordered Reliability Bits GRAND), uses this reliability information to sort data based on how likely each bit is to be an error.

    But it isn’t as simple as ordering single bits. While the most unreliable bit might be the likeliest error, perhaps the third and fourth most unreliable bits together are as likely to be an error as the seventh-most unreliable bit. ORBGRAND uses a new statistical model that can sort bits in this fashion, considering that multiple bits together are as likely to be an error as some single bits.
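
    One way to realize such an ordering is to rank bits from least to most reliable and score each candidate pattern by the sum of the ranks it flips, sometimes called a logistic weight. The sketch below assumes that scoring rule for illustration and is not taken from the chip's design. Under it, flipping the third and fourth least reliable bits together scores 3 + 4 = 7, the same as flipping the seventh alone.

```python
def distinct_partitions(total, max_part):
    """Yield tuples of distinct integers in 1..max_part that sum to total."""
    if total == 0:
        yield ()
        return
    for part in range(min(total, max_part), 0, -1):
        for rest in distinct_partitions(total - part, part - 1):
            yield (part,) + rest


def orbgrand_patterns(reliabilities):
    """Yield (weight, ranks, bit positions) in increasing rank-sum order.

    Rank 1 is the least reliable bit. Each pattern flips the bits whose
    ranks appear in `ranks`.
    """
    n = len(reliabilities)
    order = sorted(range(n), key=lambda i: reliabilities[i])
    for weight in range(n * (n + 1) // 2 + 1):
        for ranks in distinct_partitions(weight, n):
            positions = tuple(sorted(order[r - 1] for r in ranks))
            yield weight, ranks, positions


# Made-up reliability scores for a 4-bit word (higher means more trustworthy).
reliabilities = [0.9, 0.1, 0.5, 0.3]
for weight, ranks, positions in orbgrand_patterns(reliabilities):
    print(weight, ranks, positions)
```

    The generator visits every possible flip pattern exactly once, so a decoder can consume it lazily and stop as soon as a code word turns up.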

    “If your car isn’t working, soft information might tell you that it is probably the battery. But if it isn’t the battery alone, maybe it is the battery and the alternator together that are causing the problem. This is how a rational person would troubleshoot — you’d say that it could actually be these two things together before going down the list to something that is much less likely,” Médard says.

    This is a much more efficient approach than traditional decoders, which would instead look at the code structure and have a performance that is generally designed for the worst case.

    “With a traditional decoder, you’d pull out the blueprint of the car and examine each and every piece. You’ll find the problem, but it will take you a long time and you’ll get very frustrated,” Médard explains.

    ORBGRAND stops sorting as soon as a code word is found, which is often very soon. The chip also employs parallelization, generating and testing multiple noise patterns simultaneously so it finds the code word faster. Because the decoder stops working once it finds the code word, its energy consumption stays low even though it runs multiple processes simultaneously.

    Record-breaking efficiency

    When they compared their approach to other chips, ORBGRAND decoded with maximum accuracy while consuming only 0.76 picojoules of energy per bit, breaking the previous performance record. ORBGRAND consumes between 10 and 100 times less energy than other devices.

    One of the biggest challenges of developing the new chip came from this reduced energy consumption, Médard says. With ORBGRAND, generating noise sequences is now so energy-efficient that other processes the researchers hadn’t focused on before, like checking the code word in a code book, consume most of the effort.

    “Now, this checking process, which is like turning on the car to see if it works, is the hardest part. So, we need to find more efficient ways to do that,” she says.

    The team is also exploring ways to change the modulation of transmissions so they can take advantage of the improved efficiency of the ORBGRAND chip. They also plan to see how their technique could be utilized to more efficiently manage multiple transmissions that overlap.

    The research is funded, in part, by the U.S. Defense Advanced Research Projects Agency (DARPA) and Science Foundation Ireland.

  • in

    A faster way to preserve privacy online

    Searching the internet can reveal information a user would rather keep private. For instance, when someone looks up medical symptoms online, they could reveal their health conditions to Google, an online medical database like WebMD, and perhaps hundreds of these companies’ advertisers and business partners.

    For decades, researchers have been crafting techniques that enable users to search for and retrieve information from a database privately, but these methods remain too slow to be effectively used in practice.

    MIT researchers have now developed a scheme for private information retrieval that is about 30 times faster than other comparable methods. Their technique enables a user to search an online database without revealing their query to the server. Moreover, it is driven by a simple algorithm that would be easier to implement than the more complicated approaches from previous work.

    Their technique could enable private communication by preventing a messaging app from knowing what users are saying or who they are talking to. It could also be used to fetch relevant online ads without advertising servers learning a user’s interests.

    “This work is really about giving users back some control over their own data. In the long run, we’d like browsing the web to be as private as browsing a library. This work doesn’t achieve that yet, but it starts building the tools to let us do this sort of thing quickly and efficiently in practice,” says Alexandra Henzinger, a computer science graduate student and lead author of a paper introducing the technique.

    Co-authors include Matthew Hong, an MIT computer science graduate student; Henry Corrigan-Gibbs, the Douglas Ross Career Development Professor of Software Technology in the MIT Department of Electrical Engineering and Computer Science (EECS) and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); Sarah Meiklejohn, a professor in cryptography and security at University College London and a staff research scientist at Google; and senior author Vinod Vaikuntanathan, an EECS professor and principal investigator in CSAIL. The research will be presented at the 2023 USENIX Security Symposium. 

    Preserving privacy

    The first schemes for private information retrieval were developed in the 1990s, partly by researchers at MIT. These techniques enable a user to communicate with a remote server that holds a database, and read records from that database without the server knowing what the user is reading.

    To preserve privacy, these techniques force the server to touch every single item in the database, so it can’t tell which entry a user is searching for. If one area is left untouched, the server would learn that the client is not interested in that item. But touching every item when there may be millions of database entries slows down the query process.

    To speed things up, the MIT researchers developed a protocol, known as Simple PIR, in which the server performs much of the underlying cryptographic work in advance, before a client even sends a query. This preprocessing step produces a data structure that holds compressed information about the database contents, and which the client downloads before sending a query.

    In a sense, this data structure is like a hint for the client about what is in the database.

    “Once the client has this hint, it can make an unbounded number of queries, and these queries are going to be much smaller in both the size of the messages you are sending and the work that you need the server to do. This is what makes Simple PIR so much faster,” Henzinger explains.

    But the hint can be relatively large in size. For example, to query a 1-gigabyte database, the client would need to download a 124-megabyte hint. This drives up communication costs, which could make the technique difficult to implement on real-world devices.

    To reduce the size of the hint, the researchers developed a second technique, known as Double PIR, that basically involves running the Simple PIR scheme twice. This produces a much more compact hint that is fixed in size for any database.

    Using Double PIR, the hint for a 1-gigabyte database would be only 16 megabytes.

    “Our Double PIR scheme runs a little bit slower, but it will have much lower communication costs. For some applications, this is going to be a desirable tradeoff,” Henzinger says.

    Hitting the speed limit

    They tested the Simple PIR and Double PIR schemes by applying them to a task in which a client seeks to audit a specific piece of information about a website to ensure that website is safe to visit. To preserve privacy, the client cannot reveal the website it is auditing.

    The researchers’ fastest technique was able to successfully preserve privacy while running at about 10 gigabytes per second. Previous schemes could only achieve a throughput of about 300 megabytes per second.
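
    As a rough check on these figures, the short calculation below compares the two reported throughputs and estimates how long a single pass over a 1-gigabyte database would take at each rate. It assumes decimal units (1 gigabyte = 1,000 megabytes) and ignores all other costs, so the numbers are back-of-the-envelope estimates rather than measurements; the function name is hypothetical.

```python
MB_PER_GB = 1000  # assumption: decimal units

def scan_seconds(db_gb, throughput_mb_per_s):
    """Time for one full pass over the database at a given throughput."""
    return db_gb * MB_PER_GB / throughput_mb_per_s

new_mb_s = 10 * MB_PER_GB   # about 10 gigabytes per second (reported)
old_mb_s = 300              # about 300 megabytes per second (reported)

speedup = new_mb_s / old_mb_s
print(f"speedup: about {speedup:.0f}x")
print(f"1 GB scan, new scheme: {scan_seconds(1, new_mb_s):.2f} s")
print(f"1 GB scan, previous schemes: {scan_seconds(1, old_mb_s):.2f} s")
```

    The ratio works out to roughly 33, in line with the "about 30 times faster" figure quoted earlier.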

    They show that their method approaches the theoretical speed limit for private information retrieval — it is nearly the fastest possible scheme one can build in which the server touches every record in the database, adds Corrigan-Gibbs.

    In addition, their method only requires a single server, making it much simpler than many top-performing techniques that require two separate servers with identical databases. Their method outperformed these more complex protocols.

    “I’ve been thinking about these schemes for some time, and I never thought this could be possible at this speed. The folklore was that any single-server scheme is going to be really slow. This work turns that whole notion on its head,” Corrigan-Gibbs says.

    While the researchers have shown that they can make PIR schemes much faster, there is still work to do before they would be able to deploy their techniques in real-world scenarios, says Henzinger. They would like to cut the communication costs of their schemes while still enabling them to achieve high speeds. In addition, they want to adapt their techniques to handle more complex queries, such as general SQL queries, and more demanding applications, such as a general Wikipedia search. And in the long run, they hope to develop better techniques that can preserve privacy without requiring a server to touch every database item. 

    “I’ve heard people emphatically claiming that PIR will never be practical. But I would never bet against technology. That is an optimistic lesson to learn from this work. There are always ways to innovate,” Vaikuntanathan says.

    “This work makes a major improvement to the practical cost of private information retrieval. While it was known that low-bandwidth PIR schemes imply public-key cryptography, which is typically orders of magnitude slower than private-key cryptography, this work develops an ingenious method to bridge the gap. This is done by making a clever use of special properties of a public-key encryption scheme due to Regev to push the vast majority of the computational work to a precomputation step, in which the server computes a short ‘hint’ about the database,” says Yuval Ishai, a professor of computer science at Technion (the Israel Institute of Technology), who was not involved in the study. “What makes their approach particularly appealing is that the same hint can be used an unlimited number of times, by any number of clients. This renders the (moderate) cost of computing the hint insignificant in a typical scenario where the same database is accessed many times.”

    This work is funded, in part, by the National Science Foundation, Google, Facebook, MIT’s Fintech@CSAIL Initiative, an NSF Graduate Research Fellowship, an EECS Great Educators Fellowship, the National Institutes of Health, the Defense Advanced Research Projects Agency, the MIT-IBM Watson AI Lab, Analog Devices, Microsoft, and a Thornton Family Faculty Research Innovation Fellowship.

  • in

    MIT Policy Hackathon produces new solutions for technology policy challenges

    Almost three years ago, the Covid-19 pandemic changed the world. Many are still looking to uncover a “new normal.”

    “Instead of going back to normal, [there’s a new generation that] wants to build back something different, something better,” says Jorge Sandoval, a second-year graduate student in MIT’s Technology and Policy Program (TPP) at the Institute for Data, Systems and Society (IDSS). “How do we communicate this mindset to others, that the world cannot be the same as before?”

    This was the inspiration behind “A New (Re)generation,” this year’s theme for the IDSS-student-run MIT Policy Hackathon, which Sandoval helped to organize as the event chair. The Policy Hackathon is a weekend-long, interdisciplinary competition that brings together participants from around the globe to explore potential solutions to some of society’s greatest challenges. 

    Unlike other competitions of its kind, Sandoval says MIT’s event emphasizes a humanistic approach. “The idea of our hackathon is to promote applications of technology that are humanistic or human-centered,” he says. “We take the opportunity to examine aspects of technology in the spaces where they tend to interact with society and people, an opportunity most technical competitions don’t offer because their primary focus is on the technology.”

    The competition started with 50 teams spread across four challenge categories. This year’s categories included Internet and Cybersecurity, Environmental Justice, Logistics, and Housing and City Planning. While some people come into the challenge with friends, Sandoval says most teams form organically during an online networking meeting hosted by MIT.

    “We encourage people to pair up with others outside of their country and to form teams of different diverse backgrounds and ages,” Sandoval says. “We try to give people who are often not invited to the decision-making table the opportunity to be a policymaker, bringing in those with backgrounds in not only law, policy, or politics, but also medicine, and people who have careers in engineering or experience working in nonprofits.”

    Once an in-person event, the Policy Hackathon has gone through its own regeneration process these past three years, according to Sandoval. After going entirely online during the pandemic’s height, last year they successfully hosted the first hybrid version of the event, which served as their model again this year.

    “The hybrid version of the event gives us the opportunity to allow people to connect in a way that is lost if it is only online, while also keeping the wide range of accessibility, allowing people to join from anywhere in the world, regardless of nationality or income, to provide their input,” Sandoval says.

    For Swetha Tadisina, an undergraduate computer science major at Lafayette College and participant in the internet and cybersecurity category, the hackathon was a unique opportunity to meet and work with people much more advanced in their careers. “I was surprised how such a diverse team that had never met before was able to work so efficiently and creatively,” Tadisina says.

    Erika Spangler, a public high school teacher from Massachusetts and member of the environmental justice category’s winning team, says that while each member of “Team Slime Mold” came to the table with a different set of skills, they managed to be in sync from the start — even working across the nine-and-a-half-hour time difference the four-person team faced when working with policy advocate Shruti Nandy from Calcutta, India.

    “We divided the project into data, policy, and research and trusted each other’s expertise,” Spangler says. “Despite having separate areas of focus, we made sure to have regular check-ins to problem-solve and cross-pollinate ideas.”

    During the 48-hour period, her team proposed the creation of an algorithm to identify high-quality brownfields that could be cleaned up and used as sites for building renewable energy. Their corresponding policy sought to mandate additional requirements for renewable energy businesses seeking tax credits from the Inflation Reduction Act.

    “Their policy memo had the most in-depth technical assessment, including deep dives in a few key cities to show the impact of their proposed approach for site selection at a very granular level,” says Amanda Levin, director of policy analysis for the Natural Resources Defense Council (NRDC). Levin acted as both a judge and challenge provider for the environmental justice category.

    “They also presented their policy recommendations in the memo in a well-thought-out way, clearly noting the relevant actor,” she adds. “This clarity around what can be done, and who would be responsible for those actions, is highly valuable for those in policy.”

    Levin says the NRDC, one of the largest environmental nonprofits in the United States, provided five “challenge questions,” making it clear that teams did not need to address all of them. She notes that this gave teams significant leeway, bringing a wide variety of recommendations to the table. 

    “As a challenge partner, the work put together by all the teams is already being used to help inform discussions about the implementation of the Inflation Reduction Act,” Levin says. “Being able to tap into the collective intelligence of the hackathon helped uncover new perspectives and policy solutions that can help make an impact in addressing the important policy challenges we face today.”

    While having partners with experience in data science and policy definitely helped, fellow Team Slime Mold member Sara Sheffels, a PhD candidate in MIT’s biomaterials program, says she was surprised how much her experiences outside of science and policy were relevant to the challenge: “My experience organizing MIT’s Graduate Student Union shaped my ideas about more meaningful community involvement in renewables projects on brownfields. It is not meaningful to merely educate people about the importance of renewables or ask them to sign off on a pre-planned project without addressing their other needs.”

    “I wanted to test my limits, gain exposure, and expand my world,” Tadisina adds. “The exposure, friendships, and experiences you gain in such a short period of time are incredible.”

    For Willy R. Vasquez, an electrical and computer engineering PhD student at the University of Texas, the hackathon is not to be missed. “If you’re interested in the intersection of tech, society, and policy, then this is a must-do experience.”

  • in

    Researchers discover major roadblock in alleviating network congestion

    When users want to send data over the internet faster than the network can handle, congestion can occur — the same way traffic congestion snarls the morning commute into a big city.

    Computers and devices that transmit data over the internet break the data down into smaller packets and use a special algorithm to decide how fast to send those packets. These congestion control algorithms seek to fully discover and utilize available network capacity while sharing it fairly with other users on the same network. They also try to minimize delay caused by data waiting in queues in the network.

    Over the past decade, researchers in industry and academia have developed several algorithms that attempt to achieve high rates while controlling delays. Some of these, such as the BBR algorithm developed by Google, are now widely used by many websites and applications.

    But a team of MIT researchers has discovered that these algorithms can be deeply unfair. In a new study, they show there will always be a network scenario where at least one sender receives almost no bandwidth compared to other senders; that is, a problem known as “starvation” cannot be avoided.

    “What is really surprising about this paper and the results is that when you take into account the real-world complexity of network paths and all the things they can do to data packets, it is basically impossible for delay-controlling congestion control algorithms to avoid starvation using current methods,” says Mohammad Alizadeh, associate professor of electrical engineering and computer science (EECS).

    While Alizadeh and his co-authors weren’t able to find a traditional congestion control algorithm that could avoid starvation, there may be algorithms in a different class that could prevent this problem. Their analysis also suggests that changing how these algorithms work, so that they allow for larger variations in delay, could help prevent starvation in some network situations.

    Alizadeh wrote the paper with first author and EECS graduate student Venkat Arun and senior author Hari Balakrishnan, the Fujitsu Professor of Computer Science and Artificial Intelligence. The research will be presented at the ACM Special Interest Group on Data Communications (SIGCOMM) conference.

    Controlling congestion

    Congestion control is a fundamental problem in networking that researchers have been trying to tackle since the 1980s.

    A user’s computer does not know how fast to send data packets over the network because it lacks information, such as the quality of the network connection or how many other senders are using the network. Sending packets too slowly makes poor use of the available bandwidth. But sending them too quickly can overwhelm the network, and in doing so, packets will start to get dropped. These packets must be resent, which leads to longer delays. Delays can also be caused by packets waiting in queues for a long time.

    Congestion control algorithms use packet losses and delays as signals to infer congestion and decide how fast to send data. But the internet is complicated, and packets can be delayed and lost for reasons unrelated to network congestion. For instance, data could be held up in a queue along the way and then released with a burst of other packets, or the receiver’s acknowledgement might be delayed. The authors call delays that are not caused by congestion “jitter.”

    Even if a congestion control algorithm measures delay perfectly, it can’t tell the difference between delay caused by congestion and delay caused by jitter. Delay caused by jitter is unpredictable and confuses the sender. Because of this ambiguity, users start estimating delay differently, which causes them to send packets at unequal rates. Eventually, this leads to a situation where starvation occurs and someone gets shut out completely, Arun explains.
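
    The effect Arun describes can be illustrated with a deliberately simple toy model. This is not the researchers’ mathematical model or their proof: the rate rule (each sender settles at `alpha / measured_delay`), the link capacity, and the fixed per-sender delay offsets standing in for jitter are all assumptions chosen for illustration. Two senders share one link; each sees the true queueing delay plus its own non-congestive offset and cannot tell the two apart.

```python
def equilibrium(capacity, alpha, offsets):
    """Find the queueing delay at which the senders' rates sum to capacity.

    Each sender i settles at rate alpha / (d + offsets[i]), where d is the
    true queueing delay and offsets[i] is extra delay it cannot distinguish
    from congestion. Total rate falls as d grows, so bisection finds d.
    """
    def excess(d):
        return sum(alpha / (d + j) for j in offsets) - capacity

    lo, hi = 1e-9, 1e6
    for _ in range(200):
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            lo = mid   # senders still offer more than the link carries
        else:
            hi = mid
    d = (lo + hi) / 2
    return d, [alpha / (d + j) for j in offsets]

CAPACITY = 10.0  # packets per millisecond (hypothetical)
ALPHA = 4.0      # packets each sender aims to keep queued (hypothetical)

# Both senders measure delay identically: equal shares.
d_fair, fair_rates = equilibrium(CAPACITY, ALPHA, [0.0, 0.0])

# Sender 2 sees 10 ms of extra, non-congestive delay on every measurement.
d_skew, skew_rates = equilibrium(CAPACITY, ALPHA, [0.0, 10.0])

print("equal measurements:", fair_rates)
print("one sender misled by jitter:", skew_rates)
```

    In this toy, the sender whose measurements include the extra delay concludes the network is far more congested than it is and settles at a small fraction of the other sender’s rate, even though both run the same rule.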

    “We started the project because we lacked a theoretical understanding of congestion control behavior in the presence of jitter. To place it on a firmer theoretical footing, we built a mathematical model that was simple enough to think about, yet able to capture some of the complexities of the internet. It has been very rewarding to have math tell us things we didn’t know and that have practical relevance,” he says.

    Studying starvation

    The researchers fed their mathematical model to a computer, gave it a series of commonly used congestion control algorithms, and asked the computer to find an algorithm that could avoid starvation, using their model.

    “We couldn’t do it. We tried every algorithm that we are aware of, and some new ones we made up. Nothing worked. The computer always found a situation where some people get all the bandwidth and at least one person gets basically nothing,” Arun says.

    The researchers were surprised by this result, especially since these algorithms are widely believed to be reasonably fair. They started suspecting that it may not be possible to avoid starvation, an extreme form of unfairness. This motivated them to define a class of algorithms they call “delay-convergent algorithms” that they proved will always suffer from starvation under their network model. All existing congestion control algorithms that control delay (that the researchers are aware of) are delay-convergent.

    The fact that such simple failure modes of these widely used algorithms remained unknown for so long illustrates how difficult it is to understand algorithms through empirical testing alone, Arun adds. It underscores the importance of a solid theoretical foundation.

    But all hope is not lost. While all the algorithms they tested failed, there may be other algorithms that are not delay-convergent and might be able to avoid starvation. This suggests that one way to fix the problem might be to design congestion control algorithms that vary the delay range more widely, so the range is larger than any delay that might occur due to jitter in the network.

    “To control delays, algorithms have tried to also bound the variations in delay about a desired equilibrium, but there is nothing wrong in potentially creating greater delay variation to get better measurements of congestive delays. It is just a new design philosophy you would have to adopt,” Balakrishnan adds.

    Now, the researchers want to keep pushing to see if they can find or build an algorithm that will eliminate starvation. They also want to apply this approach of mathematical modeling and computational proofs to other thorny, unsolved problems in networked systems.

    “We are increasingly reliant on computer systems for very critical things, and we need to put their reliability on a firmer conceptual footing. We’ve shown the surprising things you can discover when you put in the time to come up with these formal specifications of what the problem actually is,” says Alizadeh.

    The NASA University Leadership Initiative (grant #80NSSC20M0163) provided funds to assist the authors with their research, but the research paper solely reflects the opinions and conclusions of its authors and not any NASA entity. This work was also partially funded by the National Science Foundation, award number 1751009.

  • in

    Data flow’s decisive role on the global stage

    In 2016, Meicen Sun came to a profound realization: “The control of digital information will lie at the heart of all the big questions and big contentions in politics.” A graduate student in her final year of study who is specializing in international security and the political economy of technology, Sun vividly recalls the emergence of the internet “as a democratizing force, an opener, an equalizer,” helping give rise to the Arab Spring. But she was also profoundly struck when nations in the Middle East and elsewhere curbed internet access to throttle citizens’ efforts to speak and mobilize freely.

    During her undergraduate and graduate studies, which came to focus on China and its expanding global role, Sun became convinced that digital constraints initially intended to prevent the free flow of ideas were also having enormous and growing economic impacts.

    “With an exceptionally high mobile internet adoption rate and the explosion of indigenous digital apps, China’s digital economy was surging, helping to drive the nation’s broader economic growth and international competitiveness,” Sun says. “Yet at the same time, the country maintained the most tightly controlled internet ecosystem in the world.”

    Sun set out to explore this apparent paradox in her dissertation. Her research to date has yielded both novel findings and troubling questions.  

    “Through its control of the internet, China has in effect provided protectionist benefits to its own data-intensive domestic sectors,” she says. “If there is a benefit to imposing internet control, given the absence of effective international regulations, does this give authoritarian states an advantage in trade and national competitiveness?” Following this thread, Sun asks, “What might this mean for the future of democracy as the world grows increasingly dependent on digital technology?”

    Protect or innovate

    Early in her graduate program, classes in capitalism and technology and public policy, says Sun, “cemented for me the idea of data as a factor of production, and the importance of cross-border information flow in making a country innovative.” This central premise serves as a springboard for Sun’s doctoral studies.

    In a series of interconnected research papers using China as her primary case, she is examining the double-edged nature of internet limits. “They accord protectionist benefits to domestic data-internet-intensive sectors, on the one hand, but on the other, act as a potential longer-term deterrent to the country’s capacity to innovate.”

    To pursue her doctoral project, advised by professor of political science Kenneth Oye, Sun is extracting data from a multitude of sources, including a website that has been routinely testing web domain accessibility from within China since 2011. This allows her to pin down when and to what degree internet control occurs. She can then compare this information to publicly available records on the expansion or contraction of data-intensive industrial sectors, enabling her to correlate internet control to a sector’s performance.

    Sun has also compiled datasets for firm-level revenue, scientific citations, and patents that permit her to measure aspects of China’s innovation culture. In analyzing her data she leverages both quantitative and qualitative methods, including one co-developed by her dissertation co-advisor, associate professor of political science In Song Kim. Her initial analysis suggests internet control prevents scholars from accessing knowledge available on foreign websites, and that if sustained, such control could take a toll on the Chinese economy over time.

    Of particular concern is the possibility that the economic success that flows from strict internet controls, as exemplified by the Chinese model, may encourage the rise of similar practices among emerging states or those in political flux.

    “The grim implication of my research is that without international regulation on information flow restrictions, democracies will be at a disadvantage against autocracies,” she says. “No matter how short-term or narrow these curbs are, they confer concrete benefits on certain economic sectors.”

    Data, politics, and economy

    Sun got a quick start as a student of China and its role in the world. She was born in Xiamen, a coastal Chinese city across from Taiwan, to academic parents who cultivated her interest in international politics. “My dad would constantly talk to me about global affairs, and he was passionate about foreign policy,” says Sun.

    Eager for education and a broader view of the world, Sun took a scholarship at 15 to attend school in Singapore. “While this experience exposed me to a variety of new ideas and social customs, I felt the itch to travel even farther away, and to meet people with different backgrounds and viewpoints from mine,” she says.

    Sun attended Princeton University where, after two years sticking to her “comfort zone” — writing and directing plays and composing music for them — she underwent a process of intellectual transition. Political science classes opened a window onto a larger landscape to which she had long been connected: China’s behavior as a rising power and the shifting global landscape.

    She completed her undergraduate degree in politics, and followed up with a master’s degree in international relations at the University of Pennsylvania, where she focused on China-U.S. relations and China’s participation in international institutions. She was on the path to completing a PhD at Penn when, Sun says, “I became confident in my perception that digital technology, and especially information sharing, were becoming critically important factors in international politics, and I felt a strong desire to devote my graduate studies, and even my career, to studying these topics.”

    Certain that the questions she hoped to pursue could best be addressed through an interdisciplinary approach with those working on similar issues, Sun began her doctoral program anew at MIT.

    “Doer mindset”

    Sun is hopeful that her doctoral research will prove useful to governments, policymakers, and business leaders. “There are a lot of developing states actively shopping between data governance and development models for their own countries,” she says. “My findings around the pros and cons of information flow restrictions should be of interest to leaders in these places, and to trade negotiators and others dealing with the global governance of data and what a fair playing field for digital trade would be.”

    Sun has engaged directly with policy and industry experts through her fellowships with the World Economic Forum and the Pacific Forum. And she has embraced questions that touch on policy outside of her immediate research: Sun is collaborating with her dissertation co-advisor, MIT Sloan Professor Yasheng Huang, on a study of the political economy of artificial intelligence in China for the MIT Task Force on the Work of the Future.

    This year, as she writes her dissertation papers, Sun will be based at Georgetown University, where she has a Mortara Center Global Political Economy Project Predoctoral Fellowship. In Washington, she will continue her journey to becoming a “policy-minded scholar, a thinker with a doer mindset, whose findings have bearing on things that happen in the world.”