More stories

  • in

    A universal system for decoding any type of data sent across a network

    Every piece of data that travels over the internet — from paragraphs in an email to 3D graphics in a virtual reality environment — can be altered by the noise it encounters along the way, such as electromagnetic interference from a microwave or Bluetooth device. The data are coded so that when they arrive at their destination, a decoding algorithm can undo the negative effects of that noise and retrieve the original data.

    Since the 1950s, most error-correcting codes and decoding algorithms have been designed together. Each code had a structure that corresponded with a particular, highly complex decoding algorithm, which often required the use of dedicated hardware.

    Researchers at MIT, Boston University, and Maynooth University in Ireland have now created the first silicon chip that is able to decode any code, regardless of its structure, with maximum accuracy, using a universal decoding algorithm called Guessing Random Additive Noise Decoding (GRAND). By eliminating the need for multiple, computationally complex decoders, GRAND enables increased efficiency that could have applications in augmented and virtual reality, gaming, 5G networks, and connected devices that rely on processing a high volume of data with minimal delay.

    The research at MIT is led by Muriel Médard, the Cecil H. and Ida Green Professor in the Department of Electrical Engineering and Computer Science, and was co-authored by Amit Solomon and Wei Ann, both graduate students at MIT; Rabia Tugce Yazicigil, assistant professor of electrical and computer engineering at Boston University; Arslan Riaz and Vaibhav Bansal, both graduate students at Boston University; Ken R. Duffy, director of the Hamilton Institute at the National University of Ireland at Maynooth; and Kevin Galligan, a Maynooth graduate student. The research will be presented at the European Solid-States Device Research and Circuits Conference next week.

    Focus on noise

    One way to think of these codes is as redundant hashes (in this case, a series of 1s and 0s) added to the end of the original data. The rules for the creation of that hash are stored in a specific codebook.

    As the encoded data travel over a network, they are affected by noise, or energy that disrupts the signal, which is often generated by other electronic devices. When that coded data and the noise that affected them arrive at their destination, the decoding algorithm consults its codebook and uses the structure of the hash to guess what the stored information is.

    Instead, GRAND works by guessing the noise that affected the message, and uses the noise pattern to deduce the original information. GRAND generates a series of noise sequences in the order they are likely to occur, subtracts them from the received data, and checks to see if the resulting codeword is in a codebook.

    While the noise appears random in nature, it has a probabilistic structure that allows the algorithm to guess what it might be.

    “In a way, it is similar to troubleshooting. If someone brings their car into the shop, the mechanic doesn’t start by mapping the entire car to blueprints. Instead, they start by asking, ‘What is the most likely thing to go wrong?’ Maybe it just needs gas. If that doesn’t work, what’s next? Maybe the battery is dead?” Médard says.

    Novel hardware

    The GRAND chip uses a three-tiered structure, starting with the simplest possible solutions in the first stage and working up to longer and more complex noise patterns in the two subsequent stages. Each stage operates independently, which increases the throughput of the system and saves power.

    The device is also designed to switch seamlessly between two codebooks. It contains two static random-access memory chips, one that can crack codewords, while the other loads a new codebook and then switches to decoding without any downtime.

    The researchers tested the GRAND chip and found it could effectively decode any moderate redundancy code up to 128 bits in length, with only about a microsecond of latency.

    Médard and her collaborators had previously demonstrated the success of the algorithm, but this new work showcases the effectiveness and efficiency of GRAND in hardware for the first time.

    Developing hardware for the novel decoding algorithm required the researchers to first toss aside their preconceived notions, Médard says.

    “We couldn’t go out and reuse things that had already been done. This was like a complete whiteboard. We had to really think about every single component from scratch. It was a journey of reconsideration. And I think when we do our next chip, there will be things with this first chip that we’ll realize we did out of habit or assumption that we can do better,” she says.

    A chip for the future

    Since GRAND only uses codebooks for verification, the chip not only works with legacy codes but could also be used with codes that haven’t even been introduced yet.

    In the lead-up to 5G implementation, regulators and communications companies struggled to find consensus as to which codes should be used in the new network. Regulators ultimately chose to use two types of traditional codes for 5G infrastructure in different situations. Using GRAND could eliminate the need for that rigid standardization in the future, Médard says.

    The GRAND chip could even open the field of coding to a wave of innovation.

    “For reasons I’m not quite sure of, people approach coding with awe, like it is black magic. The process is mathematically nasty, so people just use codes that already exist. I’m hoping this will recast the discussion so it is not so standards-oriented, enabling people to use codes that already exist and create new codes,” she says.

    Moving forward, Médard and her collaborators plan to tackle the problem of soft detection with a retooled version of the GRAND chip. In soft detection, the received data are less precise.

    They also plan to test the ability of GRAND to crack longer, more complex codes and adjust the structure of the silicon chip to improve its energy efficiency.

    The research was funded by the Battelle Memorial Institute and Science Foundation of Ireland. More

  • in

    Lincoln Laboratory convenes top network scientists for Graph Exploitation Symposium

    As the Covid-19 pandemic has shown, we live in a richly connected world, facilitating not only the efficient spread of a virus but also of information and influence. What can we learn by analyzing these connections? This is a core question of network science, a field of research that models interactions across physical, biological, social, and information systems to solve problems.

    The 2021 Graph Exploitation Symposium (GraphEx), hosted by MIT Lincoln Laboratory, brought together top network science researchers to share the latest advances and applications in the field.

    “We explore and identify how exploitation of graph data can offer key technology enablers to solve the most pressing problems our nation faces today,” says Edward Kao, a symposium organizer and technical staff in Lincoln Laboratory’s AI Software Architectures and Algorithms Group.

    The themes of the virtual event revolved around some of the year’s most relevant issues, such as analyzing disinformation on social media, modeling the pandemic’s spread, and using graph-based machine learning models to speed drug design.

    “The special sessions on influence operations and Covid-19 at GraphEx reflect the relevance of network and graph-based analysis for understanding the phenomenology of these complicated and impactful aspects of modern-day life, and also may suggest paths forward as we learn more and more about graph manipulation,” says William Streilein, who co-chaired the event with Rajmonda Caceres, both of Lincoln Laboratory.

    Social networks

    Several presentations at the symposium focused on the role of network science in analyzing influence operations (IO), or organized attempts by state and/or non-state actors to spread disinformation narratives.  

    Lincoln Laboratory researchers have been developing tools to classify and quantify the influence of social media accounts that are likely IO accounts, such as those willfully spreading false Covid-19 treatments to vulnerable populations.

    “A cluster of IO accounts acts as an echo chamber to amplify the narrative. The vulnerable population is then engaging in these narratives,” says Erika Mackin, a researcher developing the tool, called RIO or Reconnaissance of Influence Operations.

    To classify IO accounts, Mackin and her team trained an algorithm to detect probable IO accounts in Twitter networks based on a specific hashtag or narrative. One example they studied was #MacronLeaks, a disinformation campaign targeting Emmanuel Macron during the 2017 French presidential election. The algorithm is trained to label accounts within this network as being IO on the basis of several factors, such as the number of interactions with foreign news accounts, the number of links tweeted, or number of languages used. Their model then uses a statistical approach to score an account’s level of influence in spreading the narrative within that network.

    The team has found that their classifier outperforms existing detectors of IO accounts, because it can identify both bot accounts and human-operated ones. They’ve also discovered that IO accounts that pushed the 2017 French election disinformation narrative largely overlap with accounts influentially spreading Covid-19 pandemic disinformation today. “This suggests that these accounts will continue to transition to disinformation narratives,” Mackin says.

    Pandemic modeling

    Throughout the Covid-19 pandemic, leaders have been looking to epidemiological models, which predict how disease will spread, to make sound decisions. Alessandro Vespignani, director of the Network Science Institute at Northeastern University, has been leading Covid-19 modeling efforts in the United States, and shared a keynote on this work at the symposium.

    Besides taking into account the biological facts of the disease, such as its incubation period, Vespignani’s model is especially powerful in its inclusion of community behavior. To run realistic simulations of disease spread, he develops “synthetic populations” that are built by using publicly available, highly detailed datasets about U.S. households. “We create a population that is not real, but is statistically real, and generate a map of the interactions of those individuals,” he says. This information feeds back into the model to predict the spread of the disease. 

    Today, Vespignani is considering how to integrate genomic analysis of the virus into this kind of population modeling in order to understand how variants are spreading. “It’s still a work in progress that is extremely interesting,” he says, adding that this approach has been useful in modeling the dispersal of the Delta variant of SARS-CoV-2. 

    As researchers model the virus’ spread, Lucas Laird at Lincoln Laboratory is considering how network science can be used to design effective control strategies. He and his team are developing a model for customizing strategies for different geographic regions. The effort was spurred by the differences in Covid-19 spread across U.S. communities, and what the researchers found to be a gap in intervention modeling to address those differences.

    As examples, they applied their planning algorithm to three counties in Florida, Massachusetts, and California. Taking into account the characteristics of a specific geographic center, such as the number of susceptible individuals and number of infections there, their planner institutes different strategies in those communities throughout the outbreak duration.

    “Our approach eradicates disease in 100 days, but it also is able to do it with much more targeted interventions than any of the global interventions. In other words, you don’t have to shut down a full country.” Laird adds that their planner offers a “sandbox environment” for exploring intervention strategies in the future.

    Machine learning with graphs

    Graph-based machine learning is receiving increasing attention for its potential to “learn” the complex relationships between graphical data, and thus extract new insights or predictions about these relationships. This interest has given rise to a new class of algorithms called graph neural networks. Today, graph neural networks are being applied in areas such as drug discovery and material design, with promising results.

    “We can now apply deep learning much more broadly, not only to medical images and biological sequences. This creates new opportunities in data-rich biology and medicine,” says Marinka Zitnik, an assistant professor at Harvard University who presented her research at GraphEx.

    Zitnik’s research focuses on the rich networks of interactions between proteins, drugs, disease, and patients, at the scale of billions of interactions. One application of this research is discovering drugs to treat diseases with no or few approved drug treatments, such as for Covid-19. In April, Zitnik’s team published a paper on their research that used graph neural networks to rank 6,340 drugs for their expected efficacy against SARS-CoV-2, identifying four that could be repurposed to treat Covid-19.

    At Lincoln Laboratory, researchers are similarly applying graph neural networks to the challenge of designing advanced materials, such as those that can withstand extreme radiation or capture carbon dioxide. Like the process of designing drugs, the trial-and-error approach to materials design is time-consuming and costly. The laboratory’s team is developing graph neural networks that can learn relationships between a material’s crystalline structure and its properties. This network can then be used to predict a variety of properties from any new crystal structure, greatly speeding up the process of screening materials with desired properties for specific applications.

    “Graph representation learning has emerged as a rich and thriving research area for incorporating inductive bias and structured priors during the machine learning process, with broad applications such as drug design, accelerated scientific discovery, and personalized recommendation systems,” Caceres says. 

    A vibrant community

    Lincoln Laboratory has hosted the GraphEx Symposium annually since 2010, with the exception of last year’s cancellation due to Covid-19. “One key takeaway is that despite the postponement from last year and the need to be virtual, the GraphEx community is as vibrant and active as it’s ever been,” Streilein says. “Network-based analysis continues to expand its reach and is applied to ever-more important areas of science, society, and defense with increasing impact.”

    In addition to those from Lincoln Laboratory, technical committee members and co-chairs of the GraphEx Symposium included researchers from Harvard University, Arizona State University, Stanford University, Smith College, Duke University, the U.S. Department of Defense, and Sandia National Laboratories. More