Markus Andrews, Author at technology-news.space - All about the world of technology!

Could all your digital photos be stored as DNA?
10 June 2021, 15:00
On Earth right now, there are about 10 trillion gigabytes of digital data, and every day, humans produce emails, photos, tweets, and other digital files that add up to another 2.5 million gigabytes of data. Much of this data is stored in enormous facilities known as exabyte data centers (an exabyte is 1 billion gigabytes), which can be the size of several football fields and cost around $1 billion to build and maintain.
Many scientists believe that an alternative solution lies in the molecule that contains our genetic information: DNA, which evolved to store massive quantities of information at very high density. A coffee mug full of DNA could theoretically store all of the world’s data, says Mark Bathe, an MIT professor of biological engineering.
“We need new solutions for storing these massive amounts of data that the world is accumulating, especially the archival data,” says Bathe, who is also an associate member of the Broad Institute of MIT and Harvard. “DNA is a thousandfold denser than even flash memory, and another property that’s interesting is that once you make the DNA polymer, it doesn’t consume any energy. You can write the DNA and then store it forever.”
Scientists have already demonstrated that they can encode images and pages of text as DNA. However, an easy way to pick out the desired file from a mixture of many pieces of DNA will also be needed. Bathe and his colleagues have now demonstrated one way to do that, by encapsulating each data file into a 6-micrometer particle of silica, which is labeled with short DNA sequences that reveal the contents.
Using this approach, the researchers demonstrated that they could accurately pull out individual images stored as DNA sequences from a set of 20 images. Given the number of possible labels that could be used, this approach could scale up to 10^20 files.
Bathe is the senior author of the study, which appears today in Nature Materials. The lead authors of the paper are MIT senior postdoc James Banal, former MIT research associate Tyson Shepherd, and MIT graduate student Joseph Berleant.
Stable storage
Digital storage systems encode text, photos, or any other kind of information as a series of 0s and 1s. This same information can be encoded in DNA using the four nucleotides that make up the genetic code: A, T, G, and C. For example, G and C could be used to represent 0 while A and T represent 1.
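As a rough illustration of the mapping just described (G or C for 0, A or T for 1), the following Python sketch round-trips a few bytes through a DNA string. The function names are hypothetical, and alternating between the two candidate bases for each bit is an assumption made here for determinism; it is not the encoding used in the study.

```python
from itertools import cycle


def encode_bits_to_dna(data: bytes) -> str:
    """Map each bit to a base: 0 -> G or C, 1 -> A or T (alternating, by assumption)."""
    zeros = cycle("GC")  # bases that stand for a 0 bit
    ones = cycle("AT")   # bases that stand for a 1 bit
    bases = []
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            bit = (byte >> shift) & 1
            bases.append(next(ones) if bit else next(zeros))
    return "".join(bases)


def decode_dna_to_bits(strand: str) -> bytes:
    """Invert the mapping: G/C -> 0, A/T -> 1, eight bases per byte."""
    bits = []
    for base in strand:
        if base not in "ACGT":
            raise ValueError(f"unexpected base: {base!r}")
        bits.append(0 if base in "GC" else 1)
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


strand = encode_bits_to_dna(b"cat")
print(strand, decode_dna_to_bits(strand))
```

This scheme stores one bit per nucleotide. A denser code could assign two bits to each of the four bases, at the cost of less freedom to avoid awkward sequences.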
DNA has several other features that make it desirable as a storage medium: It is extremely stable, and it is fairly easy (but expensive) to synthesize and sequence. Also, because of its high density — each nucleotide, equivalent to up to two bits, is about 1 cubic nanometer — an exabyte of data stored as DNA could fit in the palm of your hand.
One obstacle to this kind of data storage is the cost of synthesizing such large amounts of DNA. Currently it would cost $1 trillion to write one petabyte of data (1 million gigabytes). To become competitive with magnetic tape, which is often used to store archival data, Bathe estimates that the cost of DNA synthesis would need to drop by about six orders of magnitude. Bathe says he anticipates that will happen within a decade or two, similar to how the cost of storing information on flash drives has dropped dramatically over the past couple of decades.
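The arithmetic behind those figures can be checked in a few lines. This sketch uses only the numbers quoted above ($1 trillion per petabyte, 1 million gigabytes per petabyte, a drop of six orders of magnitude); the variable names are illustrative.

```python
GIGABYTES_PER_PETABYTE = 1_000_000
COST_PER_PETABYTE_USD = 1e12   # $1 trillion, as quoted above
ORDERS_OF_MAGNITUDE_DROP = 6   # Bathe's estimate of the required reduction

# Current synthesis cost per gigabyte implied by the quoted figure
cost_per_gb_now = COST_PER_PETABYTE_USD / GIGABYTES_PER_PETABYTE

# Cost per gigabyte after the estimated six-orders-of-magnitude drop
cost_per_gb_target = cost_per_gb_now / 10 ** ORDERS_OF_MAGNITUDE_DROP

print(f"now: ${cost_per_gb_now:,.0f} per GB, target: ${cost_per_gb_target:,.2f} per GB")
```

In other words, the quoted figures imply roughly $1 million per gigabyte today and about $1 per gigabyte after the projected drop.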
Aside from the cost, the other major bottleneck in using DNA to store data is the difficulty in picking out the file you want from all the others.
“Assuming that the technologies for writing DNA get to a point where it’s cost-effective to write an exabyte or zettabyte of data in DNA, then what? You’re going to have a pile of DNA, which is a gazillion files, images or movies and other stuff, and you need to find the one picture or movie you’re looking for,” Bathe says. “It’s like trying to find a needle in a haystack.”
Currently, DNA files are conventionally retrieved using PCR (polymerase chain reaction). Each DNA data file includes a sequence that binds to a particular PCR primer. To pull out a specific file, that primer is added to the sample to find and amplify the desired sequence. However, one drawback to this approach is that there can be crosstalk between the primer and off-target DNA sequences, leading unwanted files to be pulled out. Also, the PCR retrieval process requires enzymes and ends up consuming most of the DNA that was in the pool.
“You’re kind of burning the haystack to find the needle, because all the other DNA is not getting amplified and you’re basically throwing it away,” Bathe says.
File retrieval
As an alternative approach, the MIT team developed a new retrieval technique that involves encapsulating each DNA file into a small silica particle. Each capsule is labeled with single-stranded DNA “barcodes” that correspond to the contents of the file. To demonstrate this approach in a cost-effective manner, the researchers encoded 20 different images into pieces of DNA about 3,000 nucleotides long, which is equivalent to about 100 bytes. (They also showed that the capsules could fit DNA files up to a gigabyte in size.)
Each file was labeled with barcodes corresponding to labels such as “cat” or “airplane.” When the researchers want to pull out a specific image, they remove a sample of the DNA and add primers that correspond to the labels they’re looking for — for example, “cat,” “orange,” and “wild” for an image of a tiger, or “cat,” “orange,” and “domestic” for a housecat.
The primers are labeled with fluorescent or magnetic particles, making it easy to pull out and identify any matches from the sample. This allows the desired file to be removed while leaving the rest of the DNA intact to be put back into storage. Their retrieval process allows Boolean logic statements such as “president AND 18th century” to generate George Washington as a result, similar to what is retrieved with a Google image search.
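The retrieval logic can be modeled in software as a set query. The sketch below is only an analogy for the physical process: each capsule carries a set of barcode labels, and an AND query returns the capsules whose labels include every requested term. The `Capsule` class, the `select_all` function, and the file names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capsule:
    file_name: str        # hypothetical name for the encapsulated file
    barcodes: frozenset   # labels attached to the capsule surface


def select_all(capsules, labels):
    """Return the files whose barcodes include every requested label (Boolean AND)."""
    wanted = set(labels)
    return [c.file_name for c in capsules if wanted <= c.barcodes]


pool = [
    Capsule("tiger.jpg", frozenset({"cat", "orange", "wild"})),
    Capsule("housecat.jpg", frozenset({"cat", "orange", "domestic"})),
    Capsule("airplane.jpg", frozenset({"airplane"})),
]

print(select_all(pool, ["cat", "orange", "wild"]))
print(select_all(pool, ["cat", "orange"]))
```

As in the physical system, a query leaves the unselected capsules untouched, so the rest of the pool remains available for later searches.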
“At the current state of our proof-of-concept, we’re at the 1 kilobyte per second search rate. Our file system’s search rate is determined by the data size per capsule, which is currently limited by the prohibitive cost to write even 100 megabytes worth of data on DNA, and the number of sorters we can use in parallel. If DNA synthesis becomes cheap enough, we would be able to maximize the data size we can store per file with our approach,” Banal says.
For their barcodes, the researchers used single-stranded DNA sequences from a library of 100,000 sequences, each about 25 nucleotides long, developed by Stephen Elledge, a professor of genetics and medicine at Harvard Medical School. If you put two of these labels on each file, you can uniquely label 10^10 (10 billion) different files, and with four labels on each, you can uniquely label 10^20 files.
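The capacity figures follow from raising the library size to the number of labels per file. The short sketch below reproduces that count; the function name is illustrative. It also shows, for comparison, the smaller number obtained under the assumption that the two labels must be distinct and their order does not matter.

```python
import math

LIBRARY_SIZE = 100_000  # barcode sequences in the library, as stated above


def addressable_files(labels_per_file: int) -> int:
    """Simple power count: each label slot can hold any barcode in the library."""
    return LIBRARY_SIZE ** labels_per_file


print(addressable_files(2))   # the 10 billion figure quoted above
print(addressable_files(4))   # the four-label figure quoted above

# Assumption for comparison only: unordered pairs of two distinct barcodes
print(math.comb(LIBRARY_SIZE, 2))
```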
George Church, a professor of genetics at Harvard Medical School, describes the technique as “a giant leap for knowledge management and search tech.”
“The rapid progress in writing, copying, reading, and low-energy archival data storage in DNA form has left poorly explored opportunities for precise retrieval of data files from huge (10^21 byte, zetta-scale) databases,” says Church, who was not involved in the study. “The new study spectacularly addresses this using a completely independent outer layer of DNA and leveraging different properties of DNA (hybridization rather than sequencing), and moreover, using existing instruments and chemistries.”
Bathe envisions that this kind of DNA encapsulation could be useful for storing “cold” data, that is, data that is kept in an archive and not accessed very often. His lab is spinning out a startup, Cache DNA, that is now developing technology for long-term storage of DNA, both for DNA data storage in the long term and for clinical and other preexisting DNA samples in the near term.
“While it may be a while before DNA is viable as a data storage medium, there already exists a pressing need today for low-cost, massive storage solutions for preexisting DNA and RNA samples from Covid-19 testing, human genomic sequencing, and other areas of genomics,” Bathe says.
The research was funded by the Office of Naval Research, the National Science Foundation, and the U.S. Army Research Office.
Taking an indirect path into a bright future
28 May 2021, 16:30
Matthew Johnston was a physics senior looking to postpone his entry into adulting. He had an intense four years at MIT; when he wasn’t in class, he was playing baseball and working various tech development gigs.
Johnston had led the MIT Engineers baseball team to a conference championship, becoming the first player in his team’s history to be named a three-time Google Cloud Academic All-American. He put an exclamation mark on his career by hitting four home runs in his final game.
Johnston also developed a novel method of producing solar devices as a researcher with GridEdge Solar at MIT, and worked on a tax-loss harvesting research project as an intern at Impact Labs in San Francisco, California. As he contemplated post-graduation life, he liked the idea of gaining new experiences before committing to a company.
Remotely Down Under
MISTI-Australia matched him with an internship at Sydney-based Okra Solar, which manufactures smart solar charge controllers in Shenzhen, China, to help power off-the-grid remote villages in Southeast Asian countries such as Cambodia and the Philippines, as well as in Nigeria.
“I felt that I had so much more to learn before committing to a full-time job, and I wanted to see the world,” he says. “Working an internship for Okra in Sydney seemed like it would be the perfect buffer between university life and life in the real world. If all went well, maybe I would end up living in Sydney a while longer.”
After graduating in May 2020 with a BS in physics, a minor in computer science, and a concentration in philosophy, he prepared to live in Sydney, with the possibility of travel to Shenzhen, when he received a familiar pitch: a curveball.
Like everyone else, he had hoped that the pandemic would wind down before his Down Under move, but when that didn’t happen, he pivoted to sharing a place with friends in Southern California, where they could hike and camp in nearby Sequoia National Park when they weren’t working remotely.
On Okra’s software team, he focused on data science to streamline the maintenance and improve the reliability of Okra’s solar energy systems. However, his remote status didn’t mesh with an ongoing project to identify remote villages without grid access. So, he launched his own data project: designing a model to identify shaded solar panels based on their daily power output. That project was placed on hold until they could get more reliable data, but he gained experience setting up machine-learning problems as he developed a pipeline to retrieve, process, and load the data to train the model.
“This project helped me understand that most of the effort in a data science problem goes into sourcing and processing the data. Unfortunately, it seemed that it was just a bit too early for the model to perform accurately.”
Team-powered engine
Coordinating with a team of 23 people from more than 10 unique cultures, scattered across 11 countries in different time zones, presented yet another challenge. He responded by developing a productive workflow, leaving questions in his code reviews that would be answered by the next morning.
“Working remotely is ultimately a bigger barrier to team cohesion than productivity,” he says. He overcame that hurdle as well; the Aussie team took a liking to him and nicknamed him Jonno. “They’re an awesome group to be around and aren’t afraid to laugh at themselves.”
Soon, Jonno was helping the service delivery team efficiently diagnose and resolve real issues in the field using sensor data. By automating the maintenance process in this way, Okra makes it possible for energy companies to deploy and manage last-mile energy projects at scale. Several months later, when he began contributing to the firmware team, he also took on the project of calculating a battery’s state of charge, with the goal to open-source a robust and reliable algorithm.
“Matt excelled despite the circumstance,” says Okra Solar co-founder and CEO Afnan Hannan. “Matt contributed to developing Okra’s automated field alerts system that monitors the health and performance of Okra’s solar systems, which are deployed across Southeast Asia and Africa. Additionally, Matt led the development of a state-of-the-art Kalman filter-based online state-of-charge (SoC) algorithm. This included research, prototyping, developing back-testing infrastructure, and finally implementing and deploying the solution on Okra’s microcontroller. An accurate and stable SoC has been a vital part of Okra’s cutting-edge Battery Sharing feature, for which we have Matt to thank.”
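For readers unfamiliar with the technique Hannan mentions, here is a minimal scalar Kalman filter for state of charge. It is a textbook-style sketch, not Okra's algorithm: the prediction step is coulomb counting, and the update step assumes a made-up linear relation between terminal voltage and SoC. The function name and every parameter value (capacity, voltage range, noise variances) are assumptions chosen for illustration.

```python
def kalman_soc_step(soc, var, current_a, voltage_v, dt_s,
                    capacity_as=36000.0,  # assumed capacity: 10 Ah in ampere-seconds
                    ocv_min=11.0,         # assumed voltage at 0% charge
                    ocv_span=2.0,         # assumed voltage rise from 0% to 100%
                    process_var=1e-7,     # assumed process noise variance
                    meas_var=1e-2):       # assumed voltage measurement noise variance
    """One predict/update cycle; returns the new SoC estimate and its variance."""
    # Predict: coulomb counting (positive current means discharge)
    soc_pred = soc - current_a * dt_s / capacity_as
    var_pred = var + process_var

    # Update: compare measured voltage with the linear model v = ocv_min + ocv_span * soc
    h = ocv_span
    innovation = voltage_v - (ocv_min + h * soc_pred)
    gain = var_pred * h / (h * var_pred * h + meas_var)
    soc_new = soc_pred + gain * innovation
    var_new = (1.0 - gain * h) * var_pred
    return soc_new, var_new


# Usage: start from a deliberately wrong guess and feed one voltage reading
estimate, variance = kalman_soc_step(0.5, 0.1, current_a=1.0, voltage_v=12.6, dt_s=1.0)
print(estimate, variance)
```

A real battery has a nonlinear voltage curve and internal resistance, so a production filter would need a richer model (for example, an extended Kalman filter); the sketch only shows how the prediction and the measurement are blended.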
Full power
After six months, Johnston joined Okra full time in January, moving to Phnom Penh, Cambodia, to join some of the team in person and immerse himself in firmware and data science. In the short term, the goal is to electrify villages to provide access to much cheaper and more accessible energy.
“Previously, the only way many of these villages could access electricity was by charging a car battery using a diesel generator,” he says. “This process is very expensive, and it is impossible to charge many batteries simultaneously. In contrast, Okra provides cheap, accessible, and renewable energy for the entire village.”
Seeing an Okra project firsthand can take some effort: some villages are a 30-minute boat ride from the nearest town. Johnston and others travel there to demonstrate small appliances that many in the world take for granted, such as using an electric blender to make a smoothie.
“It’s really amazing to see how hard-to-reach these villages are and how much electricity can help them,” says Johnston. “Something as simple as using a rice cooker instead of a wood fire can save a family countless hours of chopping wood. It also helps us think about how we can improve our product, both for the users and the energy companies.”
“In the long term, the vision is that by providing electricity, we can introduce the possibility of online education and more productive uses of power, allowing these communities to join the modern economy.”
While getting to Phnom Penh was a challenge, he credits MIT for hitting yet another home run.
“I think two of the biggest things I learned from both baseball and physics were how to learn challenging things and how to overcome failure. It takes persistence to keep digging for more information and practicing what you’ve already failed, and this same way of thinking has helped me to develop my professional skills. At the same time, I am grateful for the time I spent studying philosophy. Thinking deeply about what might lead to a meaningful life for myself and for others has led me to stumble upon opportunities like this one.”
MIT baseball coach uses sensors, motion capture technology to teach pitching
26 May 2021, 19:45
The field of sports analytics is most known for assessing player and team performance during competition, but MIT Baseball’s pitching coach, Todd Carroll, is bringing a different kind of analytics to the practice field for his student athletes.
“A baseball player might practice a pitch 10,000 times before it becomes natural. Through technology, we can speed that process up,” Carroll said in a recent seminar organized by the MIT.nano Immersion Lab. “To help players improve athletically, without taking up that much time, and keep them healthy — that’s the goal.”
The virtual talk — “Pitching in baseball: Using scientific tools to visualize what we know and learn what we don’t” — grew out of a new research collaboration between MIT Baseball, the MIT Clinical Research Center (CRC), and the Immersion Lab.
Carroll started with an explanation of how pitching has evolved over time and what specific skills coaches measure to help players perfect their throw. Then, he and Research Laboratory of Electronics (RLE) postdoc Praneeth Namburi used the Immersion Lab’s motion capture platform and wireless physiological sensors from the CRC to explore how biomechanical feedback and interactive visualization tools could change the future of sports.
Namburi stepped up to the (hypothetical) mound, with Carroll as his coach. By interfacing the physical and digital in real time, the two were able to assess Namburi’s pitches and make immediate adjustments that improved his athletic performance in one session.
Visualizing sports data
Stride length, pitcher extension, hip-shoulder separation, and ground force production are all measurable aspects of pitching, explained Carroll. The capabilities of the Immersion Lab allow for digital tracking and visualization of these skills. Wearing wireless sensors on his body, Namburi threw several pitches inside the lab. The sensors plot Namburi’s position and track his movements through space, as shown in the first part of the video below. Adding in the physiological measurements, the second clip shows the activity of his rotation muscles (in green), his acceleration through space (in blue), and the pressure, or ground force, produced by his foot (in red).
Pitching at the Immersion Lab
By reviewing the motion capture frames together, Carroll could show Namburi how to modify his posture to increase stride length and extend his hip-shoulder separation by holding his back foot on the ground. In this example, the technology sharpens communication between coach and player, leading to more efficient improvements.
Assessing physiological measurements alongside the motion capture can also help decrease injuries. Carroll emphasized how this technology can help rehabbing players, teaching them to trust their body again. “That’s a big part of injury recovery, trusting the process. These students find comfort in the data and that allows them to push through.”
Following the training session, Namburi overlaid the motion capture from his first and last throw, comparing his posture, spine position, stride length, and foot position. A visual compilation of all his throws compared the trajectory of his wrist, showing that, over time, his movement became more consistent and more natural.
The seminar concluded with a live demonstration of a novice pitcher in the Immersion Lab following the advice of Coach Carroll via Zoom. “Two people who have never thrown a baseball before today, and we’re able to teach them remotely during a pandemic,” reflected Carroll. “That’s pretty cool.”
Afterward, Namburi answered questions about the ease of taking the physiological monitoring tools to the field and of being able to capture and measure the movements of multiple athletes at once.
IMMERSED IN: Athletics—Pitching in baseball
Immersed in collaboration
The MIT.nano Immersion Lab’s new seminar series, IMMERSED, explores the possibilities enabled by technologies such as motion capture, virtual and augmented reality, photogrammetry, and related computational advances to gather, process, and interact with data from multiple modalities. The series highlights the capabilities available at the Immersion Lab, and the wide range of disciplines to which the tools and space can be applied.
“IMMERSED offers another avenue for any individual — scientists, artists, engineers, performers — to consider collaborative projects,” says Brian W. Anthony, MIT.nano associate director. “The series combines lectures with demonstrations and tutorials so more people can see the wide breadth of research possible at the lab.”
As a shared-access facility, MIT.nano’s Immersion Lab is open to researchers from any department, lab, or center at MIT, as well as external partners. Learn more about the Immersion Lab and how to become a user.
Crowdsourcing data on road quality and excess fuel consumption
14 May 2021, 13:00
America has over 4 million miles of roads and, as one might expect, monitoring them can be a monumental task.
To collect high-quality data on the conditions of their roads, departments of transportation (DOTs) can expect to spend $200 per mile for state-of-the-art laser profilers. For cities and states, these costs are prohibitive and often force them to resort to rudimentary approaches, like visual inspection.
Over the past three years, a collaboration between the MIT Concrete Sustainability Hub (CSHub), the University of Massachusetts at Dartmouth, Birzeit University, and the American University of Beirut has sought to give DOTs a cheaper, but equally accurate, alternative.
Their solution, “Carbin,” is an app that allows users to crowdsource road-quality data with their smartphones. An algorithm built into the software can then estimate how that road quality affects a user’s fuel consumption.
The Carbin framework is more sophisticated than prior road-quality crowdsourcing tools. Using the accelerometers found in smartphones, Carbin converts vehicle acceleration signals into standard measurements of road roughness used by most DOTs. It then collates these measurements onto fixmyroad.us, a publicly available global map.
Since its release in 2019, Carbin has gathered almost 600,000 miles of road-quality data in more than three dozen countries. During 2020, its developers continued to advance the app. Not only have they validated their approach in two papers, one in Data-Centric Engineering and another in The Proceedings of the Royal Society, but they have also collected more than 300,000 miles of data with the help of Concrete Supply Co., a ready-mix concrete manufacturer in the Carolinas. In addition, they are initiating collaborations with automotive manufacturers and vehicle telematics companies to gather data on an even greater scale.
Roughly speaking
Carbin is not the first phone accelerometer-based approach for crowdsourcing road quality. Several other apps, including the City of Boston’s “Street Bump,” have sought to assess road quality based on one of the most recognizable signs of poor roads: potholes.
Though potholes have been the focus of prior apps, they are not the main metric used by DOTs for measuring road quality and planning maintenance. Instead, DOTs rely on what is called road roughness.
“The shortcoming of previous crowdsourcing approaches is that they would record the acceleration signal and look for outliers, which would indicate potholes,” explains Botshekan. “However, they could not infer the road roughness, since that is defined over longer length scales — typically from tens of centimeters to tens of meters.”
Though roughness can seem almost imperceptible, it can have outsized effects. Rough roads not only lead to higher maintenance costs but can also increase vehicle fuel consumption — by as much as 15 percent in cities. To measure roughness, DOTs use the International Roughness Index (IRI).
“IRI is the accumulated motion of the suspension system over a specific distance,” says Arghavan Louhghalam, an assistant professor of civil and environmental engineering at the University of Massachusetts at Dartmouth. “Higher IRI indicates lower road quality and higher fuel consumption.”
To derive IRI, DOTs don’t actually measure suspension travel explicitly. Instead, they first capture the profile of the road — essentially, the undulations of its surface — and then simulate how a car’s suspension system would respond to it using what’s called a “quarter car model.”
From quarter car to complete picture
A quarter car model is essentially what it sounds like: a model of a quarter of a car. Specifically, it refers to a model of the tires, vehicle mass, and suspension system based on one wheel of a vehicle. By developing their own car dynamics model in a probabilistic setting, Botshekan and his colleagues were able to map the acceleration signals collected by Carbin users onto the behavior of a virtual vehicle and its interaction with the road. From there, they could estimate suspension properties and road roughness in terms of IRI. Using an algorithm developed based on past CSHub research, Carbin then estimates how IRI values can impact vehicle fuel consumption.
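To make the conventional IRI calculation concrete, the sketch below simulates a quarter car over a sampled road profile and accumulates the suspension motion per unit distance. It illustrates the deterministic textbook procedure, not Carbin's probabilistic model. The normalized "Golden Car" parameters and the 80 km/h simulation speed are the standard IRI reference values, which are not given in the article; the profile smoothing step used in the full standard is omitted, and all function names are hypothetical.

```python
import math

# Standard IRI "Golden Car" reference values, normalized by the sprung mass
K_TIRE = 653.0      # tire stiffness / sprung mass, 1/s^2
K_SUSP = 63.3       # suspension stiffness / sprung mass, 1/s^2
C_SUSP = 6.0        # suspension damping / sprung mass, 1/s
MU = 0.15           # unsprung mass / sprung mass
SPEED = 80.0 / 3.6  # reference speed of 80 km/h, in m/s


def _deriv(state, road):
    """Time derivative of (body height, body velocity, axle height, axle velocity)."""
    zs, vs, zu, vu = state
    susp = K_SUSP * (zu - zs) + C_SUSP * (vu - vs)  # suspension force on the body
    a_body = susp
    a_axle = (K_TIRE * (road - zu) - susp) / MU     # tire force minus suspension force
    return (vs, a_body, vu, a_axle)


def _rk4_step(state, road, dt):
    """One classical Runge-Kutta step with the road height held constant."""
    def shifted(k, h):
        return tuple(s + h * d for s, d in zip(state, k))
    k1 = _deriv(state, road)
    k2 = _deriv(shifted(k1, dt / 2), road)
    k3 = _deriv(shifted(k2, dt / 2), road)
    k4 = _deriv(shifted(k3, dt), road)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def iri_m_per_km(profile, dx):
    """Accumulated suspension travel per distance for heights (m) sampled every dx (m)."""
    dt = dx / SPEED
    state = (profile[0], 0.0, profile[0], 0.0)  # start at rest on the road surface
    travel = 0.0
    for height in profile[1:]:
        state = _rk4_step(state, height, dt)
        travel += abs(state[1] - state[3]) * dt  # |body velocity - axle velocity| * dt
    length = dx * (len(profile) - 1)
    return 1000.0 * travel / length


# Usage: a 200 m stretch with gentle 5 m undulations of 2 mm amplitude
dx = 0.05
wavy = [0.002 * math.sin(2 * math.pi * i * dx / 5.0) for i in range(4001)]
print(iri_m_per_km(wavy, dx))
```

Because the model is linear, doubling the height of every undulation doubles the resulting index, and a perfectly flat profile gives zero.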
“At the end of the day, the vehicle is like a filter,” explains Mazdak Tootkaboni, associate professor of civil and environmental engineering at UMass Dartmouth. “The excitation of the road goes through the vehicle and is then sensed by the cellphone. So, what we do is understand this filter and take it out of the equation.”
After developing their model, the Carbin team then sought to test it against more costly, conventional methods. They did this through two different validations.
In the first, they measured road quality on two test tracks in the Greater Boston area — a major thoroughfare and then a highway — using a conventional laser profiler and several phones equipped with Carbin. When they compared the data afterward, they found that Carbin could predict laser-based roughness measurements with 90 percent accuracy.
The second validation probed Carbin’s crowdsourcing capabilities. In it, they analyzed over 22,000 kilometers of Federal Highway Administration road data from California beside 27,000 kilometers of data gathered by 84 Carbin users from the same state. The results of their analysis revealed a remarkable resemblance between the crowdsourced and official data — a sign that Carbin could augment or even entirely replace conventional methods.
21st century infrastructure, 21st century tools
Now that they’ve thoroughly validated their model, Carbin’s developers want to expand the app to provide users, governments, and companies with unparalleled insights into both vehicles and infrastructure.
The most apparent use for Carbin, says Jake Roxon, a CSHub postdoc and Carbin’s creator, would be as a tool for DOTs to improve America’s roads — which recently received a grade of D from the American Society of Civil Engineers.
“On average, America’s roads are terrible,” he explains. “But the problem isn’t always in the funding of DOTs themselves, but rather how they allocate that funding. By knowing the quality of an entire road network, which is impossible with current technologies, they could fix roads more efficiently.”
The issue, then, is how Carbin can transition from gathering data to also recommending resource allocation. To make this possible, the Carbin team is beginning to incorporate prior CSHub research on network asset management — the process through which DOTs monitor pavement performance and plan maintenance to meet performance targets.
Besides serving the needs of DOTs, Carbin could also help private companies. “There are private firms, fleet companies especially, that would benefit from this technology,” says Roxon. “Eventually, they could use Carbin for ‘eco-routing,’ which is when you identify the route that is most fuel-efficient.”
Such a routing option could help companies reduce both their environmental impact and their running costs. For those with thousands of vehicles, the aggregate savings could be substantial.
While further development is needed to incorporate eco-routing and asset management into Carbin, its developers see it as a promising tool. Franz-Josef Ulm, professor at the MIT Department of Civil and Environmental Engineering and faculty director of CSHub, believes that Carbin represents a necessary step forward.
“To develop the infrastructure of the 21st century, we need 21st-century means of assessing the state of that infrastructure to ensure that any dollar spent today is well spent for the future,” he says. “That’s precisely where Carbin enters the picture.”
Turning technology against human traffickers
6 May 2021, 18:35
Last October, the White House released the National Action Plan to Combat Human Trafficking. The plan was motivated, in part, by a greater understanding of the pervasiveness of the crime. In 2019, 11,500 situations of human trafficking in the United States were identified through the National Human Trafficking Hotline, and the federal government estimates there are nearly 25 million victims globally.
This increasing awareness has also motivated MIT Lincoln Laboratory, a federally funded research and development center, to harness its technological expertise toward combating human trafficking.
In recent years, researchers in the Humanitarian Assistance and Disaster Relief Systems Group have met with federal, state, and local agencies, nongovernmental organizations (NGOs), and technology companies to understand the challenges in identifying, investigating, and prosecuting trafficking cases. In 2019, the team compiled their findings and 29 targeted technology recommendations into a roadmap for the federal government. This roadmap informed the U.S. Department of Homeland Security’s recent counter-trafficking strategy released in 2020.
“Traffickers are using technology to gain efficiencies of scale, from online commercial sex marketplaces to complex internet-driven money laundering, and we must also leverage technology to counter them,” says Matthew Daggett, who is leading this research at the laboratory.
In July, Daggett testified at a congressional hearing about many of the current technology gaps and made several policy recommendations on the role of technology in countering trafficking. “Taking advantage of digital evidence can be overwhelming for investigators. There’s not a lot of technology out there to pull it all together, and while there are pockets of tech activity, we see a lot of duplication of effort because this work is siloed across the community,” he adds.
Breaking down these silos has been part of Daggett’s goal. Most recently, he brought together almost 200 practitioners from 85 federal and state agencies, NGOs, universities, and companies for the Counter–Human Trafficking Technology Workshop at Lincoln Laboratory. This first-of-its-kind virtual event brought about discussions of how technology is used today, where gaps exist, and what opportunities exist for new partnerships.
The workshop was also an opportunity for the laboratory’s researchers to present several advanced tools in development. “The goal is to come up with sustainable ways to partner on transitioning these prototypes out into the field,” Daggett adds.
Uncovering networks
One of the most mature capabilities at the laboratory in countering human trafficking deals with the challenge of discovering large-scale, organized trafficking networks.
“We cannot just disrupt pieces of an organized network, because many networks recover easily. We need to uncover the entirety of the network and disrupt it as a whole,” says Lin Li, a researcher in the Artificial Intelligence Technology Group.
To help investigators do that, Li has been developing machine learning algorithms that automatically analyze online commercial sex ads to reveal whether they are likely associated with human trafficking activities and if they belong to the same organization.
This task may have been easier only a few years ago, when a large percentage of trafficking-linked activities were advertised, and reported, from listings on Backpage.com. Backpage was the second-largest classified ad listing service in the United States after Craigslist, and was seized in 2018 by a multi-agency federal investigation. A slew of new advertising sites has since appeared in its wake. “Now we have a very decentralized distributed information source, where people are cross-posting on many web pages,” Li says. Traffickers are also becoming more security-aware, Li says, often using burner cellular or internet phones that make it difficult to use “hard” links such as phone numbers to uncover organized crime.
So, the researchers have instead been leveraging “soft” indicators of organized activity, such as semantic similarities in the ad descriptions. They use natural language processing to extract unique phrases in content to create ad templates, and then find matches for those templates across hundreds of thousands of ads from multiple websites.
“We’ve learned that each organization can have multiple templates that they use when they post their ads, and each template is more or less unique to the organization. By template matching, we essentially have an organization-discovery algorithm,” Li says.
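The template-matching idea can be illustrated with a minimal sketch. This is not the laboratory's algorithm, which the article does not detail. It assumes a much simpler stand-in: word trigrams as the "phrases," Jaccard overlap as the similarity measure, and a hypothetical threshold of 0.4. The function names and the sample ads are invented for illustration.

```python
from itertools import combinations


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) in a lowercased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_ads(ads, threshold=0.4):
    """Group ads whose shingle overlap meets a (hypothetical) threshold.

    Uses union-find so that ads linked through a chain of similar ads
    end up in the same cluster.
    """
    sets = [shingles(ad) for ad in ads]
    parent = list(range(len(ads)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(ads)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(ads)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Invented sample texts: the first two share a template, the third does not.
ads = [
    "new in town available all week call now for details",
    "new in town available all week text now for details",
    "vintage bicycle for sale barely used pick up only",
]
clusters = cluster_ads(ads)
print(clusters)
```

The first two ads differ by a single word yet share most of their trigrams, so they fall into one cluster even though no phone number links them. A production system would need far more robust language processing than exact trigram overlap.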
In this analysis process, the system also ranks the likelihood of an ad being associated with human trafficking. By definition, human trafficking involves compelling individuals to provide service or labor through the use of force, fraud, or coercion — and does not apply to all commercial sex work. The team trained a language model to learn terms related to race, age, and other marketplace vernacular in the context of the ad that may be indicative of potential trafficking.
To show the impact of this system, Li gives an example scenario in which an ad is reported to law enforcement as being linked to human trafficking. A traditional search to find other ads using the same phone number might yield 600 ads. But by applying template matching, approximately 900 additional ads could be identified, enabling the discovery of previously unassociated phone numbers.
“We then map out this network structure, showing links between ad template clusters and their locations. Suddenly, you see a transnational network,” Li says. “It could be a very powerful way, starting with one ad, of discovering an organization’s entire operation.”
Analyzing digital evidence
Once a human trafficking investigation is underway, the process of analyzing evidence to find probable cause for warrants, corroborate victim statements, and build a case for prosecution can be very time- and human-intensive. A case folder might hold thousands of pieces of digital evidence — a conglomeration of business or government records, financial transactions, cell phone data, emails, photographs, social media profiles, audio or video recordings, and more.
“The wide range of data types and formats can make this process challenging. It’s hard to understand the interconnectivity of it all and what pieces of evidence hold answers,” Daggett says. “What investigators want is a way to search and visualize this data with the same ease they would a Google search.”
The system Daggett and his team are prototyping takes all the data contained in an evidence folder and indexes it, extracting the information inside each file into three major buckets — text, imagery, and audio data. These three types of data are then passed through specialized software processes to structure and enrich them, making them more useful for answering investigative questions.
The image processor, for example, can recognize and extract text, faces, and objects from images. The processor can then detect near-duplicate images in the evidence, making a link between an image that appears on a sex advertisement and the cell phone that took it, even for images that have been heavily edited or filtered. They are also working on facial recognition algorithms that can identify the unique faces within a set of evidence, model them, and find them elsewhere within the evidence files, under widely different lighting conditions and shooting angles. These techniques are useful for identifying additional victims and corroborating who knows whom.
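One common way to detect near-duplicate images is perceptual hashing, sketched below. The article does not say which technique the team uses, so this is an assumed stand-in: an "average hash" computed over a small grayscale grid, compared by Hamming distance. The pixel grids are invented, and a real pipeline would first downscale and convert each photo to grayscale.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(h1, h2):
    """Count the positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(h1, h2))


# Invented 4x4 grayscale grids (values 0-255).
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
# The same picture after a uniform brightness filter.
brightened = [[min(255, p + 30) for p in row] for row in original]
# A different picture with a different light/dark layout.
unrelated = [
    [200, 210, 10, 20],
    [205, 215, 15, 25],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]

d_same = hamming(average_hash(original), average_hash(brightened))
d_diff = hamming(average_hash(original), average_hash(unrelated))
print(d_same, d_diff)
```

Because the hash records only whether each pixel is above or below the image's own mean, a uniform brightness change leaves it untouched, while a different layout flips many bits. Heavy crops or rotations would defeat a hash this simple.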
Another enrichment capability allows investigators to find “signatures” of trafficking in the data. These signatures can be specific vernacular used, for example, in text messages between suspects that refer to illicit activity. Other trafficking signatures can be image-based, such as if the picture was taken in a hotel room, contains certain objects such as cash, or shows specific types of tattoos that traffickers use to brand their victims. A deep learning model the team is working on now is specifically aimed at recognizing crown tattoos associated with trafficking. “The challenge is to train the model to identify the signature across a wide range of crown tattoos that look very different from one another, and we’re seeing robust performance using this technique,” Daggett says.
One particularly time-intensive process for investigators is analyzing thousands of jail phone calls from suspects who are awaiting trial, for indications of witness tampering or continuing illicit operations. The laboratory has been leveraging automated speech recognition technology to develop a tool to allow investigators to partially transcribe and analyze the content of these conversations. This capability gives law enforcement a general idea of what a call might be about, helping them triage ones that should be prioritized for a closer look.
Finally, the team has been developing a series of user-facing tools that use all of the processed data to enable investigators to search, discover, and visualize connections between evidentiary artifacts, explore geolocated information on a map, and automatically build evidence timelines.
“The prosecutors really like the timeline tool, as this is one of the most labor-intensive tasks when preparing for trial,” Daggett says.
When users click on a document, a map pin, or a timeline entry, they see a data card that links back to the original artifacts. “These tools point you back to the primary evidence that cases can be built on,” Daggett says. “A lot of this prototyping is picking what might be called low-hanging fruit, but it’s really more like fruit already on the ground that is useful and just isn’t getting picked up.”
Victim-centered training
These data analytics are especially useful for helping law enforcement corroborate victim statements. Victims may be fearful or unwilling to provide a full picture of their experience to investigators, or may have difficulty recalling traumatic events. The more nontestimonial evidence that prosecutors can use to tell the story to a jury, the less pressure prosecutors must place on victims to help secure a conviction. There is greater awareness of the retraumatization that can occur during the investigation and trial processes.
“In the last decade, there has been a greater shift toward a victim-centered approach to investigations,” says Hayley Reynolds, an assistant leader in the Human Health and Performance Systems Group and one of the early leaders of counter–human trafficking research at the laboratory. “There’s a greater understanding that you can’t bring the case to trial if a survivor’s needs are not kept at the forefront.”
Improving training for law enforcement, specifically in interacting with victims, was one of the team’s recommendations in the trafficking technology roadmap. In this area, the laboratory has been developing a scenario-based training capability that uses game-play mechanics to inform law enforcement on aspects of trauma-informed victim interviewing. The training, called a “serious game,” helps officers experience how the approach they choose to gather information can build rapport and trust with a victim, or can reduce the feeling of safety and retraumatize victims. The capability is currently being evaluated by several organizations that specialize in victim-centered practitioner training. The laboratory recently published a journal on serious games built for multiple mission areas over the last decade.
Daggett says that prototyping in partnership with the state and federal investigators and prosecutors that these tools are intended for is critical. “Everything we do must be user-centered,” he says. “We study their existing workflows and processes in detail, present ideas for technologies that could improve their work, and they rate what would have the most operational utility. It’s our way to methodically figure out how to solve the most critical problems.”
When Daggett gave congressional testimony in July, he spoke of the need to establish a unified, interagency entity focused on R&D for countering human trafficking. Since then, some progress has been made toward that goal — the federal government has now launched the Center for Countering Human Trafficking, the first integrated center to support investigations and intelligence analysis, outreach and training activities, and victim assistance.
Daggett hopes that future collaborations will enable technologists to apply their work toward capabilities needed most by the community. “Thoughtfully designed technology can empower the collective counter–human trafficking community and disrupt these illicit operations. Increased R&D holds the potential to make a tremendous impact by accelerating justice and hastening the healing of victims.”
Physicists find a novel way to switch antiferromagnetism on and off
6 May 2021, 04:00
When you save an image to your smartphone, those data are written onto tiny transistors that are electrically switched on or off in a pattern of “bits” to represent and encode that image. Most transistors today are made from silicon, an element that scientists have managed to switch at ever-smaller scales, enabling billions of bits, and therefore large libraries of images and other files, to be packed onto a single memory chip.
But growing demand for data, and the means to store them, is driving scientists to search beyond silicon for materials that can push memory devices to higher densities, speeds, and security.
Now MIT physicists have shown preliminary evidence that data might be stored as faster, denser, and more secure bits made from antiferromagnets.
Antiferromagnetic, or AFM, materials are the lesser-known cousins to ferromagnets, or conventional magnetic materials. Where the electrons in ferromagnets spin in synchrony (a property that allows a compass needle to point north, collectively following the Earth’s magnetic field), electrons in an antiferromagnet prefer the opposite spin to their neighbor, in an “antialignment” that effectively quenches magnetization even at the smallest scales.
The absence of net magnetization in an antiferromagnet makes it impervious to any external magnetic field. If they were made into memory devices, antiferromagnetic bits could protect any encoded data from being magnetically erased. They could also be made into smaller transistors and packed in greater numbers per chip than traditional silicon.
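The contrast between aligned and antialigned spins can be made concrete with a toy model. The sketch below is an illustration only, not the researchers' analysis. It assumes a one-dimensional chain of spins valued +1 or -1, and the function names are hypothetical. It computes the net magnetization, which an external field couples to, and the staggered magnetization, a standard measure of antiferromagnetic order.

```python
def net_magnetization(spins):
    """Average spin: nonzero for a ferromagnet, zero for an ideal antiferromagnet."""
    return sum(spins) / len(spins)


def staggered_magnetization(spins):
    """Average spin after flipping the sign on every other site.

    Equals 1 (in magnitude) for perfect antialignment and 0 for perfect alignment.
    """
    return sum(s * (-1) ** i for i, s in enumerate(spins)) / len(spins)


n = 8
ferromagnet = [1] * n                           # all spins aligned
antiferromagnet = [(-1) ** i for i in range(n)]  # neighbors antialigned

for name, chain in [("ferromagnet", ferromagnet), ("antiferromagnet", antiferromagnet)]:
    print(name, net_magnetization(chain), staggered_magnetization(chain))
```

In this picture the antiferromagnetic chain is fully ordered (staggered magnetization of 1) yet has no net magnetization for an outside field to act on, which is the property that makes such bits hard to erase magnetically.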
Now the MIT team has found that by doping extra electrons into an antiferromagnetic material, they can turn its collective antialigned arrangement on and off, in a controllable way. They found this magnetic transition is reversible, and sufficiently sharp, similar to switching a transistor’s state from 0 to 1. The results, published today in Physical Review Letters, demonstrate a potential new pathway to use antiferromagnets as a digital switch.
“An AFM memory could enable scaling up the data storage capacity of current devices — same volume, but more data,” says the study’s lead author Riccardo Comin, assistant professor of physics at MIT.
Comin’s MIT co-authors include lead author and graduate student Jiarui Li, along with Zhihai Zhu, Grace Zhang, and Da Zhou; as well as Robert Green of the University of Saskatchewan; Zhen Zhang, Yifei Sun, and Shriram Ramanathan of Purdue University; Ronny Sutarto and Feizhou He of Canadian Light Source; and Jerzy Sadowski at Brookhaven National Laboratory.
Magnetic memory
To improve data storage, some researchers are looking to MRAM, or magnetoresistive RAM, a type of memory system that stores data as bits made from conventional magnetic materials. In principle, an MRAM device would be patterned with billions of magnetic bits. To encode data, the direction of a local magnetic domain within the device is flipped, similar to switching a transistor from 0 to 1.
MRAM systems could potentially read and write data faster than silicon-based devices and could run with less power. But they could also be vulnerable to external magnetic fields.
“The system as a whole follows a magnetic field like a sunflower follows the sun, which is why, if you take a magnetic data storage device and put it in a moderate magnetic field, information is completely erased,” Comin says.
Antiferromagnets, in contrast, are unaffected by external fields and could therefore be a more secure alternative to MRAM designs. An essential step toward encodable AFM bits is the ability to switch antiferromagnetism on and off. Researchers have found various ways to accomplish this, mostly by using electric current to switch a material from its orderly antialignment, to a random disorder of spins.
“With these approaches, switching is very fast,” says Li. “But the downside is, every time you need a current to read or write, that requires a lot of energy per operation. When things get very small, the energy and heat generated by running currents are significant.”
Doped disorder
Comin and his colleagues wondered whether they could achieve antiferromagnetic switching in a more efficient manner. In their new study, they work with neodymium nickelate, an antiferromagnetic oxide grown in the Ramanathan lab. This material exhibits nanodomains that consist of nickel atoms, each with a spin opposite to that of its neighbor, held together by oxygen and neodymium atoms. The researchers had previously mapped the material’s fractal properties.
Since then, the researchers have looked to see if they could manipulate the material’s antiferromagnetism via doping — a process that intentionally introduces impurities in a material to alter its electronic properties. In their case, the researchers doped neodymium nickel oxide by stripping the material of its oxygen atoms.
When an oxygen atom is removed, it leaves behind two electrons, which are redistributed among the other nickel and oxygen atoms. The researchers wondered whether stripping away many oxygen atoms would result in a domino effect of disorder that would switch off the material’s orderly antialignment.
To test their theory, they grew 100-nanometer-thin films of neodymium nickel oxide and placed them in an oxygen-starved chamber, then heated the samples to temperatures of 400 degrees Celsius to encourage oxygen to escape from the films and into the chamber’s atmosphere.
As they removed progressively more oxygen, they studied the films using advanced magnetic X-ray crystallography techniques to determine whether the material’s magnetic structure was intact, implying that its atomic spins remained in their orderly antialignment, and therefore retained antiferromagnetism. If their data showed a lack of an ordered magnetic structure, it would be evidence that the material’s antiferromagnetism had switched off, due to sufficient doping.
Through their experiments, the researchers were able to switch off the material’s antiferromagnetism at a certain critical doping threshold. They could also restore antiferromagnetism by adding oxygen back into the material.
Now that the team has shown doping effectively switches AFM on and off, scientists might use more practical ways to dope similar materials. For instance, silicon-based transistors are switched using voltage-activated “gates,” where a small voltage is applied to a bit to alter its electrical conductivity. Comin says that antiferromagnetic bits could also be switched using suitable voltage gates, which would require less energy than other antiferromagnetic switching techniques.
“This could present an opportunity to develop a magnetic memory storage device that works similarly to silicon-based chips, with the added benefit that you can store information in AFM domains that are very robust and can be packed at high densities,” Comin says. “That’s key to addressing the challenges of a data-driven world.”
This research was supported, in part, by the Air Force Office of Scientific Research Young Investigator Program and the Natural Sciences and Engineering Research Council of Canada. This research used resources of the Center for Functional Nanomaterials and National Synchrotron Light Source II, both U.S. Department of Energy Office of Science User Facilities located at Brookhaven National Laboratory.
Robotic solution for disinfecting food production plants wins agribusiness prize
30 April 2021, 14:00
The winners of this year’s Rabobank-MIT Food and Agribusiness Innovation Prize got a good indication their pitch was striking a chord when a judge offered to have his company partner with the team for an early demonstration. The offer signified demand for their solution — to say nothing of their chances of winning the pitch competition.
The annual competition’s MIT-based grand-prize winner, Human Dynamics, is seeking to improve sanitation in food production plants with a robotic drone — a “drobot” — that flies through facilities spraying soap and disinfectant.
The company says the product addresses major labor shortages for food production facilities, which often must carry out daily sanitation processes.
“They have to sanitize every night, and it’s extremely labor intensive and expensive,” says co-founder Tom Okamoto, a master’s student in MIT’s System Design and Management (SDM) program.
In the winning pitch, Okamoto said the average large food manufacturer spends $13 million on sanitation annually. When you combine the time sanitation processes take away from production and delays due to human error, Human Dynamics estimates it’s tackling an $80 billion problem.
The company’s prototype uses a quadcopter drone that carries a tank, nozzle, and spray hose. Under the hood, the drone uses visual detection technology to validate that each area is clean, LIDAR to map out its path, and algorithms for route optimization.
The product is designed to automate repetitive tasks while complementing other cleaning efforts currently done by humans. Workers will still be required for certain aspects of cleaning and tasks like preparing and inspecting facilities during sanitation.
The company has already developed several proofs of concept and is planning to run a pilot project with a local food producer and distributor this summer.
The Human Dynamics team also includes MIT researcher Takahiro Nozaki, MIT master’s student Julia Chen, and Harvard Business School students Mike Mancinelli and Kaz Yoshimaru.
The company estimates that the addressable market for sanitation in food production facilities in the country is $3 billion.
The second-place prize went to Resourceful, which aims to help connect buyers and sellers of food waste byproducts through an online platform. The company says there’s a growing market for upcycled products made by companies selling things like edible chips made from juice pulp, building materials made from potato skins, and eyeglasses made from orange peels. But establishing a byproduct supply chain can be difficult.
“Being paid for byproducts should be low-hanging fruit for food manufacturers, but the system is broken,” says co-founder and CEO Kyra Atekwana, an MBA candidate at the University of Chicago’s Booth School of Business. “There are tens of millions of pounds of food waste produced in the U.S. every year, and there’s a variety of tech solutions … enabling this food waste and surplus to be captured by consumers. But there’s virtually nothing in the middle to unlock access to the 10.6 million tons of byproduct waste produced every year.”
Buyers and sellers can offer and browse food waste byproducts on the company’s subscription-based platform. The businesses can also connect and establish contracts through the platform. Resourceful charges a small fee for each transaction.
The company is currently launching pilots in the Chicago region before making a public launch later this year. It has also partnered with the Upcycled Food Association, a nonprofit focused on reducing food waste.
The winners were chosen from a group of seven finalist teams. Other finalists included:
Chicken Haus, a vertically integrated, fast-casual restaurant concept dedicated to serving locally sourced, bone-in fried chicken;
Joise Food Technologies, which is 3-D printing the next generation of meat alternatives and other foods using 3-D biofabrication technology and sustainable food ink formulation;
Marble, which is developing a small-footprint robot to remove fat from the surface of meat cuts to achieve optimal yield;
Nice Rice, which is developing a rice alternative made from pea starch, which can be upcycled; and
Roofscapes, which deploys accessible wooden platforms to “vegetalize” roofs in dense urban areas to combat food insecurity and climate change.
This was the sixth year of the event, which was hosted by the MIT Food and Agriculture Club. The event was sponsored by Rabobank and MIT’s Abdul Latif Jameel World Water and Food Systems Lab (J-WAFS).
Five from MIT elected to American Academy of Arts and Sciences for 2021
22 April 2021, 17:30
Five MIT faculty members are among more than 250 leaders from academia, business, public affairs, the humanities, and the arts elected to the American Academy of Arts and Sciences, the academy announced Thursday.
One of the nation’s most prestigious honorary societies, the academy is also a leading center for independent policy research. Members contribute to academy publications, as well as studies of science and technology policy, energy and global security, social policy and American institutions, the humanities and culture, and education.
Those elected from MIT this year are:
Linda Griffith, the School of Engineering Professor of Teaching Innovation, Biological Engineering, and Mechanical Engineering;
Muriel Médard, the Cecil H. Green Professor in the Department of Electrical Engineering;
Leona Samson, professor of biological engineering and biology;
Scott Sheffield, the Leighton Family Professor in the Department of Mathematics; and
Li-Huei Tsai, the Picower Professor in the Department of Brain and Cognitive Sciences.
“We are honoring the excellence of these individuals, celebrating what they have achieved so far, and imagining what they will continue to accomplish,” says David Oxtoby, president of the academy. “The past year has been replete with evidence of how things can get worse; this is an opportunity to illuminate the importance of art, ideas, knowledge, and leadership that can make a better world.”
Since its founding in 1780, the academy has elected leading thinkers from each generation, including George Washington and Benjamin Franklin in the 18th century, Maria Mitchell and Daniel Webster in the 19th century, and Toni Morrison and Albert Einstein in the 20th century. The current membership includes more than 250 Nobel and Pulitzer Prize winners.