New software enables blind and low-vision users to create interactive, accessible charts

by Markus Andrews 27 March 2024, 03:00

A growing number of tools enable users to make online data representations, like charts, that are accessible for people who are blind or have low vision. However, most tools require an existing visual chart that can then be converted into an accessible format.

This creates barriers that prevent blind and low-vision users from building their own custom data representations, and it can limit their ability to explore and analyze important information.

A team of researchers from MIT and University College London (UCL) wants to change the way people think about accessible data representations.

They created a software system called Umwelt (which means “environment” in German) that can enable blind and low-vision users to build customized, multimodal data representations without needing an initial visual chart.

Umwelt, an authoring environment designed for screen-reader users, incorporates an editor that allows someone to upload a dataset and create a customized representation, such as a scatterplot, that can include three modalities: visualization, textual description, and sonification. Sonification involves converting data into nonspeech audio.

The system, which can represent a variety of data types, includes a viewer that enables a blind or low-vision user to interactively explore a data representation, seamlessly switching between each modality to interact with data in a different way.

The researchers conducted a study with five expert screen-reader users who found Umwelt to be useful and easy to learn. In addition to offering an interface that empowered them to create data representations — something they said was sorely lacking — the users said Umwelt could facilitate communication between people who rely on different senses.

“We have to remember that blind and low-vision people aren’t isolated. They exist in these contexts where they want to talk to other people about data,” says Jonathan Zong, an electrical engineering and computer science (EECS) graduate student and lead author of a paper introducing Umwelt. “I am hopeful that Umwelt helps shift the way that researchers think about accessible data analysis. Enabling the full participation of blind and low-vision people in data analysis involves seeing visualization as just one piece of this bigger, multisensory puzzle.”

Joining Zong on the paper are fellow EECS graduate students Isabella Pedraza Pineros and Mengzhu “Katie” Chen; Daniel Hajas, a UCL researcher who works with the Global Disability Innovation Hub; and senior author Arvind Satyanarayan, associate professor of computer science at MIT who leads the Visualization Group in the Computer Science and Artificial Intelligence Laboratory. The paper will be presented at the ACM Conference on Human Factors in Computing Systems.

De-centering visualization

The researchers previously developed interactive interfaces that provide a richer experience for screen reader users as they explore accessible data representations. Through that work, they realized most tools for creating such representations involve converting existing visual charts.

Aiming to decenter visual representations in data analysis, Zong and Hajas, who lost his sight at age 16, began co-designing Umwelt more than a year ago.

At the outset, they realized they would need to rethink how to represent the same data using visual, auditory, and textual forms.

“We had to put a common denominator behind the three modalities. By creating this new language for representations, and making the output and input accessible, the whole is greater than the sum of its parts,” says Hajas.

To build Umwelt, they first considered what is unique about the way people use each sense.

For instance, a sighted user can see the overall pattern of a scatterplot and, at the same time, move their eyes to focus on different data points. But for someone listening to a sonification, the experience is linear since data are converted into tones that must be played back one at a time.

“If you are only thinking about directly translating visual features into nonvisual features, then you miss out on the unique strengths and weaknesses of each modality,” Zong adds.

They designed Umwelt to offer flexibility, enabling a user to switch between modalities easily when one would better suit their task at a given time.

To use the editor, one uploads a dataset to Umwelt, which employs heuristics to automatically create default representations in each modality.

If the dataset contains stock prices for companies, Umwelt might generate a multiseries line chart, a textual structure that groups data by ticker symbol and date, and a sonification that uses tone length to represent the price for each date, arranged by ticker symbol.
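
The stock-price example can be sketched in a few lines of Python. This is only an illustration of the idea, not Umwelt's actual code or heuristics: the sample rows, the function names `group_by_ticker` and `tone_lengths`, and the 0.1 to 1.0 second range are all assumptions made for the sketch.

```python
from collections import defaultdict

# Hypothetical sample data as (ticker, date, price); not taken from Umwelt.
ROWS = [
    ("AAA", "2024-01-02", 10.0),
    ("AAA", "2024-01-03", 20.0),
    ("BBB", "2024-01-02", 15.0),
    ("BBB", "2024-01-03", 30.0),
]


def group_by_ticker(rows):
    """Textual structure: ticker -> list of (date, price), sorted by date."""
    groups = defaultdict(list)
    for ticker, date, price in rows:
        groups[ticker].append((date, price))
    return {ticker: sorted(points) for ticker, points in sorted(groups.items())}


def tone_lengths(rows, min_s=0.1, max_s=1.0):
    """Sonification: map each price linearly to a tone length in seconds.

    Tones are ordered by ticker symbol, then by date, so they can be
    played back one at a time.
    """
    prices = [price for _, _, price in rows]
    lo, hi = min(prices), max(prices)
    span = (hi - lo) or 1.0  # avoid dividing by zero if all prices are equal
    tones = []
    for ticker, points in group_by_ticker(rows).items():
        for date, price in points:
            seconds = min_s + (price - lo) / span * (max_s - min_s)
            tones.append((ticker, date, round(seconds, 3)))
    return tones
```

Calling `tone_lengths(ROWS)` yields one tone per date for each ticker, with the lowest price mapped to the shortest tone and the highest price to the longest. Both functions read from the same rows, which is one simple way to keep the textual and audio forms in step.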

The default heuristics are intended to help the user get started.

“In any kind of creative tool, you have a blank-slate effect where it is hard to know how to begin. That is compounded in a multimodal tool because you have to specify things in three different representations,” Zong says.

The editor links interactions across modalities, so if a user changes the textual description, that information is adjusted in the corresponding sonification. Someone could utilize the editor to build a multimodal representation, switch to the viewer for an initial exploration, then return to the editor to make adjustments.

Helping users communicate about data

To test Umwelt, they created a diverse set of multimodal representations, from scatterplots to multiview charts, to ensure the system could effectively represent different data types. Then they put the tool in the hands of five expert screen reader users.

Study participants mostly found Umwelt to be useful for creating, exploring, and discussing data representations. One user said Umwelt was like an “enabler” that decreased the time it took them to analyze data. The users agreed that Umwelt could help them communicate about data more easily with sighted colleagues.

“What stands out about Umwelt is its core philosophy of de-emphasizing the visual in favor of a balanced, multisensory data experience. Often, nonvisual data representations are relegated to the status of secondary considerations, mere add-ons to their visual counterparts. However, visualization is merely one aspect of data representation. I appreciate their efforts in shifting this perception and embracing a more inclusive approach to data science,” says JooYoung Seo, an assistant professor in the School of Information Sciences at the University of Illinois at Urbana-Champaign, who was not involved with this work.

Moving forward, the researchers plan to create an open-source version of Umwelt that others can build upon. They also want to integrate tactile sensing into the software system as an additional modality, enabling the use of tools like refreshable tactile graphics displays.

“In addition to its impact on end users, I am hoping that Umwelt can be a platform for asking scientific questions around how people use and perceive multimodal representations, and how we can improve the design beyond this initial step,” says Zong.

This work was supported, in part, by the National Science Foundation and the MIT Morningside Academy for Design Fellowship.

More stories

in Data Management & Statistics
“We offer another place for knowledge”
by Markus Andrews 26 February 2024, 18:35
In the Dzaleka Refugee Camp in Malawi, Jospin Hassan didn’t have access to the education opportunities he sought. So, he decided to create his own.
Hassan knew the booming fields of data science and artificial intelligence could bring job opportunities to his community and help solve local challenges. After earning a spot in the 2020-21 cohort of the Certificate Program in Computer and Data Science from MIT Refugee Action Hub (ReACT), Hassan started sharing MIT knowledge and skills with other motivated learners in Dzaleka.
MIT ReACT is now Emerging Talent, part of the Jameel World Education Lab (J-WEL) at MIT Open Learning. Currently serving its fifth cohort of global learners, Emerging Talent’s year-long certificate program incorporates high-quality computer science and data analysis coursework from MITx, professional skill building, experiential learning, apprenticeship work, and opportunities for networking with MIT’s global community of innovators. Hassan’s cohort honed their leadership skills through interactive online workshops with J-WEL and the 10-week online MIT Innovation Leadership Bootcamp.
“My biggest takeaway was networking, collaboration, and learning from each other,” Hassan says.
Today, Hassan’s organization ADAI Circle offers mentorship and education programs for youth and other job seekers in the Dzaleka Refugee Camp. The curriculum encourages hands-on learning and collaboration.
Launched in 2020, ADAI Circle aims to foster job creation and reduce poverty in Malawi through technology and innovation. In addition to their classes in data science, AI, software development, and hardware design, their Innovation Hub offers internet access to anyone in need.
Doing something different in the community
Hassan first had the idea for his organization in 2018 when he reached a barrier in his own education journey. There were several programs in the Dzaleka Refugee Camp teaching learners how to code websites and mobile apps, but Hassan felt that they were limited in scope.
“We had good devices and internet access,” he says, “but I wanted to learn something new.”
Teaming up with co-founder Patrick Byamasu, Hassan set his sights on the longevity of AI and how that might create more jobs for people in their community. “The world is changing every day, and data scientists are in a higher demand today in various companies,” Hassan says. “For this reason, I decided to expand and share the knowledge that I acquired with my fellow refugees and the surrounding villages.”
ADAI Circle draws inspiration from Hassan’s own experience with MIT Emerging Talent coursework, community, and training opportunities. For example, the MIT Bootcamps model is now standard practice for ADAI Circle’s annual hackathon. Hassan first introduced the hackathon to ADAI Circle students as part of his final experiential learning project of the Emerging Talent certificate program.
ADAI Circle’s annual hackathon is now an interactive — and effective — way to select students who will most benefit from its programs. The local schools’ curricula, Hassan says, might not provide enough of an academic challenge. “We can’t teach everyone and accommodate everyone because there are a lot of schools,” Hassan says, “but we offer another place for knowledge.”
The hackathon helps students develop data science and robotics skills. Before they start coding, students have to convince ADAI Circle teachers that their designs are viable, answering questions like, “What problem are you solving?” and “How will this help the community?” A community-oriented mindset is just as important to the curriculum.
In addition to the practical skills Hassan gained from Emerging Talent, he leveraged the program’s network to help his community. Thanks to a social media connection Hassan made with the nongovernmental organization Give Internet after one of Emerging Talent’s virtual events, Give Internet brought internet access to ADAI Circle.
Bridging the AI gap to unmet communities
In 2023, ADAI Circle connected with another MIT Open Learning program, Responsible AI for Social Empowerment and Education (RAISE), which led to a pilot test of a project-based AI curriculum for middle school students. The Responsible AI for Computational Action (RAICA) curriculum equipped ADAI Circle students with AI skills for chatbots and natural language processing.
“I liked that program because it was based on what we’re teaching at the center,” Hassan says, speaking of his organization’s mission of bridging the AI gap to reach unmet communities.
The RAICA curriculum was designed by education experts at MIT Scheller Teacher Education Program (STEP Lab) and AI experts from MIT Personal Robots group and MIT App Inventor. ADAI Circle teachers gave detailed feedback about the pilot to the RAICA team. During weekly meetings with Glenda Stump, education research scientist for RAICA and J-WEL, and Angela Daniel, teacher development specialist for RAICA, the teachers discussed their experiences, prepared for upcoming lessons, and translated the learning materials in real time.
“We are trying to create a curriculum that’s accessible worldwide and to students who typically have little or no access to technology,” says Mary Cate Gustafson-Quiett, curriculum design manager at STEP Lab and project manager for RAICA. “Working with ADAI and students in a refugee camp challenged us to design in more culturally and technologically inclusive ways.”
Gustafson-Quiett says the curriculum feedback from ADAI Circle helped inform how RAICA delivers teacher development resources to accommodate learning environments with limited internet access. “They also exposed places where our team’s western ideals, specifically around individualism, crept into activities in the lesson and contrasted with their more communal cultural beliefs,” she says.
Eager to introduce more MIT-developed AI resources, Hassan also shared MIT RAISE’s Day of AI curricula with ADAI Circle teachers. The new ChatGPT module gave students the chance to level up the chatbot programming skills that they gained from the RAICA module. Some of the advanced students are taking the initiative to use the ChatGPT API to create their own projects in education.
“We don’t want to tell them what to do, we want them to come up with their own ideas,” Hassan says.
Although ADAI Circle faces many challenges, Hassan says his team is addressing them one by one. Last year, they didn’t have electricity in their Innovation Hub, but they solved that. This year, they achieved a stable internet connection that’s one of the fastest in Malawi. Next up, they are hoping to secure more devices for their students, create more jobs, and add additional hubs throughout the community. The work is never done, but Hassan is starting to see the impact that ADAI Circle is making.
“For those who want to learn data science, let’s let them learn,” Hassan says.
in Data Management & Statistics
Generating opportunities with generative AI
by Markus Andrews 2 November 2023, 15:15
Talking with retail executives back in 2010, Rama Ramakrishnan came to two realizations. First, although retail systems that offered customers personalized recommendations were getting a great deal of attention, these systems often provided little payoff for retailers. Second, for many of the firms, most customers shopped only once or twice a year, so companies didn’t really know much about them.
“But by being very diligent about noting down the interactions a customer has with a retailer or an e-commerce site, we can create a very nice and detailed composite picture of what that person does and what they care about,” says Ramakrishnan, professor of the practice at the MIT Sloan School of Management. “Once you have that, then you can apply proven algorithms from machine learning.”
These realizations led Ramakrishnan to found CQuotient, a startup whose software has now become the foundation for Salesforce’s widely adopted AI e-commerce platform. “On Black Friday alone, CQuotient technology probably sees and interacts with over a billion shoppers on a single day,” he says.
After a highly successful entrepreneurial career, in 2019 Ramakrishnan returned to MIT Sloan, where he had earned master’s and PhD degrees in operations research in the 1990s. He teaches students “not just how these amazing technologies work, but also how do you take these technologies and actually put them to use pragmatically in the real world,” he says.
Additionally, Ramakrishnan enjoys participating in MIT executive education. “This is a great opportunity for me to convey the things that I have learned, but also as importantly, to learn what’s on the minds of these senior executives, and to guide them and nudge them in the right direction,” he says.
For example, executives are understandably concerned about the need for massive amounts of data to train machine learning systems. He can now guide them to a wealth of models that are pre-trained for specific tasks. “The ability to use these pre-trained AI models, and very quickly adapt them to your particular business problem, is an incredible advance,” says Ramakrishnan.
Video: “Rama Ramakrishnan – Utilizing AI in Real World Applications for Intelligent Work” (MIT Industrial Liaison Program)
Understanding AI categories
“AI is the quest to imbue computers with the ability to do cognitive tasks that typically only humans can do,” he says. Understanding the history of this complex, supercharged landscape aids in exploiting the technologies.
The traditional approach to AI, which basically solved problems by applying if/then rules learned from humans, proved useful for relatively few tasks. “One reason is that we can do lots of things effortlessly, but if asked to explain how we do them, we can’t actually articulate how we do them,” Ramakrishnan comments. Also, those systems may be baffled by new situations that don’t match up to the rules enshrined in the software.
Machine learning takes a dramatically different approach, with the software fundamentally learning by example. “You give it lots of examples of inputs and outputs, questions and answers, tasks and responses, and get the computer to automatically learn how to go from the input to the output,” he says. Credit scoring, loan decision-making, disease prediction, and demand forecasting are among the many tasks conquered by machine learning.
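Learning by example can be shown with a deliberately small Python sketch: a least-squares line fitted to input/output pairs instead of hand-written if/then rules. The demand-forecasting numbers below are invented for illustration, and `fit_line` and `predict` are hypothetical names; real systems use far richer models and data.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept from example pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical examples: promotions run in a week (input), units sold (output).
promotions = [0, 1, 2, 3, 4]
units_sold = [10, 12, 14, 16, 18]

# "Training": the relationship is learned from the examples, not coded by hand.
slope, intercept = fit_line(promotions, units_sold)


def predict(x):
    """Apply the learned relationship to a new, unseen input."""
    return slope * x + intercept
```

No rule in the code says how promotions affect sales; the slope and intercept come entirely from the examples, and `predict(5)` extrapolates to a week the model has never seen.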
But machine learning only worked well when the input data was structured, for instance in a spreadsheet. “If the input data was unstructured, such as images, video, audio, ECGs, or X-rays, it wasn’t very good at going from that to a predicted output,” Ramakrishnan says. That means humans had to manually structure the unstructured data to train the system.
Around 2010, deep learning began to overcome that limitation, delivering the ability to directly work with unstructured input data, he says. Based on a longstanding AI strategy known as neural networks, deep learning became practical due to the global flood tide of data, the availability of extraordinarily powerful parallel processing hardware called graphics processing units (originally invented for video games), and advances in algorithms and math.
Finally, within deep learning, the generative AI software packages appearing last year can create unstructured outputs, such as human-sounding text, images of dogs, and three-dimensional models. Large language models (LLMs) such as OpenAI’s ChatGPT go from text inputs to text outputs, while text-to-image models such as OpenAI’s DALL-E can churn out realistic-appearing images.
Video: “Rama Ramakrishnan – Making Note of Little Data to Improve Customer Service” (MIT Industrial Liaison Program)
What generative AI can (and can’t) do
Trained on the unimaginably vast text resources of the internet, an LLM’s “fundamental capability is to predict the next most likely, most plausible word,” Ramakrishnan says. “Then it attaches the word to the original sentence, predicts the next word again, and keeps on doing it.”
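The predict-append-repeat loop described above can be illustrated with a toy Python sketch. It is not how an LLM is built: it swaps the neural network for simple word-pair counts over a made-up corpus, and `build_bigram_counts` and `generate` are hypothetical names. Only the shape of the loop is the point.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real LLM is a neural network trained on vastly more text.
CORPUS = "the cat sat on the mat . the cat ate the fish ."


def build_bigram_counts(text):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    return counts


def generate(counts, prompt, max_words=5):
    """Repeatedly predict the most likely next word and append it."""
    words = prompt.split()
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word never appeared in the corpus with a follower
        next_word = followers.most_common(1)[0][0]
        words.append(next_word)  # attach the word, then predict again
    return " ".join(words)
```

With this corpus, `generate(build_bigram_counts(CORPUS), "on", max_words=2)` returns `"on the cat"`: "the" is the only word seen after "on", and "cat" is the word seen most often after "the". The output is plausible given the counts, which is not the same as being correct.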
“To the surprise of many, including a lot of researchers, an LLM can do some very complicated things,” he says. “It can compose beautifully coherent poetry, write Seinfeld episodes, and solve some kinds of reasoning problems. It’s really quite remarkable how next-word prediction can lead to these amazing capabilities.”
“But you have to always keep in mind that what it is doing is not so much finding the correct answer to your question as finding a plausible answer to your question,” Ramakrishnan emphasizes. Its content may be factually inaccurate, irrelevant, toxic, biased, or offensive.
That puts the burden on users to make sure that the output is correct, relevant, and useful for the task at hand. “You have to make sure there is some way for you to check its output for errors and fix them before it goes out,” he says.
Intense research is underway to find techniques to address these shortcomings, adds Ramakrishnan, who expects many innovative tools to do so.
Finding the right corporate roles for LLMs
Given the astonishing progress in LLMs, how should industry think about applying the software to tasks such as generating content?
First, Ramakrishnan advises, consider costs: “Is it a much less expensive effort to have a draft that you correct, versus you creating the whole thing?” Second, if the LLM makes a mistake that slips by, and the mistaken content is released to the outside world, can you live with the consequences?
“If you have an application which satisfies both considerations, then it’s good to do a pilot project to see whether these technologies can actually help you with that particular task,” says Ramakrishnan. He stresses the need to treat the pilot as an experiment rather than as a normal IT project.
Right now, software development is the most mature corporate LLM application. “ChatGPT and other LLMs are text-in, text-out, and a software program is just text-out,” he says. “Programmers can go from English text-in to Python text-out, as well as you can go from English-to-English or English-to-German. There are lots of tools which help you write code using these technologies.”
Of course, programmers must make sure the result does the job properly. Fortunately, software development already offers infrastructure for testing and verifying code. “This is a beautiful sweet spot,” he says, “where it’s much cheaper to have the technology write code for you, because you can very quickly check and verify it.”
Another major LLM use is content generation, such as writing marketing copy or e-commerce product descriptions. “Again, it may be much cheaper to fix ChatGPT’s draft than for you to write the whole thing,” Ramakrishnan says. “However, companies must be very careful to make sure there is a human in the loop.”
LLMs also are spreading quickly as in-house tools to search enterprise documents. Unlike conventional search algorithms, an LLM chatbot can offer a conversational search experience, because it remembers each question you ask. “But again, it will occasionally make things up,” he says. “In terms of chatbots for external customers, these are very early days, because of the risk of saying something wrong to the customer.”
Overall, Ramakrishnan notes, we’re living in a remarkable time to grapple with AI’s rapidly evolving potentials and pitfalls. “I help companies figure out how to take these very transformative technologies and put them to work, to make products and services much more intelligent, employees much more productive, and processes much more efficient,” he says.
in Data Management & Statistics
Improving accessibility of online graphics for blind users
by Markus Andrews 2 October 2023, 18:20
The beauty of a nice infographic published alongside a news or magazine story is that it makes numeric data more accessible to the average reader. But for blind and visually impaired users, such graphics often have the opposite effect.
For visually impaired users — who frequently rely on screen-reading software that speaks words or numbers aloud as the user moves a cursor across the screen — a graphic may be nothing more than a few words of alt text, such as a chart’s title. For instance, a map of the United States displaying population rates by county might have alt text in the HTML that says simply, “A map of the United States with population rates by county.” The data has been buried in an image, making it entirely inaccessible.
“Charts have these various visual features that, as a [sighted] reader, you can shift your attention around, look at high-level patterns, look at individual data points, and you can do this on the fly,” says Jonathan Zong, a 2022 MIT Morningside Academy for Design (MAD) Fellow and PhD student in computer science, who points out that even when a graphic includes alt text that interprets the data, the visually impaired user must accept the findings as presented.
“If you’re [blind and] using a screen reader, the text description imposes a linear predefined reading order. So, you’re beholden to the decisions that the person who wrote the text made about what information was important to include.”
While some graphics do include data tables that a screen reader can read, it requires the user to remember all the data from each row and column as they move on to the next one. According to the National Federation of the Blind, Zong says, there are 7 million people living in the United States with visual disabilities, and nearly 97 percent of top-level pages on the internet are not accessible to screen readers. The problem, he points out, is an especially difficult one for blind researchers to get around. Some researchers with visual impairments rely on a sighted collaborator to read and help interpret graphics in peer-reviewed research.
Working with the Visualization Group at the Computer Science and Artificial Intelligence Lab (CSAIL) on a project led by Associate Professor Arvind Satyanarayan that includes Daniel Hajas, a blind researcher and innovation manager at the Global Disability Innovation Hub in England, Zong and others have written an open-source JavaScript software program named Olli that solves this problem when it’s included on a website. Olli is able to go from big-picture analysis of a chart to the finest grain of detail to give the user the ability to select the degree of granularity that interests them.
“We want to design richer screen-reader experiences for visualization with a hierarchical structure, multiple ways to navigate, and descriptions at varying levels of granularity to provide self-guided, open-ended exploration for the user.”
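A hierarchical structure with descriptions at several levels of granularity can be sketched as a small tree. The Python below is an assumption-laden illustration, not Olli's API (Olli itself is a JavaScript library): the three levels, the equal-width x-axis intervals, and the names `describe` and `render` are all invented for this example.

```python
def describe(points, bins=2):
    """Build a three-level description tree for a list of (x, y) points.

    Level 0: one-sentence overview. Level 1: x-axis intervals.
    Level 2: the individual data points inside each interval.
    """
    points = sorted(points)
    lo, hi = points[0][0], points[-1][0]
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all x values match
    children = []
    for i in range(bins):
        last = i == bins - 1
        start = lo + i * width
        end = hi if last else lo + (i + 1) * width
        # The final interval is closed on the right so the maximum x is included.
        inside = [(x, y) for x, y in points
                  if start <= x and (x <= end if last else x < end)]
        children.append({
            "text": f"x from {start:g} to {end:g}: {len(inside)} points",
            "children": [{"text": f"x={x:g}, y={y:g}", "children": []}
                         for x, y in inside],
        })
    overview = f"Scatterplot with {len(points)} points, x from {lo:g} to {hi:g}"
    return {"text": overview, "children": children}


def render(node, max_depth, depth=0):
    """Return the description lines down to the level of detail requested."""
    lines = ["  " * depth + node["text"]]
    if depth < max_depth:
        for child in node["children"]:
            lines.extend(render(child, max_depth, depth + 1))
    return lines
```

A reader who wants only the big picture would ask for `render(tree, 0)`, while `render(tree, 2)` descends to every data point. In an interactive tool the user would move through such a tree with the keyboard rather than print it, choosing how deep to go at each step.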
Next steps with Olli are incorporating multi-sensory software to integrate text and visuals with sound, such as having a musical note that moves up or down the harmonic scale to indicate the direction of data on a linear graph, and possibly even developing tactile interpretations of data. Like most of the MAD Fellows, Zong integrates his science and engineering skills with design and art to create solutions to real-world problems affecting individuals. He’s been recognized for his work in both the visual arts and computer science. He holds undergraduate degrees in computer science and visual arts with a focus on graphic design from Princeton University, where his research was on the ethics of data collection.
“The throughline is the idea that design can help us make progress on really tough social and ethical questions,” Zong says, calling software for accessible data visualization an “intellectually rich area for design.” “We’re thinking about ways to translate charts and graphs into text descriptions that can get read aloud as speech, or thinking about other kinds of audio mappings to sonify data, and we’re even exploring some tactile methods to understand data,” he says.
“I get really excited about design when it’s a way to both create things that are useful to people in everyday life and also make progress on larger conversations about technology and society. I think working in accessibility is a great way to do that.”
Another problem at the intersection of technology and society is the ethics of taking user data from social media for large-scale studies without the users’ awareness. While working as a summer graduate research fellow at Cornell’s Citizens and Technology Lab, Zong helped create open-source software called Bartleby that can be used in large anonymous data research studies. After researchers collect data, but before analysis, Bartleby would automatically send an email message to every user whose data was included, alert them to that fact and offer them the choice to review the resulting data table and opt out of the study. Bartleby was honored in the student category of Fast Company’s Innovation by Design Awards for 2022. In November the same year, Forbes magazine named Jonathan Zong in its Forbes 30 Under 30 in Science 2023 list for his work in data visualization accessibility.
The underlying theme to all Zong’s work is the exploration of autonomy and agency, even in his artwork, which is heavily inclusive of text and semiotic play. In “Public Display,” he created a handmade digital display font by erasing parts of celebrity faces that were taken from a facial recognition dataset. The piece was exhibited in 2020 in MIT’s Wiesner Gallery, and received the third-place prize in the MIT Schnitzer Prize in the Visual Arts that year. The work deals not only with the neurological aspects of distinguishing faces from typefaces, but also with the implications for erasing individuals’ identities through the practice of using facial recognition programs that often target individuals in communities of color in unfair ways. Another of his works, “Biometric Sans,” a typography system that stretches letters based on a person’s typing speed, will be included in a show at the Harvard Science Center sometime next fall.
“MAD, particularly the large events MAD jointly hosted, played a really important function in showing the rest of MIT that this is the kind of work we value. This is what design can look like and is capable of doing. I think it all contributes to that culture shift where this kind of interdisciplinary work can be valued, recognized, and serve the public.
“There are shared ideas around embodiment and representation that tie these different pursuits together for me,” Zong says. “In the ethics work, and the art on surveillance, I’m thinking about whether data collectors are representing people the way they want to be seen through data. And similarly, the accessibility work is about whether we can make systems that are flexible to the way people want to use them.”
in Data Management & Statistics
Fast-tracking fusion energy’s arrival with AI and accessibility
by Markus Andrews 1 September 2023, 15:30
As the impacts of climate change continue to grow, so does interest in fusion’s potential as a clean energy source. While fusion reactions have been studied in laboratories since the 1930s, there are still many critical questions scientists must answer to make fusion power a reality, and time is of the essence. As part of their strategy to accelerate fusion energy’s arrival and reach carbon neutrality by 2050, the U.S. Department of Energy (DoE) has announced new funding for a project led by researchers at MIT’s Plasma Science and Fusion Center (PSFC) and four collaborating institutions.
Cristina Rea, a research scientist and group leader at the PSFC, will serve as the primary investigator for the newly funded three-year collaboration to pilot the integration of fusion data into a system that can be read by AI-powered tools. The PSFC, together with scientists from William & Mary, the University of Wisconsin at Madison, Auburn University, and the nonprofit HDF Group, plans to create a holistic fusion data platform, the elements of which could offer unprecedented access for researchers, especially underrepresented students. The project aims to encourage diverse participation in fusion and data science, both in academia and the workforce, through outreach programs led by the group’s co-investigators, of whom four out of five are women.
The DoE’s award, part of a $29 million funding package for seven projects across 19 institutions, will support the group’s efforts to distribute data produced by fusion devices like the PSFC’s Alcator C-Mod, a donut-shaped “tokamak” that utilized powerful magnets to control and confine fusion reactions. Alcator C-Mod operated from 1991 to 2016 and its data are still being studied, thanks in part to the PSFC’s commitment to the free exchange of knowledge.
Currently, there are nearly 50 public experimental magnetic confinement-type fusion devices; however, both historical and current data from these devices can be difficult to access. Some fusion databases require signing user agreements, and not all data are catalogued and organized the same way. Moreover, it can be difficult to leverage machine learning, a class of AI tools, for data analysis and to enable scientific discovery without time-consuming data reorganization. The result is fewer scientists working on fusion, greater barriers to discovery, and a bottleneck in harnessing AI to accelerate progress.
The project’s proposed data platform addresses technical barriers by being FAIR (Findable, Accessible, Interoperable, Reusable) and by adhering to UNESCO’s Open Science (OS) recommendations to improve the transparency and inclusivity of science; all of the researchers’ deliverables will adhere to FAIR and OS principles, as required by the DoE. The platform’s databases will be built using MDSplusML, an upgraded version of the MDSplus open-source software developed by PSFC researchers in the 1980s to catalogue the results of Alcator C-Mod’s experiments. Today, nearly 40 fusion research institutes use MDSplus to store and provide external access to their fusion data. The release of MDSplusML aims to continue that legacy of open collaboration.
The researchers intend to address barriers to participation for women and disadvantaged groups not only by improving general access to fusion data, but also through a subsidized summer school that will focus on topics at the intersection of fusion and machine learning, which will be held at William & Mary for the next three years.
Of the importance of their research, Rea says, “This project is about responding to the fusion community’s needs and setting ourselves up for success. Scientific advancements in fusion are enabled via multidisciplinary collaboration and cross-pollination, so accessibility is absolutely essential. I think we all understand now that diverse communities have more diverse ideas, and they allow faster problem-solving.”
The collaboration’s work also aligns with vital areas of research identified in the International Atomic Energy Agency’s “AI for Fusion” Coordinated Research Project (CRP). Rea was selected as the technical coordinator for the IAEA’s CRP emphasizing community engagement and knowledge access to accelerate fusion research and development. In a letter of support written for the group’s proposed project, the IAEA stated that, “the work [the researchers] will carry out […] will be beneficial not only to our CRP but also to the international fusion community in large.”
PSFC Director and Hitachi America Professor of Engineering Dennis Whyte adds, “I am thrilled to see PSFC and our collaborators be at the forefront of applying new AI tools while simultaneously encouraging and enabling extraction of critical data from our experiments.”
“Having the opportunity to lead such an important project is extremely meaningful, and I feel a responsibility to show that women are leaders in STEM,” says Rea. “We have an incredible team, strongly motivated to improve our fusion ecosystem and to contribute to making fusion energy a reality.”
in Data Management & Statistics
System tracks movement of food through global humanitarian supply chain
by Markus Andrews 17 July 2023, 19:40
Although more than enough food is produced to feed everyone in the world, as many as 828 million people face hunger today. Poverty, social inequity, climate change, natural disasters, and political conflicts all contribute to inhibiting access to food. For decades, the U.S. Agency for International Development (USAID) Bureau for Humanitarian Assistance (BHA) has been a leader in global food assistance, supplying millions of metric tons of food to recipients worldwide. Alleviating hunger — and the conflict and instability hunger causes — is critical to U.S. national security.
But BHA is only one player within a large, complex supply chain in which food gets handed off between more than 100 partner organizations before reaching its final destination. Traditionally, the movement of food through the supply chain has been a black-box operation, with stakeholders largely out of the loop about what happens to the food once it leaves their custody. This lack of direct visibility into operations is due to siloed data repositories, insufficient data sharing among stakeholders, and different data formats that operators must manually sort through and standardize. As a result, accurate, real-time information — such as where food shipments are at any given time, which shipments are affected by delays or food recalls, and when shipments have arrived at their final destination — is lacking. A centralized system capable of tracing food along its entire journey, from manufacture through delivery, would enable a more effective humanitarian response to food-aid needs.
In 2020, a team from MIT Lincoln Laboratory began engaging with BHA to create an intelligent dashboard for their supply-chain operations. This dashboard brings together the expansive food-aid datasets from BHA’s existing systems into a single platform, with tools for visualizing and analyzing the data. When the team started developing the dashboard, they quickly realized the need for considerably more data than BHA had access to.
“That’s where traceability comes in, with each handoff partner contributing key pieces of information as food moves through the supply chain,” explains Megan Richardson, a researcher in the laboratory’s Humanitarian Assistance and Disaster Relief Systems Group.
Richardson and the rest of the team have been working with BHA and their partners to scope, build, and implement such an end-to-end traceability system. This system consists of serialized, unique identifiers (IDs) — akin to fingerprints — that are assigned to individual food items at the time they are produced. These individual IDs remain linked to items as they are aggregated along the supply chain, first domestically and then internationally. For example, individually tagged cans of vegetable oil get packaged into cartons; cartons are placed onto pallets and transported via railway and truck to warehouses; pallets are loaded onto shipping containers at U.S. ports; and pallets are unloaded and cartons are unpackaged overseas.
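The aggregation described above (cans into cartons, cartons onto pallets, pallets into containers) can be pictured as a chain of parent links between uniquely identified units. The following is only a minimal sketch under stated assumptions: the `Unit` and `Registry` names, the use of random UUIDs as identifiers, and the in-memory storage are all hypothetical and are not taken from the laboratory's system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import uuid


@dataclass
class Unit:
    """One traceable unit: a can, carton, pallet, or container (hypothetical model)."""
    kind: str
    # Assumption: a random UUID stands in for the serialized, unique ID.
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)
    parent: Optional[str] = None  # uid of the enclosing unit, if any


class Registry:
    """In-memory lookup of units by ID (illustrative only)."""

    def __init__(self) -> None:
        self.units: Dict[str, Unit] = {}

    def create(self, kind: str) -> Unit:
        unit = Unit(kind)
        self.units[unit.uid] = unit
        return unit

    def pack(self, child: Unit, parent: Unit) -> None:
        """Record that `child` is now inside `parent`; the child's ID is unchanged."""
        child.parent = parent.uid

    def unpack(self, child: Unit) -> None:
        """Record that `child` has been removed from its enclosing unit."""
        child.parent = None

    def chain(self, unit: Unit) -> List[str]:
        """Return the kinds of units enclosing `unit`, innermost first."""
        kinds = [unit.kind]
        current = unit
        while current.parent is not None:
            current = self.units[current.parent]
            kinds.append(current.kind)
        return kinds


registry = Registry()
can = registry.create("can")
carton = registry.create("carton")
pallet = registry.create("pallet")
container = registry.create("container")

registry.pack(can, carton)
registry.pack(carton, pallet)
registry.pack(pallet, container)

print(registry.chain(can))  # ['can', 'carton', 'pallet', 'container']
```

Because each unit keeps its own ID and only the parent link changes, unloading a pallet overseas or unpacking a carton does not lose the item's identity.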
With a trace
Today, visibility at the single-item level doesn’t exist. Most suppliers mark pallets with a lot number (a lot is a batch of items produced in the same run), but this is for internal purposes (i.e., to track issues stemming back to their production supply, like over-enriched ingredients or machinery malfunction), not data sharing. So, organizations know which supplier lot a pallet and carton are associated with, but they can’t track the unique history of an individual carton or item within that pallet. As the lots move further downstream toward their final destination, they are often mixed with lots from other productions, and possibly other commodity types altogether, because of space constraints. On the international side, such mixing and the lack of granularity make it difficult to quickly pull commodities out of the supply chain if food safety concerns arise. Current response times can span several months.
“Commodities are grouped differently at different stages of the supply chain, so it is logical to track them in those groupings where needed,” Richardson says. “Our item-level granularity serves as a form of Rosetta Stone to enable stakeholders to efficiently communicate throughout these stages. We’re trying to enable a way to track not only the movement of commodities, including through their lot information, but also any problems arising independent of lot, like exposure to high humidity levels in a warehouse. Right now, we have no way to associate commodities with histories that may have resulted in an issue.”
“You can now track your checked luggage across the world and the fish on your dinner plate,” adds Brice MacLaren, also a researcher in the laboratory’s Humanitarian Assistance and Disaster Relief Systems Group. “So, this technology isn’t new, but it’s new to BHA as they evolve their methodology for commodity tracing. The traceability system needs to be versatile, working across a wide variety of operators who take custody of the commodity along the supply chain and fitting into their existing best practices.”
As food products make their way through the supply chain, operators at each receiving point would be able to scan these IDs via a Lincoln Laboratory-developed mobile application (app) to indicate a product’s current location and transaction status — for example, that it is en route on a particular shipping container or stored in a certain warehouse. This information would get uploaded to a secure traceability server. By scanning a product, operators would also see its history up until that point.
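One way to picture the scan-and-upload flow is as an append-only log of scan events keyed by item ID, where each new scan returns the item's history so far. This is a hedged sketch, not the laboratory's app or server: `ScanEvent`, `TraceabilityLog`, the field names, and the sample locations are hypothetical, and a plain dictionary stands in for the secure traceability server.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ScanEvent:
    """A single scan of one item (hypothetical record layout)."""
    item_id: str
    location: str
    status: str
    scanned_at: datetime


class TraceabilityLog:
    """In-memory stand-in for a traceability server (illustrative only)."""

    def __init__(self) -> None:
        self._events: Dict[str, List[ScanEvent]] = {}

    def record_scan(
        self,
        item_id: str,
        location: str,
        status: str,
        scanned_at: Optional[datetime] = None,
    ) -> List[ScanEvent]:
        """Store a scan and return the item's history up to and including it."""
        when = scanned_at or datetime.now(timezone.utc)
        event = ScanEvent(item_id, location, status, when)
        self._events.setdefault(item_id, []).append(event)
        return self.history(item_id)

    def history(self, item_id: str) -> List[ScanEvent]:
        """Return all scans for an item, oldest first."""
        return sorted(self._events.get(item_id, []), key=lambda e: e.scanned_at)


log = TraceabilityLog()
log.record_scan("carton-001", "Houston warehouse", "stored",
                datetime(2023, 5, 1, tzinfo=timezone.utc))
seen = log.record_scan("carton-001", "shipping container", "en route",
                       datetime(2023, 5, 9, tzinfo=timezone.utc))
for event in seen:
    print(event.scanned_at.date(), event.location, event.status)
```

Sorting by timestamp in `history` means scans uploaded late, for example after a connectivity gap, still appear in the order they were taken.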
Hitting the mark
At the laboratory, the team tested the feasibility of their traceability technology, exploring different ways to mark and scan items. In their testing, they considered barcodes and radio-frequency identification (RFID) tags and handheld and fixed scanners. Their analysis revealed 2D barcodes (specifically data matrices) and smartphone-based scanners were the most feasible options in terms of how the technology works and how it fits into existing operations and infrastructure.
“We needed to come up with a solution that would be practical and sustainable in the field,” MacLaren says. “While scanners can automatically read any RFID tags in close proximity as someone is walking by, they can’t discriminate exactly where the tags are coming from. RFID is expensive, and it’s hard to read commodities in bulk. On the other hand, a phone can scan a barcode on a particular box and tell you that code goes with that box. The challenge then becomes figuring out how to present the codes for people to easily scan without significantly interrupting their usual processes for handling and moving commodities.”
As the team learned from partner representatives in Kenya and Djibouti, offloading at the ports is a chaotic, fast operation. At manual warehouses, porters fling bags over their shoulders or stack cartons atop their heads any which way they can and run them to a drop point; at bagging terminals, commodities come down a conveyor belt and land this way or that way. With this variability comes several questions: How many barcodes do you need on an item? Where should they be placed? What size should they be? What will they cost? The laboratory team is considering these questions, keeping in mind that the answers will vary depending on the type of commodity; vegetable oil cartons will have different specifications than, say, 50-kilogram bags of wheat or peas.
Leaving a mark
Leveraging results from their testing and insights from international partners, the team has been running a traceability pilot evaluating how their proposed system meshes with real-world domestic and international operations. The current pilot features a domestic component in Houston, Texas, and an international component in Ethiopia, and focuses on tracking individual cartons of vegetable oil and identifying damaged cans. The Ethiopian team with Catholic Relief Services recently received a container filled with pallets of uniquely barcoded cartons of vegetable oil cans (in the next pilot, the cans will be barcoded, too). They are now scanning items and collecting data on product damage by using smartphones with the laboratory-developed mobile traceability app on which they were trained.
“The partners in Ethiopia are comparing a couple lid types to determine whether some are more resilient than others,” Richardson says. “With the app — which is designed to scan commodities, collect transaction data, and keep history — the partners can take pictures of damaged cans and see if a trend with the lid type emerges.”
Next, the team will run a series of pilots with the World Food Programme (WFP), the world’s largest humanitarian organization. The first pilot will focus on data connectivity and interoperability, and the team will engage with suppliers to directly print barcodes on individual commodities instead of applying barcode labels to packaging, as they did in the initial feasibility testing. The WFP will provide input on which of their operations are best suited for testing the traceability system, considering factors like the network bandwidth of WFP staff and local partners, the commodity types being distributed, and the country context for scanning. The BHA will likely also prioritize locations for system testing.
“Our goal is to provide an infrastructure to enable as close to real-time data exchange as possible between all parties, given intermittent power and connectivity in these environments,” MacLaren says.
In subsequent pilots, the team will try to integrate their approach with existing systems that partners rely on for tracking procurements, inventory, and movement of commodities under their custody so that this information is automatically pushed to the traceability server. The team also hopes to add a capability for real-time alerting of statuses, like the departure and arrival of commodities at a port or the exposure of unclaimed commodities to the elements. Real-time alerts would enable stakeholders to more efficiently respond to food-safety events. Currently, partners are forced to take a conservative approach, pulling out more commodities from the supply chain than are actually suspect, to reduce risk of harm. Both BHA and WFP are interested in testing out a food-safety event during one of the pilots to see how the traceability system works in enabling rapid communication response.
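The difference between the conservative, lot-level approach and an item-level response can be illustrated with a toy comparison. Everything in this sketch is made up for illustration: the function names, the carton and lot identifiers, and the warehouse names are hypothetical, and a real system would work from recorded scan histories rather than hard-coded dictionaries.

```python
from typing import Dict, List, Set


def suspect_by_lot(item_lots: Dict[str, str], bad_lot: str) -> Set[str]:
    """Conservative approach: pull every item that shares the affected lot."""
    return {item for item, lot in item_lots.items() if lot == bad_lot}


def suspect_by_history(item_locations: Dict[str, List[str]], bad_location: str) -> Set[str]:
    """Item-level approach: pull only items whose history includes the affected location."""
    return {item for item, places in item_locations.items() if bad_location in places}


# Made-up data: three cartons from one lot, only one of which passed
# through the warehouse where a humidity problem was reported.
item_lots = {"carton-1": "LOT-A", "carton-2": "LOT-A", "carton-3": "LOT-A"}
item_locations = {
    "carton-1": ["port", "warehouse-north"],
    "carton-2": ["port", "warehouse-south"],
    "carton-3": ["port", "warehouse-south"],
}

print(sorted(suspect_by_lot(item_lots, "LOT-A")))
# ['carton-1', 'carton-2', 'carton-3']
print(sorted(suspect_by_history(item_locations, "warehouse-north")))
# ['carton-1']
```

With per-item histories, only the carton that was actually exposed needs to be pulled, rather than the whole lot.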
To implement this technology at scale will require some standardization for marking different commodity types as well as give and take among the partners on best practices for handling commodities. It will also require an understanding of country regulations and partner interactions with subcontractors, government entities, and other stakeholders.
“Within several years, I think it’s possible for BHA to use our system to mark and trace all their food procured in the United States and sent internationally,” MacLaren says.
Once collected, the trove of traceability data could be harnessed for other purposes, among them analyzing historical trends, predicting future demand, and assessing the carbon footprint of commodity transport. In the future, a similar traceability system could scale for nonfood items, including medical supplies distributed to disaster victims, resources like generators and water trucks localized in emergency-response scenarios, and vaccines administered during pandemics. Several groups at the laboratory are also interested in such a system to track items such as tools deployed in space or equipment people carry through different operational environments.
“When we first started this program, colleagues were asking why the laboratory was involved in simple tasks like making a dashboard, marking items with barcodes, and using hand scanners,” MacLaren says. “Our impact here isn’t about the technology; it’s about providing a strategy for coordinated food-aid response and successfully implementing that strategy. Most importantly, it’s about people getting fed.”
in Data Management & Statistics
Making sense of all things data
by Markus Andrews 13 July 2023, 13:00
Data, and more specifically using data, is not a new concept, but it remains an elusive one. It comes with terms like “the internet of things” (IoT) and “the cloud,” and no matter how often those are explained, smart people can still be confused. And then there’s the amount of information available and the speed with which it comes in. Software is omnipresent. It’s in coffeemakers and watches, gathering data every second. The question becomes how to take all the new technology and take advantage of the potential insights and analytics. It’s not a small ask.
“Putting our arms around what digital transformation is can be difficult to do,” says Abel Sanchez. But as the executive director and research director of MIT’s Geospatial Data Center, that’s exactly what he does with his work in helping industries and executives shift their operations in order to make sense of their data and be able to use it to help their bottom lines.
Handling the pace
Data can lead to making better business decisions. That’s not a new or surprising insight, but as Sanchez says, people still tend to work off of intuition. Part of the problem is that they don’t know what to do with their available data, and there’s usually plenty of available data. Part of that problem is that there’s so much information being produced from so many sources. As soon as a person wakes up and turns on their phone or starts their car, software is running. It’s coming in fast, but because it’s also complex, “it outperforms people,” he says.
As an example with Uber, once a person clicks on the app for a ride, predictive models start firing at the rate of 1 million per second. It’s all in order to optimize the trip, taking into account factors such as school schedules, roadway conditions, traffic, and a driver’s availability. It’s helpful for the task, but it’s something that “no human would be able to do,” he says.
The solution requires a few components. One is a new way to store data. In the past, the classic was creating the “perfect library,” which was too structured. The response to that was to create a “data lake,” where all the information would go in and somehow people would make sense of it. “This also failed,” Sanchez says.
Data storage needs to be reimagined, and a key element of that is greater accessibility. In most corporations, only 10-20 percent of employees have the access and technical skill to work with the data. The rest have to go through a centralized resource and get into a queue, an inefficient system. The goal, Sanchez says, is to democratize the information by going to a modern stack, which would convert what he calls “dormant data” into “active data.” The result? Better decisions could be made.
The first big step companies need to take is finding the will to change. Part of it is an investment of money, but it’s also an attitude shift. Corporations can have an embedded culture where things have always been done a certain way and deviating from that is resisted because it’s different. But when it comes to data, a new approach is needed. Managing and curating the information can no longer rest in the hands of one person with the institutional memory. It’s not possible. It’s also not practical, because companies are losing out on efficiency and productivity; with technology, “What used to take years to do, now you can do in days,” Sanchez says.
The new player
The above exemplifies what’s been involved with coordinating data along four intertwined components: IoT, AI, the cloud, and security. The first two create the information, which then gets stored in the cloud, but it’s all for naught without robust security. But one relative newcomer has come into the picture. It’s blockchain technology, a term that is often said but still not fully understood, adding further to the confusion.
Sanchez says that information has been handled and organized a certain way with the World Wide Web. Blockchain is an opportunity to be more nimble and productive by offering the chance to have an accepted identity, currency, and logic that works on a global scale. The holdup has always been that there’s never been any agreement on those three components on a global scale. It leads to people being shut out, inefficiency, and lost business.
One example, Sanchez says, of blockchain’s potential is with hospitals. In the United States, they’re private and information has to be constantly integrated from doctors, insurance companies, labs, government regulators, and pharmaceutical companies. It leads to repeated steps to do something as simple as recognizing a patient’s identity, which often can’t be agreed upon. With blockchain, these various entities can create a consortium using open source code with no barriers of access, and it could quickly and easily identify a patient because it set up an agreement, and with it “remove that level of effort.” It’s an incremental step, but one which can be built upon that reduces cost and risk.
Another example — “one of the best examples,” Sanchez says — is what was done in Indonesia. Most of the rice, corn, and wheat that comes from this area is produced from smallholder farms. For the people making loans, it’s expensive to understand the risk of cultivating these plots of land. Compounding that is that these farmers don’t have state-issued identities or credit records, so, “They don’t exist in the modern economic sense,” he says. They don’t have access to loans, and banks are losing out on potential good customers.
With this project, blockchain allowed local people to gather information about the farms on their smartphones. Banks could acquire the information and compensate the people with tokens, thereby incentivizing the work. The bank would see the creditworthiness of the farms, and farmers could end up getting fair loans.
In the end, it creates a beneficial circle for the banks, farmers, and community, but it also represents what can be done with digital transformation by allowing businesses to optimize their processes, make better decisions, and ultimately profit.
“It’s a tremendous new platform,” Sanchez says. “This is the promise.”
in Data Management & Statistics
Report: CHIPS Act just the first step in addressing threats to US leadership in advanced computing
by Markus Andrews 28 February 2023, 17:00
When Liu He, a Chinese economist, politician, and “chip czar,” was tapped to lead the charge in a chipmaking arms race with the United States, his message lingered in the air, leaving behind a dewy glaze of tension: “For our country, technology is not just for growth… it is a matter of survival.”
Once upon a time, the United States’ early technological prowess positioned the nation to outpace foreign rivals and cultivate a competitive advantage for domestic businesses. Yet, 30 years later, America’s lead in advanced computing is continuing to wane. What happened?
A new report from an MIT researcher and two colleagues sheds light on the decline in U.S. leadership. The scientists looked at high-level measures to examine the shrinkage: overall capabilities, supercomputers, applied algorithms, and semiconductor manufacturing. Through their analysis, they found that not only has China closed the computing gap with the U.S., but nearly 80 percent of American leaders in the field believe that their Chinese competitors are improving capabilities faster — which, the team says, suggests a “broad threat to U.S. competitiveness.”
To delve deeply into the fray, the scientists conducted the Advanced Computing Users Survey, sampling 120 top-tier organizations, including universities, national labs, federal agencies, and industry. The team estimates that this group comprises between one-third and one-half of all the most significant computing users in the United States.
“Advanced computing is crucial to scientific improvement, economic growth and the competitiveness of U.S. companies,” says Neil Thompson, director of the FutureTech Research Project at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), who helped lead the study.
Thompson, who is also a principal investigator at MIT’s Initiative on the Digital Economy, wrote the paper with Chad Evans, executive vice president and secretary and treasurer to the board at the Council on Competitiveness, and Daniel Armbrust, who is the co-founder, initial CEO, and member of the board of directors at Silicon Catalyst and former president of SEMATECH, the semiconductor consortium that developed industry roadmaps.
The semiconductor, supercomputer, and algorithm bonanza
Supercomputers, the room-sized “giant calculators” of the hardware world, are an industry no longer dominated by the United States. Through 2015, about half of the most powerful computers were sitting firmly in the U.S., and China was growing slowly from a very low base. But in the past six years, China has swiftly caught up, reaching near parity with America.
This disappearing lead matters. Eighty-four percent of U.S. survey respondents said they’re computationally constrained in running essential programs. “This result was telling, given who our respondents are: the vanguard of American research enterprises and academic institutions with privileged access to advanced national supercomputing resources,” says Thompson.
With regard to advanced algorithms, the U.S. has historically led the charge, with two-thirds of all significant improvements coming from U.S.-born inventors. But in recent decades, U.S. dominance in algorithms has relied on bringing in foreign talent to work in the U.S., which the researchers say is now in jeopardy. China has outpaced the U.S. and many other countries in churning out PhDs in STEM fields since 2007, with one report projecting that by 2025 China will be home to nearly twice as many PhDs as the U.S. China’s rise in algorithms can also be seen with the “Gordon Bell Prize,” an award for outstanding work in harnessing the power of supercomputers in varied applications. U.S. winners historically dominated the prize, but over the past five years China has equaled or surpassed Americans’ performance.
While the researchers note the CHIPS and Science Act of 2022 is a critical step in re-establishing the foundation of success for advanced computing, they propose recommendations to the U.S. Office of Science and Technology Policy.
First, they suggest democratizing access to U.S. supercomputing by building more mid-tier systems that push boundaries for many users, as well as building tools so users scaling up computations can have less up-front resource investment. They also recommend increasing the pool of innovators by funding the training of many more electrical engineers and computer scientists, with longer-term U.S. residency incentives and scholarships. Finally, in addition to this new framework, the scientists urge taking advantage of what already exists by giving the private sector access to experimentation with high-performance computing through supercomputing sites in academia and national labs.
All that and a bag of chips
Computing improvements depend on continuous advances in transistor density and performance, but creating robust, new chips necessitates a harmonious blend of design and manufacturing.
Until recently, China was not known as a designer of noteworthy chips. In fact, over the past five decades, the U.S. designed most of them. But this changed in the past six years when China created the HiSilicon Kirin 9000, propelling itself to the international frontier. This success was mainly obtained through partnerships with leading global chip designers that began in the 2000s. China now has 14 companies among the world’s top 50 fabless designers. A decade ago, there was only one.
Competitive semiconductor manufacturing has been more mixed, where U.S.-led policies and internal execution issues have slowed China’s rise, but as of July 2022, the Semiconductor Manufacturing International Corporation (SMIC) has evidence of 7 nanometer logic, which was not expected until much later. However, with extreme ultraviolet export restrictions, progress below 7 nm means domestic technology development would be expensive. Currently, China is only at parity or better in two out of 12 segments of the semiconductor supply chain. Still, with government policy and investments, the team expects a whopping increase to seven segments in 10 years. So, for the moment, the U.S. retains leadership in hardware manufacturing, but with fewer dimensions of advantage.
The authors recommend that the White House Office of Science and Technology Policy work with key national agencies, such as the U.S. Department of Defense, U.S. Department of Energy, and the National Science Foundation, to define initiatives to build the hardware and software systems needed for important computing paradigms and workloads critical for economic and security goals. “It is crucial that American enterprises can get the benefit of faster computers,” says Thompson. “With Moore’s Law slowing down, the best way to do this is to create a portfolio of specialized chips (or “accelerators”) that are customized to our needs.”
The scientists further believe that to lead the next generation of computing, four areas must be addressed. First, by issuing grand challenges to the CHIPS Act National Semiconductor Technology Center, researchers and startups would be motivated to invest in research and development and to seek startup capital for new technologies in areas such as spintronics, neuromorphics, optical and quantum computing, and optical interconnect fabrics. By supporting allies in passing similar acts, overall investment in these technologies would increase, and supply chains would become more aligned and secure. Establishing test beds for researchers to test algorithms on new computing architectures and hardware would provide an essential platform for innovation and discovery. Finally, planning for post-exascale systems that achieve higher levels of performance through next-generation advances would ensure that current commercial technologies don’t limit future computing systems.
“The advanced computing landscape is in rapid flux — technologically, economically, and politically, with both new opportunities for innovation and rising global rivalries,” says Daniel Reed, Presidential Professor and professor of computer science and electrical and computer engineering at the University of Utah. “The transformational insights from both deep learning and computational modeling depend on both continued semiconductor advances and their instantiation in leading edge, large-scale computing systems — hyperscale clouds and high-performance computing systems. Although the U.S. has historically led the world in both advanced semiconductors and high-performance computing, other nations have recognized that these capabilities are integral to 21st century economic competitiveness and national security, and they are investing heavily.”
The research was funded, in part, through Thompson’s grant from Good Ventures, which supports his FutureTech Research Group. The paper is being published by the Georgetown Public Policy Review.
in Data Management & Statistics
Researchers release open-source photorealistic simulator for autonomous driving
by Markus Andrews 21 June 2022, 15:00
Hyper-realistic virtual worlds have been heralded as the best driving schools for autonomous vehicles (AVs), since they’ve proven fruitful test beds for safely trying out dangerous driving scenarios. Tesla, Waymo, and other self-driving companies all rely heavily on data to enable expensive and proprietary photorealistic simulators, since testing and gathering nuanced I-almost-crashed data usually isn’t the easiest or most desirable thing to recreate.
To that end, scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) created “VISTA 2.0,” a data-driven simulation engine where vehicles can learn to drive in the real world and recover from near-crash scenarios. What’s more, all of the code is being open-sourced to the public.
“Today, only companies have software like the type of simulation environments and capabilities of VISTA 2.0, and this software is proprietary. With this release, the research community will have access to a powerful new tool for accelerating the research and development of adaptive robust control for autonomous driving,” says MIT Professor and CSAIL Director Daniela Rus, senior author on a paper about the research.
VISTA is a data-driven, photorealistic simulator for autonomous driving. It can simulate not just live video but LiDAR data and event cameras, and also incorporate other simulated vehicles to model complex driving situations. VISTA is open source and the code can be found below.
VISTA 2.0 builds off of the team’s previous model, VISTA, and it’s fundamentally different from existing AV simulators since it’s data-driven — meaning it was built and photorealistically rendered from real-world data — thereby enabling direct transfer to reality. While the initial iteration supported only single car lane-following with one camera sensor, achieving high-fidelity data-driven simulation required rethinking the foundations of how different sensors and behavioral interactions can be synthesized.
Enter VISTA 2.0: a data-driven system that can simulate complex sensor types and massively interactive scenarios and intersections at scale. With much less data than previous models, the team was able to train autonomous vehicles that could be substantially more robust than those trained on large amounts of real-world data.
“This is a massive jump in capabilities of data-driven simulation for autonomous vehicles, as well as the increase of scale and ability to handle greater driving complexity,” says Alexander Amini, CSAIL PhD student and co-lead author on two new papers, together with fellow PhD student Tsun-Hsuan Wang. “VISTA 2.0 demonstrates the ability to simulate sensor data far beyond 2D RGB cameras, but also extremely high dimensional 3D lidars with millions of points, irregularly timed event-based cameras, and even interactive and dynamic scenarios with other vehicles as well.”
The team was able to scale the complexity of the interactive driving tasks for things like overtaking, following, and negotiating, including multiagent scenarios in highly photorealistic environments.
Training AI models for autonomous vehicles requires hard-to-secure data covering many varieties of edge cases and strange, dangerous scenarios, because most driving data (thankfully) is just run-of-the-mill, day-to-day driving. Logically, we can’t crash into other cars just to teach a neural network how not to crash into other cars.
Recently, there’s been a shift away from more classic, human-designed simulation environments to those built up from real-world data. The latter have immense photorealism, but the former can easily model virtual cameras and lidars. With this paradigm shift, a key question has emerged: Can the richness and complexity of all of the sensors that autonomous vehicles need, such as lidar and event-based cameras that are more sparse, accurately be synthesized?
Lidar sensor data is much harder to interpret in a data-driven world — you’re effectively trying to generate brand-new 3D point clouds with millions of points, only from sparse views of the world. To synthesize 3D lidar point clouds, the team used the data that the car collected, projected it into a 3D space coming from the lidar data, and then let a new virtual vehicle drive around locally from where that original vehicle was. Finally, they projected all of that sensory information back into the frame of view of this new virtual vehicle, with the help of neural networks.
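The geometric part of that pipeline can be illustrated with a minimal sketch. This is not VISTA 2.0's code or API: the function names, the planar `(x, y, yaw)` pose format, and the sample values are all assumptions made for illustration, and the neural-network step the researchers used to fill in the sparse result is left out entirely. The sketch only shows how recorded lidar points might be lifted into a shared world frame and then re-expressed from the viewpoint of a nearby virtual vehicle.

```python
import math


def world_from_vehicle(points, pose):
    """Lift lidar points from a vehicle's local frame into a shared world frame.

    pose is a hypothetical (x, y, yaw): planar position in metres, heading in radians.
    The local frame has x pointing forward and y pointing left.
    """
    x0, y0, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return [(x0 + c * px - s * py, y0 + s * px + c * py, pz) for px, py, pz in points]


def vehicle_from_world(points, pose):
    """Express world-frame points in the local frame of a vehicle at `pose`."""
    x0, y0, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    local = []
    for wx, wy, wz in points:
        dx, dy = wx - x0, wy - y0
        # Inverse of the rotation applied in world_from_vehicle.
        local.append((c * dx + s * dy, -s * dx + c * dy, wz))
    return local


def reproject_scan(points, recorded_pose, virtual_pose):
    """Re-express a recorded scan as it would appear from a virtual vehicle pose."""
    return vehicle_from_world(world_from_vehicle(points, recorded_pose), virtual_pose)


def to_range_azimuth(points):
    """Convert local-frame points to (range, azimuth) pairs in the ground plane."""
    return [(math.hypot(px, py), math.atan2(py, px)) for px, py, _ in points]


# One recorded point, 10 m straight ahead of the original vehicle.
scan = [(10.0, 0.0, 0.5)]
recorded = (0.0, 0.0, 0.0)
# Hypothetical virtual vehicle: shifted 1 m to the left, same heading.
virtual = (0.0, 1.0, 0.0)
new_scan = reproject_scan(scan, recorded, virtual)
```

In this toy case the point ends up 10 m ahead and 1 m to the right of the virtual vehicle. A real scan reprojected this way would contain holes wherever the new viewpoint sees surfaces the original sensor did not, which is the gap a learned model would have to fill.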
Together with the simulation of event-based cameras, which operate at speeds greater than thousands of events per second, the simulator was capable of not only simulating this multimodal information, but also doing so all in real time — making it possible to train neural nets offline, but also test online on the car in augmented reality setups for safe evaluations. “The question of if multisensor simulation at this scale of complexity and photorealism was possible in the realm of data-driven simulation was very much an open question,” says Amini.
With that, the driving school becomes a party. In the simulation, you can move around, have different types of controllers, simulate different types of events, create interactive scenarios, and just drop in brand new vehicles that weren’t even in the original data. They tested for lane following, lane turning, car following, and more dicey scenarios like static and dynamic overtaking (seeing obstacles and moving around so you don’t collide). With multi-agent support, both real and simulated agents interact, and new agents can be dropped into the scene and controlled any which way.
Taking their full-scale car out into the “wild” — a.k.a. Devens, Massachusetts — the team saw immediate transferability of results, with both failures and successes. They were also able to demonstrate the bodacious, magic word of self-driving car models: “robust.” They showed that AVs, trained entirely in VISTA 2.0, were so robust in the real world that they could handle that elusive tail of challenging failures.
Now, one guardrail humans rely on that can’t yet be simulated is human emotion: the friendly wave, nod, or blinker switch of acknowledgement. These are the kinds of nuances the team wants to implement in future work.
“The central algorithm of this research is how we can take a dataset and build a completely synthetic world for learning and autonomy,” says Amini. “It’s a platform that I believe one day could extend in many different axes across robotics. Not just autonomous driving, but many areas that rely on vision and complex behaviors. We’re excited to release VISTA 2.0 to help enable the community to collect their own datasets and convert them into virtual worlds where they can directly simulate their own virtual autonomous vehicles, drive around these virtual terrains, train autonomous vehicles in these worlds, and then can directly transfer them to full-sized, real self-driving cars.”
Amini and Wang wrote the paper alongside Zhijian Liu, MIT CSAIL PhD student; Igor Gilitschenski, assistant professor in computer science at the University of Toronto; Wilko Schwarting, AI research scientist and MIT CSAIL PhD ’20; Song Han, associate professor at MIT’s Department of Electrical Engineering and Computer Science; Sertac Karaman, associate professor of aeronautics and astronautics at MIT; and Daniela Rus, MIT professor and CSAIL director. The researchers presented the work at the IEEE International Conference on Robotics and Automation (ICRA) in Philadelphia.
This work was supported by the National Science Foundation and Toyota Research Institute. The team acknowledges the support of NVIDIA with the donation of the Drive AGX Pegasus.
