More stories

  • More transparency and understanding into machine behaviors

    Explaining, interpreting, and understanding the human mind presents a unique set of challenges. 

    Doing the same for the behaviors of machines, meanwhile, is a whole other story. 

    As artificial intelligence (AI) models are increasingly used in complex situations — approving or denying loans, helping doctors with medical diagnoses, assisting drivers on the road, or even taking complete control — humans still lack a holistic understanding of their capabilities and behaviors. 

    Existing research focuses mainly on the basics: How accurate is this model? Oftentimes, centering on the notion of simple accuracy can lead to dangerous oversights. What if the model makes mistakes with very high confidence? How would the model behave if it encountered something previously unseen, such as a self-driving car seeing a new type of traffic sign?

    In the quest for better human-AI interaction, a team of researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has created a new tool called Bayes-TrEx that gives developers and users transparency into their AI model. Specifically, it does so by finding concrete examples that lead to a particular behavior. The method makes use of “Bayesian posterior inference,” a widely used mathematical framework for reasoning about model uncertainty.
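
    For readers who want a concrete picture of the idea, the sketch below treats “inputs that trigger a target model behavior” as a posterior distribution and draws samples from it by rejection, keeping only generated inputs that a classifier labels as a chosen class with high confidence. The toy decoder, toy classifier, and confidence threshold are illustrative assumptions, not the authors’ Bayes-TrEx implementation.

```python
# Illustrative sketch of the Bayes-TrEx idea: treat "inputs that produce a
# target model behavior" as a posterior distribution and sample from it.
# Everything below (the toy decoder, toy classifier, thresholds) is a
# placeholder, not the authors' implementation.
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM, INPUT_DIM, NUM_CLASSES = 8, 32, 3
W_dec = rng.normal(size=(LATENT_DIM, INPUT_DIM))   # stand-in "generative model"
W_clf = rng.normal(size=(INPUT_DIM, NUM_CLASSES))  # stand-in classifier weights

def decode(z):
    """Placeholder decoder: maps a latent vector to an input (e.g., an image)."""
    return np.tanh(z @ W_dec)

def predict_proba(x):
    """Placeholder classifier: returns softmax class probabilities."""
    logits = x @ W_clf
    e = np.exp(logits - logits.max())
    return e / e.sum()

def target_behavior(probs, cls=0, conf=0.9):
    """Behavior of interest: model predicts class `cls` with confidence above `conf`."""
    return probs.argmax() == cls and probs.max() > conf

# Rejection-style posterior sampling: draw latents from the prior and keep
# those whose decoded inputs trigger the target behavior.
examples = []
for _ in range(20000):
    z = rng.normal(size=LATENT_DIM)   # prior over plausible inputs
    x = decode(z)
    if target_behavior(predict_proba(x)):
        examples.append(x)            # concrete example of the behavior

print(f"found {len(examples)} high-confidence class-0 examples to inspect")
```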

    In experiments, the researchers applied Bayes-TrEx to several image-based datasets, and found new insights that were previously overlooked by standard evaluations focusing solely on prediction accuracy. 

    “Such analyses are important to verify that the model is indeed functioning correctly in all cases,” says MIT CSAIL PhD student Yilun Zhou, co-lead researcher on Bayes-TrEx. “An especially alarming situation is when the model is making mistakes, but with very high confidence. Due to high user trust over the high reported confidence, these mistakes might fly under the radar for a long time and only get discovered after causing extensive damage.”

    For example, after a medical diagnosis system finishes learning on a set of X-ray images, a doctor can use Bayes-TrEx to find images that the model misclassified with very high confidence, to ensure that it doesn’t miss any particular variant of a disease. 

    Bayes-TrEx can also help with understanding model behaviors in novel situations. Take autonomous driving systems, which often rely on camera images to take in traffic lights, bike lanes, and obstacles. These common occurrences can be easily recognized with high accuracy by the camera, but more complicated situations can provide literal and metaphorical roadblocks. A zippy Segway could potentially be interpreted as something as big as a car or as small as a bump on the road, leading to a tricky turn or costly collision. Bayes-TrEx could help address these novel situations ahead of time, and enable developers to correct any undesirable outcomes before potential tragedies occur. 

    In addition to images, the researchers are also tackling a less-static domain: robots. Their tool, called “RoCUS,” is inspired by Bayes-TrEx and uses additional adaptations to analyze robot-specific behaviors. 

    While still in a testing phase, experiments with RoCUS point to new discoveries that could easily be missed if the evaluation focused solely on task completion. For example, a 2D navigation robot that used a deep learning approach preferred to navigate tightly around obstacles, due to how the training data was collected. Such a preference, however, could be risky if the robot’s obstacle sensors are not fully accurate. For a robot arm reaching a target on a table, the asymmetry of the robot’s kinematic structure led to notable differences in its ability to reach targets on the left versus the right.

    “We want to make human-AI interaction safer by giving humans more insight into their AI collaborators,” says MIT CSAIL PhD student Serena Booth, co-lead author with Zhou. “Humans should be able to understand how these agents make decisions, to predict how they will act in the world, and — most critically — to anticipate and circumvent failures.”  

    Booth and Zhou are coauthors on the Bayes-TrEx work alongside MIT CSAIL PhD student Ankit Shah and MIT Professor Julie Shah. They presented the paper virtually at the AAAI Conference on Artificial Intelligence. Along with Booth, Zhou, and Shah, MIT CSAIL postdoc Nadia Figueroa Fernandez contributed work on the RoCUS tool.

  • Researchers’ algorithm designs soft robots that sense

    There are some tasks that traditional robots — the rigid and metallic kind — simply aren’t cut out for. Soft-bodied robots, on the other hand, may be able to interact with people more safely or slip into tight spaces with ease. But for robots to reliably complete their programmed duties, they need to know the whereabouts of all their body parts. That’s a tall task for a soft robot that can deform in a virtually infinite number of ways.

    MIT researchers have developed an algorithm to help engineers design soft robots that collect more useful information about their surroundings. The deep-learning algorithm suggests an optimized placement of sensors within the robot’s body, allowing it to better interact with its environment and complete assigned tasks. The advance is a step toward the automation of robot design. “The system not only learns a given task, but also how to best design the robot to solve that task,” says Alexander Amini. “Sensor placement is a very difficult problem to solve. So, having this solution is extremely exciting.”

    The research will be presented at April’s IEEE International Conference on Soft Robotics and will be published in the journal IEEE Robotics and Automation Letters. Co-lead authors are Amini and Andrew Spielberg, both PhD students in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). Other co-authors include MIT PhD student Lillian Chin and professors Wojciech Matusik and Daniela Rus.

    Creating soft robots that complete real-world tasks has been a long-running challenge in robotics. Their rigid counterparts have a built-in advantage: a limited range of motion. Rigid robots’ finite array of joints and limbs usually makes for manageable calculations by the algorithms that control mapping and motion planning. Soft robots are not so tractable.

    Soft-bodied robots are flexible and pliant — they generally feel more like a bouncy ball than a bowling ball. “The main problem with soft robots is that they are infinitely dimensional,” says Spielberg. “Any point on a soft-bodied robot can, in theory, deform in any way possible.” That makes it tough to design a soft robot that can map the location of its body parts. Past efforts have used an external camera to chart the robot’s position and feed that information back into the robot’s control program. But the researchers wanted to create a soft robot untethered from external aid.

    “You can’t put an infinite number of sensors on the robot itself,” says Spielberg. “So, the question is: How many sensors do you have, and where do you put those sensors in order to get the most bang for your buck?” The team turned to deep learning for an answer.

    The researchers developed a novel neural network architecture that both optimizes sensor placement and learns to efficiently complete tasks. First, the researchers divided the robot’s body into regions called “particles.” Each particle’s rate of strain was provided as an input to the neural network. Through a process of trial and error, the network “learns” the most efficient sequence of movements to complete tasks, like gripping objects of different sizes. At the same time, the network keeps track of which particles are used most often, and it culls the lesser-used particles from the set of inputs for the network’s subsequent trials.
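
    As a rough, self-contained illustration of the “cull the least-used inputs” idea (not the MIT team’s actual architecture), the sketch below scores each input feature (standing in for a particle’s strain reading) by how strongly a trained model relies on it, and keeps only the top-scoring particles as candidate sensor sites.

```python
# Toy illustration of ranking inputs by "usage" and keeping the top-k.
# A tiny linear model stands in for the real network; first-layer weight
# magnitude stands in for how often each particle is used. Both choices are
# assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n_particles, n_keep, n_samples = 50, 8, 2000

# Synthetic "task": only a handful of particles actually matter for the output.
true_important = rng.choice(n_particles, size=n_keep, replace=False)
strains = rng.normal(size=(n_samples, n_particles))
target = strains[:, true_important].sum(axis=1)

def train_linear(X, y, lr=0.05, epochs=500):
    """Tiny linear 'network' trained with gradient descent (stand-in for the real model)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

w = train_linear(strains, target)
usage = np.abs(w)                       # proxy for how heavily each particle is "used"
keep = np.argsort(usage)[-n_keep:]      # suggested sensor locations
print("suggested particles for sensors:", sorted(keep.tolist()))
print("ground-truth important particles:", sorted(true_important.tolist()))
```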

    By optimizing the most important particles, the network also suggests where sensors should be placed on the robot to ensure efficient performance. For example, in a simulated robot with a grasping hand, the algorithm might suggest that sensors be concentrated in and around the fingers, where precisely controlled interactions with the environment are vital to the robot’s ability to manipulate objects. While that may seem obvious, it turns out the algorithm vastly outperformed humans’ intuition on where to site the sensors.

    The researchers pitted their algorithm against a series of expert predictions. For three different soft robot layouts, the team asked roboticists to manually select where sensors should be placed to enable the efficient completion of tasks like grasping various objects. Then they ran simulations comparing the human-sensorized robots to the algorithm-sensorized robots. And the results weren’t close. “Our model vastly outperformed humans for each task, even though I looked at some of the robot bodies and felt very confident on where the sensors should go,” says Amini. “It turns out there are a lot more subtleties in this problem than we initially expected.”

    Spielberg says their work could help to automate the process of robot design. In addition to developing algorithms to control a robot’s movements, “we also need to think about how we’re going to sensorize these robots, and how that will interplay with other components of that system,” he says. And better sensor placement could have industrial applications, especially where robots are used for fine tasks like gripping. “That’s something where you need a very robust, well-optimized sense of touch,” says Spielberg. “So, there’s potential for immediate impact.”

    “Automating the design of sensorized soft robots is an important step toward rapidly creating intelligent tools that help people with physical tasks,” says Rus. “The sensors are an important aspect of the process, as they enable the soft robot to ‘see’ and understand the world and its relationship with the world.”

    This research was funded, in part, by the National Science Foundation and the Fannie and John Hertz Foundation.

  • System detects errors when medication is self-administered

    From swallowing pills to injecting insulin, patients frequently administer their own medication. But they don’t always get it right. Improper adherence to doctors’ orders is commonplace, accounting for thousands of deaths and billions of dollars in medical costs annually. MIT researchers have developed a system to reduce those numbers for some types of medications.

    The new technology pairs wireless sensing with artificial intelligence to determine when a patient is using an insulin pen or inhaler, and flags potential errors in the patient’s administration method. “Some past work reports that up to 70% of patients do not take their insulin as prescribed, and many patients do not use inhalers properly,” says Dina Katabi, the Andrew and Erna Viterbi Professor at MIT, whose research group developed the new solution. The researchers say the system, which can be installed in a home, could alert patients and caregivers to medication errors and potentially reduce unnecessary hospital visits.

    The research appears today in the journal Nature Medicine. The study’s lead authors are Mingmin Zhao, a PhD student in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), and Kreshnik Hoti, a former visiting scientist at MIT and current faculty member at the University of Prishtina in Kosovo. Other co-authors include Hao Wang, a former CSAIL postdoc and current faculty member at Rutgers University, and Aniruddh Raghu, a CSAIL PhD student.

    Some common drugs entail intricate delivery mechanisms. “For example, insulin pens require priming to make sure there are no air bubbles inside. And after injection, you have to hold for 10 seconds,” says Zhao. “All those little steps are necessary to properly deliver the drug to its active site.” Each step also presents an opportunity for error, especially when there’s no pharmacist present to offer corrective tips. Patients might not even realize when they make a mistake — so Zhao’s team designed an automated system that can.

    Their system can be broken down into three broad steps. First, a sensor tracks a patient’s movements within a 10-meter radius, using radio waves that reflect off their body. Next, artificial intelligence scours the reflected signals for signs of a patient self-administering an inhaler or insulin pen. Finally, the system alerts the patient or their health care provider when it detects an error in the patient’s self-administration.

    The researchers adapted their sensing method from a wireless technology they’d previously used to monitor people’s sleeping positions. It starts with a wall-mounted device that emits very low-power radio waves. When someone moves, they modulate the signal and reflect it back to the device’s sensor. Each unique movement yields a corresponding pattern of modulated radio waves that the device can decode. “One nice thing about this system is that it doesn’t require the patient to wear any sensors,” says Zhao. “It can even work through occlusions, similar to how you can access your Wi-Fi when you’re in a different room from your router.”

    The new sensor sits in the background at home, like a Wi-Fi router, and uses artificial intelligence to interpret the modulated radio waves. The team developed a neural network to key in on patterns indicating the use of an inhaler or insulin pen. They trained the network to learn those patterns by performing example movements, some relevant (e.g. using an inhaler) and some not (e.g. eating). Through repetition and reinforcement, the network successfully detected 96 percent of insulin pen administrations and 99 percent of inhaler uses.

    Once it mastered the art of detection, the network also proved useful for correction. Every proper medicine administration follows a similar sequence — picking up the insulin pen, priming it, injecting, etc. So, the system can flag anomalies in any particular step. For example, the network can recognize if a patient holds down their insulin pen for five seconds instead of the prescribed 10 seconds. The system can then relay that information to the patient or directly to their doctor, so they can fix their technique.
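
    The sketch below shows what such a per-step check might look like, assuming an upstream detector has already segmented the radio signal into labeled steps with durations; the step names and thresholds are illustrative placeholders, not values from the paper.

```python
# Schematic per-step check for an insulin-pen administration, assuming an
# upstream model has already segmented the radio signal into labeled steps
# with durations (in seconds). Step names and thresholds are illustrative.
EXPECTED = [
    ("prime_pen",     {"min_s": 1.0}),   # prime to clear air bubbles
    ("inject",        {"min_s": 0.5}),
    ("hold_in_place", {"min_s": 10.0}),  # hold for the prescribed 10 seconds
]

def check_administration(observed):
    """observed: list of (step_name, duration_seconds) from the detector."""
    alerts = []
    observed_steps = dict(observed)
    for name, rule in EXPECTED:
        if name not in observed_steps:
            alerts.append(f"missing step: {name}")
        elif observed_steps[name] < rule["min_s"]:
            alerts.append(
                f"{name} lasted {observed_steps[name]:.1f}s, "
                f"expected at least {rule['min_s']:.1f}s"
            )
    return alerts

# Example: the patient held the pen for only 5 seconds instead of 10.
print(check_administration([("prime_pen", 2.0), ("inject", 1.0), ("hold_in_place", 5.0)]))
```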

    “By breaking it down into these steps, we can not only see how frequently the patient is using their device, but also assess their administration technique to see how well they’re doing,” says Zhao.

    The researchers say a key feature of their radio wave-based system is its noninvasiveness. “An alternative way to solve this problem is by installing cameras,” says Zhao. “But using a wireless signal is much less intrusive. It doesn’t show people’s appearance.”

    He adds that their framework could be adapted to medications beyond inhalers and insulin pens — all it would take is retraining the neural network to recognize the appropriate sequence of movements. Zhao says that “with this type of sensing technology at home, we could detect issues early on, so the person can see a doctor before the problem is exacerbated.”

  • MIT.nano courses bring hands-on experimentation to virtual participants

    Every minute, a person just sitting or standing without moving sheds 100,000 particles that are 500 nanometers or larger. Is that person exercising? Now it’s 10 million particles per minute, says Jorg Scholvin, assistant director of user services for Fab.nano.

    That’s why users of the MIT.nano cleanroom — which is controlled to have fewer than 100 such particles per cubic foot of air — wear full-body “bunnysuits” and other specialized garments to maintain the pristine environment required for nanoscale research.

    Scholvin shared this lesson on gowning up during a virtual series of facility tours, one of five courses initiated during MIT’s Independent Activities Period (IAP) to highlight the breadth of MIT.nano’s capabilities. The courses, several of which will be offered again this semester, also included a live nanofabrication demo and virtual classes on 360-degree photography, biomechanics in everyday life, and storytelling for science and engineering communication.

    A glimpse at the guts of MIT.nano

    The three-part series of virtual tours brought 56 attendees inside MIT.nano’s facilities. With a camera on a rolling tripod and a little ingenuity, Scholvin led the Zoom attendees through the cleanroom, comparing the layout to that of a grocery store: the bays where researchers work are like the aisles where customers shop, and the chases holding the back ends of the equipment are like the product shelves.

    On the second day Scholvin was joined by Anna Osherov, assistant director for user services at Characterization.nano, for a tour of the nanoscale imaging suites located in MIT.nano’s basement level. Attendees learned about the planning that went into creating an ultra-quiet environment for the nano-characterization tools, including a vibration demo showing the importance of the plinth — a 50,000-pound slab of concrete balanced on a set of springs four feet above the ground to create a quiet, stable island for the ultrasensitive microscopes that sit on top.

    The final tour in the series, led by MIT.nano Assistant Director for Infrastructure Nick Menounos, took virtual attendees on a walkthrough of the non-public spaces that keep MIT.nano running. Attendees rolled through the mechanical penthouse, basement water preparation space, centralized gas delivery system, and a freight elevator big enough to carry equipment the size of 14,000 large pizzas.

    Following this series, Scholvin led a separate class on thin-film deposition, lithography, and etching processes at the micro- and nanoscale. Attendees followed along with Scholvin as he worked in the cleanroom to expose, develop, and etch screenshots of the Zoom workshop attendees — and “secret” messages from the class etched in letters less than one millimeter high — into a 100-nanometer thin gold film on a silicon wafer. “Seeing the actual fabrication processes really made the technology accessible,” said one attendee. “Getting this type of access to the lab and process has been a truly unique experience.”

    Melding the physical with the digital

    MIT.nano’s Immersion Lab found a different way to engage with the students in its courses — by sending the hands-on experience to them through the mail. For Creating, Editing, and Distributing 360 Photography — a course facilitated by Rus Gant, director of the Harvard Visualization Research and Teaching Laboratory, and Samantha Farrell, MIT.nano senior administrative assistant — MIT.nano loaned each participant a 360-degree camera, a Quest 2 virtual reality (VR) headset, and a monopod.

    The course began with an overview of virtual reality theory and technologies, along with a history lesson on immersive art and panoramic photography spanning from the pre-Civil War era to the present day. The class then transitioned to workshop-style, with students creating content in their own environments, from 360-degree nature photography to videos that place viewers into the film via VR headsets.

    “Having experience using tools including a 360 camera, Photoshop/Premiere Pro, and a VR headset has empowered me to pursue 360 projects on my own in the future,” said one participant. For example, virtual reality developer Luis Zanforlin, part of a team that received an MIT.nano Immersion Lab Gaming Program seed grant in 2020, created a film of himself hanging out with all his friends during Covid-19. Zanforlin used a Ricoh Theta V 360 camera and a monopod, along with editing software from Adobe Premiere, to create the video best viewed using an Oculus Quest headset.

    Video: “This is me hanging out with my friends during COVID.”

    Biomechanics in everyday life, another IAP offering organized by the Immersion Lab, explored human movement through cardio exercise, yoga, and meditation. The four-session course, co-sponsored by MIT’s Clinical Research Center and led by Praneeth Namburi, postdoc in the Research Laboratory of Electronics (RLE), used motion capture technology and wireless sensors to explain how simple tasks such as walking or jumping may improve human health and well-being.

    During sessions on yoga and breathing, students learned how virtual reality could improve awareness and coordination. In focusing on balance, they were introduced to recording muscle activity using electromyography. “When I think about movement now,” said one participant, “I will think beyond just copying the positions of people who move well and instead think about how they cycle through the energy process.”

    Video: Tracking muscle length during yoga warrior flow

    Storytelling at the nanoscale

    The final MIT.nano offering, nanoStories, was a workshop focused on building narratives using text, video, and interactive media to demystify science and nanotechnology. Guest speakers from the Boston Museum of Science, PBS NOVA, and MIT joined workshop instructors MIT.nano Director Vladimir Bulović, Research Scientist Annie Wang, and Samantha Farrell in discussions and exercises on crafting exciting and understandable presentations of nano-topics for a general audience. Students developed their own stories throughout the course, presenting final projects that explored how snowflakes form, why pencils work, how the eye perceives color, and how semiconductors function in solar cells.

    In addition to these five courses, MIT.nano collaborated with MIT’s Clinical Research Center (CRC), MIT Medical, and the Department of Mechanical Engineering to offer a three-module course on the basics of and resources for human subjects research, as well as technology for symptoms monitoring during the Covid-19 pandemic. Led by MIT.nano Associate Director Brian W. Anthony and CRC Director of Clinical Operations Catherine Ricciardi, the course explored how MIT researchers have deployed and developed physiological sensing technologies for Covid-19 research and highlighted the resources available through the CRC, the MIT.nano Immersion Lab, and MIT ecosystem partners.

    Visit MIT.nano to see more images and videos from the courses.

  • Faster drug discovery through machine learning

    Drugs can only work if they stick to their target proteins in the body. Assessing that stickiness is a key hurdle in the drug discovery and screening process. New research combining chemistry and machine learning could lower that hurdle.
    The new technique, dubbed DeepBAR, quickly calculates the binding affinities between drug candidates and their targets. The approach yields precise calculations in a fraction of the time compared to previous state-of-the-art methods. The researchers say DeepBAR could one day quicken the pace of drug discovery and protein engineering.
    “Our method is orders of magnitude faster than before, meaning we can have drug discovery that is both efficient and reliable,” says Bin Zhang, the Pfizer-Laubach Career Development Professor in Chemistry at MIT, an associate member of the Broad Institute of MIT and Harvard, and a co-author of a new paper describing the technique.
    The research appears today in the Journal of Physical Chemistry Letters. The study’s lead author is Xinqiang Ding, a postdoc in MIT’s Department of Chemistry.
    The affinity between a drug molecule and a target protein is measured by a quantity called the binding free energy — the smaller the number, the stickier the bind. “A lower binding free energy means the drug can better compete against other molecules,” says Zhang, “meaning it can more effectively disrupt the protein’s normal function.” Calculating the binding free energy of a drug candidate provides an indicator of a drug’s potential effectiveness. But it’s a difficult quantity to nail down.
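    For context, the relationship between the binding free energy and the dissociation constant is a standard thermodynamic result (not something specific to DeepBAR):

```latex
\Delta G^{\circ}_{\mathrm{bind}} \;=\; -RT \ln K_a \;=\; RT \ln\!\left(\frac{K_d}{c^{\circ}}\right)
```

    Here R is the gas constant, T the temperature, K_d the dissociation constant, and c° the standard concentration (1 M); a smaller K_d gives a more negative binding free energy, i.e., a stickier bind.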
    Methods for computing binding free energy fall into two broad categories, each with its own drawbacks. One category calculates the quantity exactly, eating up significant time and computer resources. The second category is less computationally expensive, but it yields only an approximation of the binding free energy. Zhang and Ding devised an approach to get the best of both worlds.
    Exact and efficient
    DeepBAR computes binding free energy exactly, but it requires just a fraction of the calculations demanded by previous methods. The new technique combines traditional chemistry calculations with recent advances in machine learning.
    The “BAR” in DeepBAR stands for “Bennett acceptance ratio,” a decades-old algorithm used in exact calculations of binding free energy. Using the Bennett acceptance ratio typically requires knowledge of two “endpoint” states (e.g., a drug molecule bound to a protein and a drug molecule completely dissociated from a protein), plus knowledge of many intermediate states (e.g., varying levels of partial binding), all of which bog down calculation speed.
    DeepBAR slashes those in-between states by deploying the Bennett acceptance ratio in machine-learning frameworks called deep generative models. “These models create a reference state for each endpoint, the bound state and the unbound state,” says Zhang. These two reference states are similar enough that the Bennett acceptance ratio can be used directly, without all the costly intermediate steps.
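    As a reference point for what applying the Bennett acceptance ratio directly involves, the sketch below solves the classical BAR self-consistency equation in reduced units (kT = 1) for synthetic forward and reverse work values; it is the textbook estimator only, not the DeepBAR code or its deep generative models.

```python
# Classical Bennett acceptance ratio (BAR) in reduced units (kT = 1): given
# forward work values w_F (sampled in state 0) and reverse work values w_R
# (sampled in state 1), the free-energy difference dF solves a one-variable
# self-consistency equation. Textbook estimator only, not DeepBAR.
import numpy as np
from scipy.optimize import brentq

def bar_free_energy(w_F, w_R):
    n_F, n_R = len(w_F), len(w_R)

    def residual(dF):
        lhs = np.sum(1.0 / (1.0 + (n_F / n_R) * np.exp(w_F - dF)))
        rhs = np.sum(1.0 / (1.0 + (n_R / n_F) * np.exp(w_R + dF)))
        return lhs - rhs

    return brentq(residual, -50.0, 50.0)  # root-find over a wide bracket

# Synthetic example: Gaussian work distributions consistent with dF = 2.0
# (forward mean dF + s**2/2, reverse mean -dF + s**2/2, per Crooks symmetry).
rng = np.random.default_rng(0)
s = 1.5
w_F = rng.normal(loc=2.0 + s**2 / 2, scale=s, size=5000)
w_R = rng.normal(loc=-2.0 + s**2 / 2, scale=s, size=5000)
print("estimated dF:", bar_free_energy(w_F, w_R))  # close to 2.0
```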
    In using deep generative models, the researchers were borrowing from the field of computer vision. “It’s basically the same model that people use to do computer image synthesis,” says Zhang. “We’re sort of treating each molecular structure as an image, which the model can learn. So, this project is building on the effort of the machine learning community.”
    While adapting a computer vision approach to chemistry was DeepBAR’s key innovation, the crossover also raised some challenges. “These models were originally developed for 2D images,” says Ding. “But here we have proteins and molecules — it’s really a 3D structure. So, adapting those methods in our case was the biggest technical challenge we had to overcome.”
    A faster future for drug screening
    In tests using small protein-like molecules, DeepBAR calculated binding free energy nearly 50 times faster than previous methods. Zhang says that efficiency means “we can really start to think about using this to do drug screening, in particular in the context of Covid. DeepBAR has the exact same accuracy as the gold standard, but it’s much faster.” The researchers add that, in addition to drug screening, DeepBAR could aid protein design and engineering, since the method could be used to model interactions between multiple proteins.
    DeepBAR is “a really nice computational work” with a few hurdles to clear before it can be used in real-world drug discovery, says Michael Gilson, a professor of pharmaceutical sciences at the University of California at San Diego, who was not involved in the research. He says DeepBAR would need to be validated against complex experimental data. “That will certainly pose added challenges, and it may require adding in further approximations.”
    In the future, the researchers plan to improve DeepBAR’s ability to run calculations for large proteins, a task made feasible by recent advances in computer science. “This research is an example of combining traditional computational chemistry methods, developed over decades, with the latest developments in machine learning,” says Ding. “So, we achieved something that would have been impossible before now.”
    This research was funded, in part, by the National Institutes of Health.

  • Artificial intelligence that more closely mimics the mind

    For all the progress that’s been made in the field of artificial intelligence, the world’s most flexible, efficient information processor remains the human brain. Although we can quickly make decisions based on incomplete and changing information, many of today’s artificial intelligence systems only work after being trained on well-labeled data, and when new information is available, a complete retraining is often required to incorporate it.
    Now the startup Nara Logics, co-founded by an MIT alumnus, is trying to take artificial intelligence to the next level by more closely mimicking the brain. The company’s AI engine uses recent discoveries in neuroscience to replicate brain structure and function at the circuit level.
    The result is an AI platform that holds a number of advantages over traditional neural network-based systems. While other systems use meticulously tuned, fixed algorithms, users can interact with Nara Logics’ platform, changing variables and goals to further explore their data. The platform can also begin working without labeled training data, and can incorporate new datasets as they become available. Perhaps most importantly, Nara Logics’ platform can provide the reasons behind every recommendation it makes — a key driver of adoption in sectors like health care.
    “A lot of our health care customers say they’ve had AI systems that give the likelihood of somebody being readmitted to the hospital, for example, but they’ve never had those ‘but why?’ reasons to be able to know what they can do about it,” says Nara Logics CEO Jana Eggers, who leads the company with CTO and founder Nathan Wilson PhD ’05.
    Nara Logics’ AI is currently being used by health care organizations, consumer companies, manufacturers, and the federal government to do things like lower costs and better engage with customers.
    “It’s for people whose decisions are getting complicated because there’s more factors [and data] being added, and for people that are looking at complex decisions differently because there’s novel information available,” Eggers says.
    The platform’s architecture is the result of Wilson’s decision to embrace the complexities of neuroscience rather than abstract away from them. He developed that approach over more than a decade working in MIT’s Department of Brain and Cognitive Sciences, which has long held the mission of reverse engineering the human mind.
    “At Nara Logics, we think neuroscience is on a really good track that’s going to lead to really exciting ways to make decisions that we haven’t seen before,” Wilson says.
    Following a passion
    Wilson attended Cornell University for his undergraduate and master’s degrees, but once he got to MIT in 2000, he stuck around. Over the course of a five-year PhD and a seven-year postdoc, he created mathematical frameworks to simulate brain function.
    “The community at MIT is really focused on coming up with new models of computation that go beyond what computer science offers,” Wilson says. “The work is connected with computer science, but also considers what our brain is doing that could teach us how computers work, or how computers could work.”
    On nights and weekends during the final years of his postdoc, from 2010 to 2012, Wilson was also beginning to translate his algorithms into a commercial system in work that would be the foundation of Nara Logics. In 2014, his work caught the attention of Eggers, who had led a number of successful businesses but had grown jaded about the hype around artificial intelligence.
    Eggers became convinced Nara Logics’ AI engine offered a superior way to help businesses. Even back then the engine, which the company refers to as Nara Logics Synaptic Intelligence, had properties that made it unique in the field.
    In the engine, objects in customers’ data, such as patients and treatments, organize into matrices based on features they share with other objects, in a structure similar to what has been observed in biological systems. Relationships between objects also form through a series of local functions the company calls synaptic learning rules, adapted from cell- and circuit-based neuroscience studies.
    “What we do is catalog all the metadata and what we call our Connectomes go in and mine the database of unstructured data and build links across all of it that relate these things,” Wilson explains. “Once you have that background, you can go in and say, ‘I like this, this, and this,’ and you let the engine crunch the data and give you matches to those parameters. What you didn’t have to do is have any notion of what the right answer was for lots of similar people. You skip that whole step.”
    Each object in Nara Logics’ Synaptic Intelligence stores its properties and rules locally, allowing the platform to adjust to new data by updating only a small number of associated objects. The bottom-up approach is believed to be used by the brain.
    “That’s totally different than deep learning or other approaches that just say, ‘We’re going to globally optimize everything, and each cell does what the global algorithm tells it,’” Wilson explains. “Neuroscientists are telling us each cell is making decisions on its own accord to an extent.”
    The design allows users to explore relationships in data by “activating” certain objects or features and seeing what else gets activated or suppressed.
    To give an answer, Nara Logics’ engine only activates a small number of objects in its dataset. The company says this is similar to the “sparse coding” believed to be used in higher brain regions, in which only a small number of neurons are activated at any given moment. The sparse coding principle allows the company to retrace its platform’s path and give users the reasons behind its decisions.
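    As a generic illustration of the sparse-coding idea (not Nara Logics’ engine), the sketch below lets only the top-k most strongly activated items contribute to a recommendation, then reads those same items back out as the explanation.

```python
# Generic illustration of sparse activation and "retraceable" reasons: only
# the top-k most strongly activated items contribute to a recommendation, and
# those same items double as the explanation. Not Nara Logics' implementation.
import numpy as np

rng = np.random.default_rng(2)
items = [f"item_{i}" for i in range(100)]
activations = rng.normal(size=100)           # stand-in for engine activations

k = 5
active = np.argsort(activations)[-k:][::-1]  # sparse code: only k of 100 items fire

recommendation_score = float(activations[active].sum())
reasons = [(items[i], round(float(activations[i]), 2)) for i in active]
print("score:", round(recommendation_score, 2))
print("because of:", reasons)
```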
    As the company has matured, Wilson has stayed plugged in to the MIT community’s research, and Nara Logics participated in the STEX25 startup accelerator, run by the MIT Industrial Liaison Program, where Wilson says the company made many contacts that have turned into customers.
    Leveraging a mind-like AI
    Manufacturers are already using Nara Logics’ platform to better understand data from internet-of-things devices, consumer companies are using it to better connect with customers, and health care groups are using it to make better treatment decisions.
    “We’re focused on a specific algorithm, which is the mechanics of decision making,” Wilson says. “We believe it’s something you can codify, and we believe it’s something that’ll be insanely valuable if you can get that process right.”
    As Covid-19 disrupted industries and underscored the need for organizations to invest in adaptive software tools, Nara Logics nearly doubled its customer base. The founders are thrilled to be scaling a solution they feel is more collaborative and responsive to humans than other AI systems.
    “We think the most important difference we’re contributing to is building an AI where people participate and people are in the loop — they’re cognizant and understanding and aware of what it’s doing,” Wilson says. “That helps them make smarter decisions every day, and those add up to make a big difference.”

  • Using artificial intelligence to generate 3D holograms in real-time

    Despite years of hype, virtual reality headsets have yet to topple TV or computer screens as the go-to devices for video viewing. One reason: VR can make users feel sick. Nausea and eye strain can result because VR creates an illusion of 3D viewing although the user is in fact staring at a fixed-distance 2D display. The solution for better 3D visualization could lie in a 60-year-old technology remade for the digital world: holograms.
    Holograms deliver an exceptional representation of the 3D world around us. Plus, they’re beautiful. (Go ahead — check out the holographic dove on your Visa card.) Holograms offer a shifting perspective based on the viewer’s position, and they allow the eye to adjust focal depth to alternately focus on foreground and background.
    Researchers have long sought to make computer-generated holograms, but the process has traditionally required a supercomputer to churn through physics simulations, which is time-consuming and can yield less-than-photorealistic results. Now, MIT researchers have developed a new way to produce holograms almost instantly — and the deep learning-based method is so efficient that it can run on a laptop in the blink of an eye, the researchers say.
    “People previously thought that with existing consumer-grade hardware, it was impossible to do real-time 3D holography computations,” says Liang Shi, the study’s lead author and a PhD student in MIT’s Department of Electrical Engineering and Computer Science (EECS). “It’s often been said that commercially available holographic displays will be around in 10 years, yet this statement has been around for decades.”
    Shi believes the new approach, which the team calls “tensor holography,” will finally bring that elusive 10-year goal within reach. The advance could fuel a spillover of holography into fields like VR and 3D printing.
    Shi worked on the study, published today in Nature, with his advisor and co-author Wojciech Matusik. Other co-authors include Beichen Li of EECS and the Computer Science and Artificial Intelligence Laboratory at MIT, as well as former MIT researchers Changil Kim (now at Facebook) and Petr Kellnhofer (now at Stanford University).
    The quest for better 3D
    A typical lens-based photograph encodes the brightness of each light wave — a photo can faithfully reproduce a scene’s colors, but it ultimately yields a flat image.
    In contrast, a hologram encodes both the brightness and phase of each light wave. That combination delivers a truer depiction of a scene’s parallax and depth. So, while a photograph of Monet’s “Water Lilies” can highlight the paintings’ color palette, a hologram can bring the work to life, rendering the unique 3D texture of each brush stroke. But despite their realism, holograms are a challenge to make and share.
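    To make the brightness-versus-phase distinction concrete: the light field at a pixel can be written as a complex number, amplitude times exp(i·phase). A camera records only the intensity (the squared amplitude), while a hologram retains the phase as well. The snippet below is generic optics notation, not the MIT method.

```python
# A light field sample at one pixel: complex amplitude = brightness * exp(i * phase).
# A photograph keeps only the intensity |field|^2; a hologram also keeps the phase.
import numpy as np

amplitude, phase = 0.8, np.pi / 3
field = amplitude * np.exp(1j * phase)  # what a hologram encodes (per pixel)

intensity = np.abs(field) ** 2          # what an ordinary photo records
recovered_phase = np.angle(field)       # lost in a photo, preserved in a hologram

print(f"intensity = {intensity:.3f}, phase = {recovered_phase:.3f} rad")
```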
    First developed in the mid-1900s, early holograms were recorded optically. That required splitting a laser beam, with half the beam used to illuminate the subject and the other half used as a reference for the light waves’ phase. This reference generates a hologram’s unique sense of depth.  The resulting images were static, so they couldn’t capture motion. And they were hard copy only, making them difficult to reproduce and share.
    Computer-generated holography sidesteps these challenges by simulating the optical setup. But the process can be a computational slog. “Because each point in the scene has a different depth, you can’t apply the same operations for all of them,” says Shi. “That increases the complexity significantly.” Directing a clustered supercomputer to run these physics-based simulations could take seconds or minutes for a single holographic image. Plus, existing algorithms don’t model occlusion with photorealistic precision. So Shi’s team took a different approach: letting the computer teach physics to itself.
    They used deep learning to accelerate computer-generated holography, allowing for real-time hologram generation. The team designed a convolutional neural network — a processing technique that uses a chain of trainable tensors to roughly mimic how humans process visual information. Training a neural network typically requires a large, high-quality dataset, which didn’t previously exist for 3D holograms.
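    The sketch below shows the general shape of such a network: a small fully convolutional model that maps a 4-channel RGB-D image (color plus per-pixel depth) to a 2-channel output interpreted as amplitude and phase. The layer widths, depth, and output ranges are illustrative assumptions, not the published tensor-holography architecture.

```python
# Minimal sketch of a fully convolutional network mapping an RGB-D image
# (4 channels: color + depth per pixel) to a hologram represented as two
# channels (amplitude and phase). Layer sizes are illustrative only, not the
# published tensor-holography architecture.
import math
import torch
import torch.nn as nn

class ToyHologramNet(nn.Module):
    def __init__(self, width=32, depth=4):
        super().__init__()
        layers = [nn.Conv2d(4, width, kernel_size=3, padding=1), nn.ReLU()]
        for _ in range(depth - 1):
            layers += [nn.Conv2d(width, width, kernel_size=3, padding=1), nn.ReLU()]
        layers += [nn.Conv2d(width, 2, kernel_size=3, padding=1)]  # amplitude, phase
        self.net = nn.Sequential(*layers)

    def forward(self, rgbd):                            # rgbd: (batch, 4, H, W)
        out = self.net(rgbd)
        amplitude = torch.sigmoid(out[:, :1])           # keep amplitude in [0, 1]
        phase = math.pi * torch.tanh(out[:, 1:])        # keep phase in [-pi, pi]
        return torch.cat([amplitude, phase], dim=1)

# One training step against a (picture, hologram) pair, as in the dataset
# described above; random tensors stand in for real data here.
model = ToyHologramNet()
rgbd = torch.rand(1, 4, 192, 192)
target_hologram = torch.rand(1, 2, 192, 192)
loss = nn.functional.mse_loss(model(rgbd), target_hologram)
loss.backward()
```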
    The team built a custom database of 4,000 pairs of computer-generated images. Each pair matched a picture — including color and depth information for each pixel — with its corresponding hologram. To create the holograms in the new database, the researchers used scenes with complex and variable shapes and colors, with the depth of pixels distributed evenly from the background to the foreground, and with a new set of physics-based calculations to handle occlusion. That approach resulted in photorealistic training data. Next, the algorithm got to work.
    By learning from each image pair, the tensor network tweaked the parameters of its own calculations, successively enhancing its ability to create holograms. The fully optimized network operated orders of magnitude faster than physics-based calculations. That efficiency surprised even the team.
    “We are amazed at how well it performs,” says Matusik. In mere milliseconds, tensor holography can craft holograms from images with depth information — which is provided by typical computer-generated images and can be calculated from a multicamera setup or LiDAR sensor (both are standard on some new smartphones). This advance paves the way for real-time 3D holography. What’s more, the compact tensor network requires less than 1 MB of memory. “It’s negligible, considering the tens and hundreds of gigabytes available on the latest cell phone,” he says.
    The research “shows that true 3D holographic displays are practical with only moderate computational requirements,” says Joel Kollin, a principal optical architect at Microsoft who was not involved with the research. He adds that “this paper shows marked improvement in image quality over previous work,” which will “add realism and comfort for the viewer.” Kollin also hints at the possibility that holographic displays like this could even be customized to a viewer’s ophthalmic prescription. “Holographic displays can correct for aberrations in the eye. This makes it possible for a display image sharper than what the user could see with contacts or glasses, which only correct for low order aberrations like focus and astigmatism.”
    “A considerable leap”
    Real-time 3D holography would enhance a slew of systems, from VR to 3D printing. The team says the new system could help immerse VR viewers in more realistic scenery, while eliminating eye strain and other side effects of long-term VR use. The technology could be easily deployed on displays that modulate the phase of light waves. Currently, most affordable consumer-grade displays modulate only brightness, though the cost of phase-modulating displays would fall if widely adopted.
    Three-dimensional holography could also boost the development of volumetric 3D printing, the researchers say. This technology could prove faster and more precise than traditional layer-by-layer 3D printing, since volumetric 3D printing allows for the simultaneous projection of the entire 3D pattern. Other applications include microscopy, visualization of medical data, and the design of surfaces with unique optical properties.
    “It’s a considerable leap that could completely change people’s attitudes toward holography,” says Matusik. “We feel like neural networks were born for this task.”
    The work was supported, in part, by Sony.

  • Algorithm helps artificial intelligence systems dodge “adversarial” inputs

    In a perfect world, what you see is what you get. If this were the case, the job of artificial intelligence systems would be refreshingly straightforward.
    Take collision avoidance systems in self-driving cars. If visual input to on-board cameras could be trusted entirely, an AI system could directly map that input to an appropriate action — steer right, steer left, or continue straight — to avoid hitting a pedestrian that its cameras see in the road.
    But what if there’s a glitch in the cameras that slightly shifts an image by a few pixels? If the car blindly trusted so-called “adversarial inputs,” it might take unnecessary and potentially dangerous action.
    A new deep-learning algorithm developed by MIT researchers is designed to help machines navigate in the real, imperfect world, by building a healthy “skepticism” of the measurements and inputs they receive.
    The team combined a reinforcement-learning algorithm with a deep neural network, both used separately to train computers in playing games like Go and chess, to build an approach they call CARRL, for Certified Adversarial Robustness for Deep Reinforcement Learning.
    The researchers tested the approach in several scenarios, including a simulated collision-avoidance test and the video game Pong, and found that CARRL performed better than standard machine-learning techniques — avoiding collisions and winning more Pong games — even in the face of uncertain, adversarial inputs.
    “You often think of an adversary being someone who’s hacking your computer, but it could also just be that your sensors are not great, or your measurements aren’t perfect, which is often the case,” says Michael Everett, a postdoc in MIT’s Department of Aeronautics and Astronautics (AeroAstro). “Our approach helps to account for that imperfection and make a safe decision. In any safety-critical domain, this is an important approach to be thinking about.”
    Everett is the lead author of a study outlining the new approach, which appears in IEEE’s Transactions on Neural Networks and Learning Systems. The study originated from MIT PhD student Björn Lütjens’ master’s thesis and was advised by MIT AeroAstro Professor Jonathan How.
    Possible realities
    To make AI systems robust against adversarial inputs, researchers have tried implementing defenses for supervised learning. Traditionally, a neural network is trained to associate specific labels or actions with given inputs. For instance, a neural network that is fed thousands of images labeled as cats, along with images labeled as houses and hot dogs, should correctly label a new image as a cat.
    In robust AI systems, the same supervised-learning techniques could be tested with many slightly altered versions of the image. If the network lands on the same label — cat — for every image, there’s a good chance that, altered or not, the image is indeed of a cat, and the network is robust to any adversarial influence.
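    A minimal sketch of that kind of empirical consistency check appears below; the stand-in classifier and the perturbation budget are placeholders, and the sketch samples perturbations rather than enumerating every possible alteration.

```python
# Minimal sketch of the consistency check described above: perturb an input
# slightly many times and see whether the classifier's label ever changes.
# The classifier and the perturbation budget (epsilon) are placeholders.
import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(784, 3))           # stand-in "cat / house / hot dog" classifier

def predict(x):
    return int(np.argmax(x @ W))

def is_empirically_robust(x, epsilon=0.01, n_trials=1000):
    base_label = predict(x)
    for _ in range(n_trials):
        delta = rng.uniform(-epsilon, epsilon, size=x.shape)
        if predict(np.clip(x + delta, 0.0, 1.0)) != base_label:
            return False                # found an alteration that flips the label
    return True                         # no flip found (not a formal guarantee)

image = rng.uniform(0.0, 1.0, size=784)  # placeholder "image"
print("label:", predict(image), "empirically robust:", is_empirically_robust(image))
```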
    But running through every possible image alteration is computationally exhaustive and difficult to apply successfully to time-sensitive tasks such as collision avoidance. Furthermore, existing methods also don’t identify what label to use, or what action to take, if the network is less robust and labels some altered cat images as a house or a hotdog.
    “In order to use neural networks in safety-critical scenarios, we had to find out how to take real-time decisions based on worst-case assumptions on these possible realities,” Lütjens says.
    The best reward
    The team instead looked to build on reinforcement learning, another form of machine learning that does not require associating labeled inputs with outputs, but rather aims to reinforce certain actions in response to certain inputs, based on a resulting reward. This approach is typically used to train computers to play and win games such as chess and Go.
    Reinforcement learning has mostly been applied to situations where inputs are assumed to be true. Everett and his colleagues say they are the first to bring “certifiable robustness” to uncertain, adversarial inputs in reinforcement learning.
    Their approach, CARRL, uses an existing deep-reinforcement-learning algorithm to train a deep Q-network, or DQN — a neural network with multiple layers that ultimately associates an input with a Q value, or level of reward.
    The approach takes an input, such as an image with a single dot, and considers an adversarial influence, or a region around the dot where it actually might be instead. Every possible position of the dot within this region is fed through a DQN to find an associated action that would result in the most optimal worst-case reward, based on a technique developed by recent MIT graduate student Tsui-Wei “Lily” Weng PhD ’20.
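    A conceptual stand-in for that worst-case selection is sketched below: for each action, take the minimum Q-value over candidate observations inside the uncertainty region, then act greedily on those worst-case values. The toy Q-network and the sampled candidate set are illustrative placeholders, not the authors’ implementation.

```python
# Conceptual stand-in for worst-case action selection: for each action, take
# the minimum Q-value over candidate observations inside the uncertainty
# region, then act greedily on those worst-case values. The Q-network is a toy
# and the candidate set is sampled; this is an illustration, not CARRL itself.
import numpy as np

rng = np.random.default_rng(4)
N_ACTIONS, OBS_DIM = 3, 4               # e.g., paddle up / stay / down
W1 = rng.normal(size=(OBS_DIM, 16))
W2 = rng.normal(size=(16, N_ACTIONS))

def q_values(obs):
    """Toy Q-network: observation(s) -> Q-value per action."""
    return np.maximum(obs @ W1, 0.0) @ W2

def robust_action(nominal_obs, epsilon=0.1, n_candidates=200):
    """Pick the action maximizing the worst-case Q over the uncertainty region."""
    candidates = nominal_obs + rng.uniform(-epsilon, epsilon,
                                           size=(n_candidates, OBS_DIM))
    worst_case_q = q_values(candidates).min(axis=0)  # min over observations, per action
    return int(np.argmax(worst_case_q))

obs = rng.normal(size=OBS_DIM)          # e.g., noisy estimate of the ball's position
print("robust action:", robust_action(obs))
```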
    An adversarial world
    In tests with the video game Pong, in which two players operate paddles on either side of a screen to pass a ball back and forth, the researchers introduced an “adversary” that pulled the ball slightly further down than it actually was. They found that CARRL won more games than standard techniques, as the adversary’s influence grew.
    “If we know that a measurement shouldn’t be trusted exactly, and the ball could be anywhere within a certain region, then our approach tells the computer that it should put the paddle in the middle of that region, to make sure we hit the ball even in the worst-case deviation,” Everett says.
    Image caption: In a game of Pong, MIT researchers show that, with perfect measurements, a standard deep learning algorithm is able to win most games (left). But in a scenario where the measurements are influenced by an “adversary” that shifts the ball’s position by a few pixels (middle), the computer easily beats the standard algorithm. The team’s new algorithm, CARRL, handles such adversarial attacks, or manipulations to measurements, winning against the computer even though it doesn’t know exactly where the ball is. (Courtesy of the researchers.)
    The method was similarly robust in tests of collision avoidance, where the team simulated a blue and an orange agent attempting to switch positions without colliding. As the team perturbed the orange agent’s observation of the blue agent’s position, CARRL steered the orange agent around the other agent, taking a wider berth as the adversary grew stronger, and the blue agent’s position became more uncertain.
    There did come a point when CARRL became too conservative, causing the orange agent to assume the other agent could be anywhere in its vicinity, and in response completely avoid its destination. This extreme conservatism is useful, Everett says, because researchers can then use it as a limit to tune the algorithm’s robustness. For instance, the algorithm might consider a smaller deviation, or region of uncertainty, that would still allow an agent to achieve a high reward and reach its destination.
    In addition to overcoming imperfect sensors, Everett says CARRL may be a start to helping robots safely handle unpredictable interactions in the real world.
    “People can be adversarial, like getting in front of a robot to block its sensors, or interacting with them, not necessarily with the best intentions,” Everett says. “How can a robot think of all the things people might try to do, and try to avoid them? What sort of adversarial models do we want to defend against? That’s something we’re thinking about how to do.”
    This research was supported, in part, by Ford Motor Company as part of the Ford-MIT Alliance.