More stories

  • in

    Using generative AI to improve software testing

    Generative AI is getting plenty of attention for its ability to create text and images. But those media represent only a fraction of the data that proliferate in our society today. Data are generated every time a patient goes through a medical system, a storm impacts a flight, or a person interacts with a software application.

    Using generative AI to create realistic synthetic data around those scenarios can help organizations more effectively treat patients, reroute planes, or improve software platforms — especially in scenarios where real-world data are limited or sensitive.

    For the last three years, the MIT spinout DataCebo has offered a generative software system called the Synthetic Data Vault to help organizations create synthetic data to do things like test software applications and train machine learning models.

    The Synthetic Data Vault, or SDV, has been downloaded more than 1 million times, with more than 10,000 data scientists using the open-source library for generating synthetic tabular data. The founders — Principal Research Scientist Kalyan Veeramachaneni and alumna Neha Patki ’15, SM ’16 — believe the company’s success is due to SDV’s ability to revolutionize software testing.

    SDV goes viral

    In 2016, Veeramachaneni’s group in the Data to AI Lab unveiled a suite of open-source generative AI tools to help organizations create synthetic data that matched the statistical properties of real data.

    Companies can use synthetic data instead of sensitive information in programs while still preserving the statistical relationships between datapoints. Companies can also use synthetic data to run new software through simulations to see how it performs before releasing it to the public.

    Veeramachaneni’s group came across the problem because it was working with companies that wanted to share their data for research.

    “MIT helps you see all these different use cases,” Patki explains. “You work with finance companies and health care companies, and all those projects are useful to formulate solutions across industries.”

    In 2020, the researchers founded DataCebo to build more SDV features for larger organizations. Since then, the use cases have been as impressive as they’ve been varied.

    With DataCebo’s new flight simulator, for instance, airlines can plan for rare weather events in a way that would be impossible using only historic data. In another application, SDV users synthesized medical records to predict health outcomes for patients with cystic fibrosis. A team from Norway recently used SDV to create synthetic student data to evaluate whether various admissions policies were meritocratic and free from bias.

    In 2021, the data science platform Kaggle hosted a competition for data scientists that used SDV to create synthetic data sets to avoid using proprietary data. Roughly 30,000 data scientists participated, building solutions and predicting outcomes based on the company’s realistic data.

    And as DataCebo has grown, it’s stayed true to its MIT roots: All of the company’s current employees are MIT alumni.

    Supercharging software testing

    Although their open-source tools are being used for a variety of use cases, the company is focused on growing its traction in software testing.

    “You need data to test these software applications,” Veeramachaneni says. “Traditionally, developers manually write scripts to create synthetic data. With generative models, created using SDV, you can learn from a sample of data collected and then sample a large volume of synthetic data (which has the same properties as real data), or create specific scenarios and edge cases, and use the data to test your application.”

    For example, if a bank wanted to test a program designed to reject transfers from accounts with no money in them, it would have to simulate many accounts simultaneously transacting. Doing that with data created manually would take a lot of time. With DataCebo’s generative models, customers can create any edge case they want to test.

    “It’s common for industries to have data that is sensitive in some capacity,” Patki says. “Often when you’re in a domain with sensitive data you’re dealing with regulations, and even if there aren’t legal regulations, it’s in companies’ best interest to be diligent about who gets access to what at which time. So, synthetic data is always better from a privacy perspective.”

    Scaling synthetic data

    Veeramachaneni believes DataCebo is advancing the field of what it calls synthetic enterprise data, or data generated from user behavior on large companies’ software applications.

    “Enterprise data of this kind is complex, and there is no universal availability of it, unlike language data,” Veeramachaneni says. “When folks use our publicly available software and report back if works on a certain pattern, we learn a lot of these unique patterns, and it allows us to improve our algorithms. From one perspective, we are building a corpus of these complex patterns, which for language and images is readily available. “

    DataCebo also recently released features to improve SDV’s usefulness, including tools to assess the “realism” of the generated data, called the SDMetrics library as well as a way to compare models’ performances called SDGym.

    “It’s about ensuring organizations trust this new data,” Veeramachaneni says. “[Our tools offer] programmable synthetic data, which means we allow enterprises to insert their specific insight and intuition to build more transparent models.”

    As companies in every industry rush to adopt AI and other data science tools, DataCebo is ultimately helping them do so in a way that is more transparent and responsible.

    “In the next few years, synthetic data from generative models will transform all data work,” Veeramachaneni says. “We believe 90 percent of enterprise operations can be done with synthetic data.” More

  • in

    Dealing with the limitations of our noisy world

    Tamara Broderick first set foot on MIT’s campus when she was a high school student, as a participant in the inaugural Women’s Technology Program. The monthlong summer academic experience gives young women a hands-on introduction to engineering and computer science.

    What is the probability that she would return to MIT years later, this time as a faculty member?

    That’s a question Broderick could probably answer quantitatively using Bayesian inference, a statistical approach to probability that tries to quantify uncertainty by continuously updating one’s assumptions as new data are obtained.

    In her lab at MIT, the newly tenured associate professor in the Department of Electrical Engineering and Computer Science (EECS) uses Bayesian inference to quantify uncertainty and measure the robustness of data analysis techniques.

    “I’ve always been really interested in understanding not just ‘What do we know from data analysis,’ but ‘How well do we know it?’” says Broderick, who is also a member of the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society. “The reality is that we live in a noisy world, and we can’t always get exactly the data that we want. How do we learn from data but at the same time recognize that there are limitations and deal appropriately with them?”

    Broadly, her focus is on helping people understand the confines of the statistical tools available to them and, sometimes, working with them to craft better tools for a particular situation.

    For instance, her group recently collaborated with oceanographers to develop a machine-learning model that can make more accurate predictions about ocean currents. In another project, she and others worked with degenerative disease specialists on a tool that helps severely motor-impaired individuals utilize a computer’s graphical user interface by manipulating a single switch.

    A common thread woven through her work is an emphasis on collaboration.

    “Working in data analysis, you get to hang out in everybody’s backyard, so to speak. You really can’t get bored because you can always be learning about some other field and thinking about how we can apply machine learning there,” she says.

    Hanging out in many academic “backyards” is especially appealing to Broderick, who struggled even from a young age to narrow down her interests.

    A math mindset

    Growing up in a suburb of Cleveland, Ohio, Broderick had an interest in math for as long as she can remember. She recalls being fascinated by the idea of what would happen if you kept adding a number to itself, starting with 1+1=2 and then 2+2=4.

    “I was maybe 5 years old, so I didn’t know what ‘powers of two’ were or anything like that. I was just really into math,” she says.

    Her father recognized her interest in the subject and enrolled her in a Johns Hopkins program called the Center for Talented Youth, which gave Broderick the opportunity to take three-week summer classes on a range of subjects, from astronomy to number theory to computer science.

    Later, in high school, she conducted astrophysics research with a postdoc at Case Western University. In the summer of 2002, she spent four weeks at MIT as a member of the first class of the Women’s Technology Program.

    She especially enjoyed the freedom offered by the program, and its focus on using intuition and ingenuity to achieve high-level goals. For instance, the cohort was tasked with building a device with LEGOs that they could use to biopsy a grape suspended in Jell-O.

    The program showed her how much creativity is involved in engineering and computer science, and piqued her interest in pursuing an academic career.

    “But when I got into college at Princeton, I could not decide — math, physics, computer science — they all seemed super-cool. I wanted to do all of it,” she says.

    She settled on pursuing an undergraduate math degree but took all the physics and computer science courses she could cram into her schedule.

    Digging into data analysis

    After receiving a Marshall Scholarship, Broderick spent two years at Cambridge University in the United Kingdom, earning a master of advanced study in mathematics and a master of philosophy in physics.

    In the UK, she took a number of statistics and data analysis classes, including her first class on Bayesian data analysis in the field of machine learning.

    It was a transformative experience, she recalls.

    “During my time in the U.K., I realized that I really like solving real-world problems that matter to people, and Bayesian inference was being used in some of the most important problems out there,” she says.

    Back in the U.S., Broderick headed to the University of California at Berkeley, where she joined the lab of Professor Michael I. Jordan as a grad student. She earned a PhD in statistics with a focus on Bayesian data analysis. 

    She decided to pursue a career in academia and was drawn to MIT by the collaborative nature of the EECS department and by how passionate and friendly her would-be colleagues were.

    Her first impressions panned out, and Broderick says she has found a community at MIT that helps her be creative and explore hard, impactful problems with wide-ranging applications.

    “I’ve been lucky to work with a really amazing set of students and postdocs in my lab — brilliant and hard-working people whose hearts are in the right place,” she says.

    One of her team’s recent projects involves a collaboration with an economist who studies the use of microcredit, or the lending of small amounts of money at very low interest rates, in impoverished areas.

    The goal of microcredit programs is to raise people out of poverty. Economists run randomized control trials of villages in a region that receive or don’t receive microcredit. They want to generalize the study results, predicting the expected outcome if one applies microcredit to other villages outside of their study.

    But Broderick and her collaborators have found that results of some microcredit studies can be very brittle. Removing one or a few data points from the dataset can completely change the results. One issue is that researchers often use empirical averages, where a few very high or low data points can skew the results.

    Using machine learning, she and her collaborators developed a method that can determine how many data points must be dropped to change the substantive conclusion of the study. With their tool, a scientist can see how brittle the results are.

    “Sometimes dropping a very small fraction of data can change the major results of a data analysis, and then we might worry how far those conclusions generalize to new scenarios. Are there ways we can flag that for people? That is what we are getting at with this work,” she explains.

    At the same time, she is continuing to collaborate with researchers in a range of fields, such as genetics, to understand the pros and cons of different machine-learning techniques and other data analysis tools.

    Happy trails

    Exploration is what drives Broderick as a researcher, and it also fuels one of her passions outside the lab. She and her husband enjoy collecting patches they earn by hiking all the trails in a park or trail system.

    “I think my hobby really combines my interests of being outdoors and spreadsheets,” she says. “With these hiking patches, you have to explore everything and then you see areas you wouldn’t normally see. It is adventurous, in that way.”

    They’ve discovered some amazing hikes they would never have known about, but also embarked on more than a few “total disaster hikes,” she says. But each hike, whether a hidden gem or an overgrown mess, offers its own rewards.

    And just like in her research, curiosity, open-mindedness, and a passion for problem-solving have never led her astray. More

  • in

    Startup accelerates progress toward light-speed computing

    Our ability to cram ever-smaller transistors onto a chip has enabled today’s age of ubiquitous computing. But that approach is finally running into limits, with some experts declaring an end to Moore’s Law and a related principle, known as Dennard’s Scaling.

    Those developments couldn’t be coming at a worse time. Demand for computing power has skyrocketed in recent years thanks in large part to the rise of artificial intelligence, and it shows no signs of slowing down.

    Now Lightmatter, a company founded by three MIT alumni, is continuing the remarkable progress of computing by rethinking the lifeblood of the chip. Instead of relying solely on electricity, the company also uses light for data processing and transport. The company’s first two products, a chip specializing in artificial intelligence operations and an interconnect that facilitates data transfer between chips, use both photons and electrons to drive more efficient operations.

    “The two problems we are solving are ‘How do chips talk?’ and ‘How do you do these [AI] calculations?’” Lightmatter co-founder and CEO Nicholas Harris PhD ’17 says. “With our first two products, Envise and Passage, we’re addressing both of those questions.”

    In a nod to the size of the problem and the demand for AI, Lightmatter raised just north of $300 million in 2023 at a valuation of $1.2 billion. Now the company is demonstrating its technology with some of the largest technology companies in the world in hopes of reducing the massive energy demand of data centers and AI models.

    “We’re going to enable platforms on top of our interconnect technology that are made up of hundreds of thousands of next-generation compute units,” Harris says. “That simply wouldn’t be possible without the technology that we’re building.”

    From idea to $100K

    Prior to MIT, Harris worked at the semiconductor company Micron Technology, where he studied the fundamental devices behind integrated chips. The experience made him see how the traditional approach for improving computer performance — cramming more transistors onto each chip — was hitting its limits.

    “I saw how the roadmap for computing was slowing, and I wanted to figure out how I could continue it,” Harris says. “What approaches can augment computers? Quantum computing and photonics were two of those pathways.”

    Harris came to MIT to work on photonic quantum computing for his PhD under Dirk Englund, an associate professor in the Department of Electrical Engineering and Computer Science. As part of that work, he built silicon-based integrated photonic chips that could send and process information using light instead of electricity.

    The work led to dozens of patents and more than 80 research papers in prestigious journals like Nature. But another technology also caught Harris’s attention at MIT.

    “I remember walking down the hall and seeing students just piling out of these auditorium-sized classrooms, watching relayed live videos of lectures to see professors teach deep learning,” Harris recalls, referring to the artificial intelligence technique. “Everybody on campus knew that deep learning was going to be a huge deal, so I started learning more about it, and we realized that the systems I was building for photonic quantum computing could actually be leveraged to do deep learning.”

    Harris had planned to become a professor after his PhD, but he realized he could attract more funding and innovate more quickly through a startup, so he teamed up with Darius Bunandar PhD ’18, who was also studying in Englund’s lab, and Thomas Graham MBA ’18. The co-founders successfully launched into the startup world by winning the 2017 MIT $100K Entrepreneurship Competition.

    Seeing the light

    Lightmatter’s Envise chip takes the part of computing that electrons do well, like memory, and combines it with what light does well, like performing the massive matrix multiplications of deep-learning models.

    “With photonics, you can perform multiple calculations at the same time because the data is coming in on different colors of light,” Harris explains. “In one color, you could have a photo of a dog. In another color, you could have a photo of a cat. In another color, maybe a tree, and you could have all three of those operations going through the same optical computing unit, this matrix accelerator, at the same time. That drives up operations per area, and it reuses the hardware that’s there, driving up energy efficiency.”

    Passage takes advantage of light’s latency and bandwidth advantages to link processors in a manner similar to how fiber optic cables use light to send data over long distances. It also enables chips as big as entire wafers to act as a single processor. Sending information between chips is central to running the massive server farms that power cloud computing and run AI systems like ChatGPT.

    Both products are designed to bring energy efficiencies to computing, which Harris says are needed to keep up with rising demand without bringing huge increases in power consumption.

    “By 2040, some predict that around 80 percent of all energy usage on the planet will be devoted to data centers and computing, and AI is going to be a huge fraction of that,” Harris says. “When you look at computing deployments for training these large AI models, they’re headed toward using hundreds of megawatts. Their power usage is on the scale of cities.”

    Lightmatter is currently working with chipmakers and cloud service providers for mass deployment. Harris notes that because the company’s equipment runs on silicon, it can be produced by existing semiconductor fabrication facilities without massive changes in process.

    The ambitious plans are designed to open up a new path forward for computing that would have huge implications for the environment and economy.

    “We’re going to continue looking at all of the pieces of computers to figure out where light can accelerate them, make them more energy efficient, and faster, and we’re going to continue to replace those parts,” Harris says. “Right now, we’re focused on interconnect with Passage and on compute with Envise. But over time, we’re going to build out the next generation of computers, and it’s all going to be centered around light.” More

  • in

    New AI model could streamline operations in a robotic warehouse

    Hundreds of robots zip back and forth across the floor of a colossal robotic warehouse, grabbing items and delivering them to human workers for packing and shipping. Such warehouses are increasingly becoming part of the supply chain in many industries, from e-commerce to automotive production.

    However, getting 800 robots to and from their destinations efficiently while keeping them from crashing into each other is no easy task. It is such a complex problem that even the best path-finding algorithms struggle to keep up with the breakneck pace of e-commerce or manufacturing. 

    In a sense, these robots are like cars trying to navigate a crowded city center. So, a group of MIT researchers who use AI to mitigate traffic congestion applied ideas from that domain to tackle this problem.

    They built a deep-learning model that encodes important information about the warehouse, including the robots, planned paths, tasks, and obstacles, and uses it to predict the best areas of the warehouse to decongest to improve overall efficiency.

    Their technique divides the warehouse robots into groups, so these smaller groups of robots can be decongested faster with traditional algorithms used to coordinate robots. In the end, their method decongests the robots nearly four times faster than a strong random search method.

    In addition to streamlining warehouse operations, this deep learning approach could be used in other complex planning tasks, like computer chip design or pipe routing in large buildings.

    “We devised a new neural network architecture that is actually suitable for real-time operations at the scale and complexity of these warehouses. It can encode hundreds of robots in terms of their trajectories, origins, destinations, and relationships with other robots, and it can do this in an efficient manner that reuses computation across groups of robots,” says Cathy Wu, the Gilbert W. Winslow Career Development Assistant Professor in Civil and Environmental Engineering (CEE), and a member of a member of the Laboratory for Information and Decision Systems (LIDS) and the Institute for Data, Systems, and Society (IDSS).

    Wu, senior author of a paper on this technique, is joined by lead author Zhongxia Yan, a graduate student in electrical engineering and computer science. The work will be presented at the International Conference on Learning Representations.

    Robotic Tetris

    From a bird’s eye view, the floor of a robotic e-commerce warehouse looks a bit like a fast-paced game of “Tetris.”

    When a customer order comes in, a robot travels to an area of the warehouse, grabs the shelf that holds the requested item, and delivers it to a human operator who picks and packs the item. Hundreds of robots do this simultaneously, and if two robots’ paths conflict as they cross the massive warehouse, they might crash.

    Traditional search-based algorithms avoid potential crashes by keeping one robot on its course and replanning a trajectory for the other. But with so many robots and potential collisions, the problem quickly grows exponentially.

    “Because the warehouse is operating online, the robots are replanned about every 100 milliseconds. That means that every second, a robot is replanned 10 times. So, these operations need to be very fast,” Wu says.

    Because time is so critical during replanning, the MIT researchers use machine learning to focus the replanning on the most actionable areas of congestion — where there exists the most potential to reduce the total travel time of robots.

    Wu and Yan built a neural network architecture that considers smaller groups of robots at the same time. For instance, in a warehouse with 800 robots, the network might cut the warehouse floor into smaller groups that contain 40 robots each.

    Then, it predicts which group has the most potential to improve the overall solution if a search-based solver were used to coordinate trajectories of robots in that group.

    An iterative process, the overall algorithm picks the most promising robot group with the neural network, decongests the group with the search-based solver, then picks the next most promising group with the neural network, and so on.

    Considering relationships

    The neural network can reason about groups of robots efficiently because it captures complicated relationships that exist between individual robots. For example, even though one robot may be far away from another initially, their paths could still cross during their trips.

    The technique also streamlines computation by encoding constraints only once, rather than repeating the process for each subproblem. For instance, in a warehouse with 800 robots, decongesting a group of 40 robots requires holding the other 760 robots as constraints. Other approaches require reasoning about all 800 robots once per group in each iteration.

    Instead, the researchers’ approach only requires reasoning about the 800 robots once across all groups in each iteration.

    “The warehouse is one big setting, so a lot of these robot groups will have some shared aspects of the larger problem. We designed our architecture to make use of this common information,” she adds.

    They tested their technique in several simulated environments, including some set up like warehouses, some with random obstacles, and even maze-like settings that emulate building interiors.

    By identifying more effective groups to decongest, their learning-based approach decongests the warehouse up to four times faster than strong, non-learning-based approaches. Even when they factored in the additional computational overhead of running the neural network, their approach still solved the problem 3.5 times faster.

    In the future, the researchers want to derive simple, rule-based insights from their neural model, since the decisions of the neural network can be opaque and difficult to interpret. Simpler, rule-based methods could also be easier to implement and maintain in actual robotic warehouse settings.

    “This approach is based on a novel architecture where convolution and attention mechanisms interact effectively and efficiently. Impressively, this leads to being able to take into account the spatiotemporal component of the constructed paths without the need of problem-specific feature engineering. The results are outstanding: Not only is it possible to improve on state-of-the-art large neighborhood search methods in terms of quality of the solution and speed, but the model generalizes to unseen cases wonderfully,” says Andrea Lodi, the Andrew H. and Ann R. Tisch Professor at Cornell Tech, and who was not involved with this research.

    This work was supported by Amazon and the MIT Amazon Science Hub. More

  • in

    “We offer another place for knowledge”

    In the Dzaleka Refugee Camp in Malawi, Jospin Hassan didn’t have access to the education opportunities he sought. So, he decided to create his own. 

    Hassan knew the booming fields of data science and artificial intelligence could bring job opportunities to his community and help solve local challenges. After earning a spot in the 2020-21 cohort of the Certificate Program in Computer and Data Science from MIT Refugee Action Hub (ReACT), Hassan started sharing MIT knowledge and skills with other motivated learners in Dzaleka.

    MIT ReACT is now Emerging Talent, part of the Jameel World Education Lab (J-WEL) at MIT Open Learning. Currently serving its fifth cohort of global learners, Emerging Talent’s year-long certificate program incorporates high-quality computer science and data analysis coursework from MITx, professional skill building, experiential learning, apprenticeship work, and opportunities for networking with MIT’s global community of innovators. Hassan’s cohort honed their leadership skills through interactive online workshops with J-WEL and the 10-week online MIT Innovation Leadership Bootcamp. 

    “My biggest takeaway was networking, collaboration, and learning from each other,” Hassan says.

    Today, Hassan’s organization ADAI Circle offers mentorship and education programs for youth and other job seekers in the Dzaleka Refugee Camp. The curriculum encourages hands-on learning and collaboration.

    Launched in 2020, ADAI Circle aims to foster job creation and reduce poverty in Malawi through technology and innovation. In addition to their classes in data science, AI, software development, and hardware design, their Innovation Hub offers internet access to anyone in need. 

    Doing something different in the community

    Hassan first had the idea for his organization in 2018 when he reached a barrier in his own education journey. There were several programs in the Dzaleka Refugee Camp teaching learners how to code websites and mobile apps, but Hassan felt that they were limited in scope. 

    “We had good devices and internet access,” he says, “but I wanted to learn something new.” 

    Teaming up with co-founder Patrick Byamasu, Hassan and Byamasu set their sights on the longevity of AI and how that might create more jobs for people in their community. “The world is changing every day, and data scientists are in a higher demand today in various companies,” Hassan says. “For this reason, I decided to expand and share the knowledge that I acquired with my fellow refugees and the surrounding villages.”

    ADAI Circle draws inspiration from Hassan’s own experience with MIT Emerging Talent coursework, community, and training opportunities. For example, the MIT Bootcamps model is now standard practice for ADAI Circle’s annual hackathon. Hassan first introduced the hackathon to ADAI Circle students as part of his final experiential learning project of the Emerging Talent certificate program. 

    ADAI Circle’s annual hackathon is now an interactive — and effective — way to select students who will most benefit from its programs. The local schools’ curricula, Hassan says, might not provide enough of an academic challenge. “We can’t teach everyone and accommodate everyone because there are a lot of schools,” Hassan says, “but we offer another place for knowledge.” 

    The hackathon helps students develop data science and robotics skills. Before they start coding, students have to convince ADAI Circle teachers that their designs are viable, answering questions like, “What problem are you solving?” and “How will this help the community?” A community-oriented mindset is just as important to the curriculum.

    In addition to the practical skills Hassan gained from Emerging Talent, he leveraged the program’s network to help his community. Thanks to a social media connection Hassan made with the nongovernmental organization Give Internet after one of Emerging Talent’s virtual events, Give Internet brought internet access to ADAI Circle.

    Bridging the AI gap to unmet communities

    In 2023, ADAI Circle connected with another MIT Open Learning program, Responsible AI for Social Empowerment and Education (RAISE), which led to a pilot test of a project-based AI curriculum for middle school students. The Responsible AI for Computational Action (RAICA) curriculum equipped ADAI Circle students with AI skills for chatbots and natural language processing. 

    “I liked that program because it was based on what we’re teaching at the center,” Hassan says, speaking of his organization’s mission of bridging the AI gap to reach unmet communities.

    The RAICA curriculum was designed by education experts at MIT Scheller Teacher Education Program (STEP Lab) and AI experts from MIT Personal Robots group and MIT App Inventor. ADAI Circle teachers gave detailed feedback about the pilot to the RAICA team. During weekly meetings with Glenda Stump, education research scientist for RAICA and J-WEL, and Angela Daniel, teacher development specialist for RAICA, the teachers discussed their experiences, prepared for upcoming lessons, and translated the learning materials in real time. 

    “We are trying to create a curriculum that’s accessible worldwide and to students who typically have little or no access to technology,” says Mary Cate Gustafson-Quiett, curriculum design manager at STEP Lab and project manager for RAICA. “Working with ADAI and students in a refugee camp challenged us to design in more culturally and technologically inclusive ways.”

    Gustafson-Quiett says the curriculum feedback from ADAI Circle helped inform how RAICA delivers teacher development resources to accommodate learning environments with limited internet access. “They also exposed places where our team’s western ideals, specifically around individualism, crept into activities in the lesson and contrasted with their more communal cultural beliefs,” she says.

    Eager to introduce more MIT-developed AI resources, Hassan also shared MIT RAISE’s Day of AI curricula with ADAI Circle teachers. The new ChatGPT module gave students the chance to level up their chatbot programming skills that they gained from the RAICA module. Some of the advanced students are taking initiative to use ChatGPT API to create their own projects in education.

    “We don’t want to tell them what to do, we want them to come up with their own ideas,” Hassan says.

    Although ADAI Circle faces many challenges, Hassan says his team is addressing them one by one. Last year, they didn’t have electricity in their Innovation Hub, but they solved that. This year, they achieved a stable internet connection that’s one of the fastest in Malawi. Next up, they are hoping to secure more devices for their students, create more jobs, and add additional hubs throughout the community. The work is never done, but Hassan is starting to see the impact that ADAI Circle is making. 

    “For those who want to learn data science, let’s let them learn,” Hassan says. More

  • in

    Putting AI into the hands of people with problems to solve

    As Media Lab students in 2010, Karthik Dinakar SM ’12, PhD ’17 and Birago Jones SM ’12 teamed up for a class project to build a tool that would help content moderation teams at companies like Twitter (now X) and YouTube. The project generated a huge amount of excitement, and the researchers were invited to give a demonstration at a cyberbullying summit at the White House — they just had to get the thing working.

    The day before the White House event, Dinakar spent hours trying to put together a working demo that could identify concerning posts on Twitter. Around 11 p.m., he called Jones to say he was giving up.

    Then Jones decided to look at the data. It turned out Dinakar’s model was flagging the right types of posts, but the posters were using teenage slang terms and other indirect language that Dinakar didn’t pick up on. The problem wasn’t the model; it was the disconnect between Dinakar and the teens he was trying to help.

    “We realized then, right before we got to the White House, that the people building these models should not be folks who are just machine-learning engineers,” Dinakar says. “They should be people who best understand their data.”

    The insight led the researchers to develop point-and-click tools that allow nonexperts to build machine-learning models. Those tools became the basis for Pienso, which today is helping people build large language models for detecting misinformation, human trafficking, weapons sales, and more, without writing any code.

    “These kinds of applications are important to us because our roots are in cyberbullying and understanding how to use AI for things that really help humanity,” says Jones.

    As for the early version of the system shown at the White House, the founders ended up collaborating with students at nearby schools in Cambridge, Massachusetts, to let them train the models.

    “The models those kids trained were so much better and nuanced than anything I could’ve ever come up with,” Dinakar says. “Birago and I had this big ‘Aha!’ moment where we realized empowering domain experts — which is different from democratizing AI — was the best path forward.”

    A project with purpose

    Jones and Dinakar met as graduate students in the Software Agents research group of the MIT Media Lab. Their work on what became Pienso started in Course 6.864 (Natural Language Processing) and continued until they earned their master’s degrees in 2012.

    It turned out 2010 wasn’t the last time the founders were invited to the White House to demo their project. The work generated a lot of enthusiasm, but the founders worked on Pienso part time until 2016, when Dinakar finished his PhD at MIT and deep learning began to explode in popularity.

    “We’re still connected to many people around campus,” Dinakar says. “The exposure we had at MIT, the melding of human and computer interfaces, widened our understanding. Our philosophy at Pienso couldn’t be possible without the vibrancy of MIT’s campus.”

    The founders also credit MIT’s Industrial Liaison Program (ILP) and Startup Accelerator (STEX) for connecting them to early partners.

    One early partner was SkyUK. The company’s customer success team used Pienso to build models to understand their customer’s most common problems. Today those models are helping to process half a million customer calls a day, and the founders say they have saved the company over £7 million pounds to date by shortening the length of calls into the company’s call center.

    “The difference between democratizing AI and empowering people with AI comes down to who understands the data best — you or a doctor or a journalist or someone who works with customers every day?” Jones says. “Those are the people who should be creating the models. That’s how you get insights out of your data.”

    In 2020, just as Covid-19 outbreaks began in the U.S., government officials contacted the founders to use their tool to better understand the emerging disease. Pienso helped experts in virology and infectious disease set up machine-learning models to mine thousands of research articles about coronaviruses. Dinakar says they later learned the work helped the government identify and strengthen critical supply chains for drugs, including the popular antiviral remdesivir.

    “Those compounds were surfaced by a team that did not know deep learning but was able to use our platform,” Dinakar says.

    Building a better AI future

    Because Pienso can run on internal servers and cloud infrastructure, the founders say it offers an alternative for businesses being forced to donate their data by using services offered by other AI companies.

    “The Pienso interface is a series of web apps stitched together,” Dinakar explains. “You can think of it like an Adobe Photoshop for large language models, but in the web. You can point and import data without writing a line of code. You can refine the data, prepare it for deep learning, analyze it, give it structure if it’s not labeled or annotated, and you can walk away with fine-tuned, large language model in a matter of 25 minutes.”

    Earlier this year, Pienso announced a partnership with GraphCore, which provides a faster, more efficient computing platform for machine learning. The founders say the partnership will further lower barriers to leveraging AI by dramatically reducing latency.

    “If you’re building an interactive AI platform, users aren’t going to have a cup of coffee every time they click a button,” Dinakar says. “It needs to be fast and responsive.”

    The founders believe their solution is enabling a future where more effective AI models are developed for specific use cases by the people who are most familiar with the problems they are trying to solve.

    “No one model can do everything,” Dinakar says. “Everyone’s application is different, their needs are different, their data is different. It’s highly unlikely that one model will do everything for you. It’s about bringing a garden of models together and allowing them to collaborate with each other and orchestrating them in a way that makes sense — and the people doing that orchestration should be the people who understand the data best.” More

  • in

    MIT researchers remotely map crops, field by field

    Crop maps help scientists and policymakers track global food supplies and estimate how they might shift with climate change and growing populations. But getting accurate maps of the types of crops that are grown from farm to farm often requires on-the-ground surveys that only a handful of countries have the resources to maintain.

    Now, MIT engineers have developed a method to quickly and accurately label and map crop types without requiring in-person assessments of every single farm. The team’s method uses a combination of Google Street View images, machine learning, and satellite data to automatically determine the crops grown throughout a region, from one fraction of an acre to the next. 

    The researchers used the technique to automatically generate the first nationwide crop map of Thailand — a smallholder country where small, independent farms make up the predominant form of agriculture. The team created a border-to-border map of Thailand’s four major crops — rice, cassava, sugarcane, and maize — and determined which of the four types was grown, at every 10 meters, and without gaps, across the entire country. The resulting map achieved an accuracy of 93 percent, which the researchers say is comparable to on-the-ground mapping efforts in high-income, big-farm countries.

    The team is applying their mapping technique to other countries such as India, where small farms sustain most of the population but the type of crops grown from farm to farm has historically been poorly recorded.

    “It’s a longstanding gap in knowledge about what is grown around the world,” says Sherrie Wang, the d’Arbeloff Career Development Assistant Professor in MIT’s Department of Mechanical Engineering, and the Institute for Data, Systems, and Society (IDSS). “The final goal is to understand agricultural outcomes like yield, and how to farm more sustainably. One of the key preliminary steps is to map what is even being grown — the more granularly you can map, the more questions you can answer.”

    Wang, along with MIT graduate student Jordi Laguarta Soler and Thomas Friedel of the agtech company PEAT GmbH, will present a paper detailing their mapping method later this month at the AAAI Conference on Artificial Intelligence.

    Ground truth

    Smallholder farms are often run by a single family or farmer, who subsist on the crops and livestock that they raise. It’s estimated that smallholder farms support two-thirds of the world’s rural population and produce 80 percent of the world’s food. Keeping tabs on what is grown and where is essential to tracking and forecasting food supplies around the world. But the majority of these small farms are in low to middle-income countries, where few resources are devoted to keeping track of individual farms’ crop types and yields.

    Crop mapping efforts are mainly carried out in high-income regions such as the United States and Europe, where government agricultural agencies oversee crop surveys and send assessors to farms to label crops from field to field. These “ground truth” labels are then fed into machine-learning models that make connections between the ground labels of actual crops and satellite signals of the same fields. They then label and map wider swaths of farmland that assessors don’t cover but that satellites automatically do.

    “What’s lacking in low- and middle-income countries is this ground label that we can associate with satellite signals,” Laguarta Soler says. “Getting these ground truths to train a model in the first place has been limited in most of the world.”

    The team realized that, while many developing countries do not have the resources to maintain crop surveys, they could potentially use another source of ground data: roadside imagery, captured by services such as Google Street View and Mapillary, which send cars throughout a region to take continuous 360-degree images with dashcams and rooftop cameras.

    In recent years, such services have been able to access low- and middle-income countries. While the goal of these services is not specifically to capture images of crops, the MIT team saw that they could search the roadside images to identify crops.

    Cropped image

    In their new study, the researchers worked with Google Street View (GSV) images taken throughout Thailand — a country that the service has recently imaged fairly thoroughly, and which consists predominantly of smallholder farms.

    Starting with over 200,000 GSV images randomly sampled across Thailand, the team filtered out images that depicted buildings, trees, and general vegetation. About 81,000 images were crop-related. They set aside 2,000 of these, which they sent to an agronomist, who determined and labeled each crop type by eye. They then trained a convolutional neural network to automatically generate crop labels for the other 79,000 images, using various training methods, including iNaturalist — a web-based crowdsourced  biodiversity database, and GPT-4V, a “multimodal large language model” that enables a user to input an image and ask the model to identify what the image is depicting. For each of the 81,000 images, the model generated a label of one of four crops that the image was likely depicting — rice, maize, sugarcane, or cassava.

    The researchers then paired each labeled image with the corresponding satellite data taken of the same location throughout a single growing season. These satellite data include measurements across multiple wavelengths, such as a location’s greenness and its reflectivity (which can be a sign of water). 

    “Each type of crop has a certain signature across these different bands, which changes throughout a growing season,” Laguarta Soler notes.

    The team trained a second model to make associations between a location’s satellite data and its corresponding crop label. They then used this model to process satellite data taken of the rest of the country, where crop labels were not generated or available. From the associations that the model learned, it then assigned crop labels across Thailand, generating a country-wide map of crop types, at a resolution of 10 square meters.

    This first-of-its-kind crop map included locations corresponding to the 2,000 GSV images that the researchers originally set aside, that were labeled by arborists. These human-labeled images were used to validate the map’s labels, and when the team looked to see whether the map’s labels matched the expert, “gold standard” labels, it did so 93 percent of the time.

    “In the U.S., we’re also looking at over 90 percent accuracy, whereas with previous work in India, we’ve only seen 75 percent because ground labels are limited,” Wang says. “Now we can create these labels in a cheap and automated way.”

    The researchers are moving to map crops across India, where roadside images via Google Street View and other services have recently become available.

    “There are over 150 million smallholder farmers in India,” Wang says. “India is covered in agriculture, almost wall-to-wall farms, but very small farms, and historically it’s been very difficult to create maps of India because there are very sparse ground labels.”

    The team is working to generate crop maps in India, which could be used to inform policies having to do with assessing and bolstering yields, as global temperatures and populations rise.

    “What would be interesting would be to create these maps over time,” Wang says. “Then you could start to see trends, and we can try to relate those things to anything like changes in climate and policies.” More

  • in

    Six MIT students selected as spring 2024 MIT-Pillar AI Collective Fellows

    The MIT-Pillar AI Collective has announced six fellows for the spring 2024 semester. With support from the program, the graduate students, who are in their final year of a master’s or PhD program, will conduct research in the areas of AI, machine learning, and data science with the aim of commercializing their innovations.

    Launched by MIT’s School of Engineering and Pillar VC in 2022, the MIT-Pillar AI Collective supports faculty, postdocs, and students conducting research on AI, machine learning, and data science. Supported by a gift from Pillar VC and administered by the MIT Deshpande Center for Technological Innovation, the mission of the program is to advance research toward commercialization.

    The spring 2024 MIT-Pillar AI Collective Fellows are:

    Yasmeen AlFaraj

    Yasmeen AlFaraj is a PhD candidate in chemistry whose interest is in the application of data science and machine learning to soft materials design to enable next-generation, sustainable plastics, rubber, and composite materials. More specifically, she is applying machine learning to the design of novel molecular additives to enable the low-cost manufacturing of chemically deconstructable thermosets and composites. AlFaraj’s work has led to the discovery of scalable, translatable new materials that could address thermoset plastic waste. As a Pillar Fellow, she will pursue bringing this technology to market, initially focusing on wind turbine blade manufacturing and conformal coatings. Through the Deshpande Center for Technological Innovation, AlFaraj serves as a lead for a team developing a spinout focused on recyclable versions of existing high-performance thermosets by incorporating small quantities of a degradable co-monomer. In addition, she participated in the National Science Foundation Innovation Corps program and recently graduated from the Clean Tech Open, where she focused on enhancing her business plan, analyzing potential markets, ensuring a complete IP portfolio, and connecting with potential funders. AlFaraj earned a BS in chemistry from University of California at Berkeley.

    Ruben Castro Ornelas

    Ruben Castro Ornelas is a PhD student in mechanical engineering who is passionate about the future of multipurpose robots and designing the hardware to use them with AI control solutions. Combining his expertise in programming, embedded systems, machine design, reinforcement learning, and AI, he designed a dexterous robotic hand capable of carrying out useful everyday tasks without sacrificing size, durability, complexity, or simulatability. Ornelas’s innovative design holds significant commercial potential in domestic, industrial, and health-care applications because it could be adapted to hold everything from kitchenware to delicate objects. As a Pillar Fellow, he will focus on identifying potential commercial markets, determining the optimal approach for business-to-business sales, and identifying critical advisors. Ornelas served as co-director of StartLabs, an undergraduate entrepreneurship club at MIT, where he earned an BS in mechanical engineering.

    Keeley Erhardt

    Keeley Erhardt is a PhD candidate in media arts and sciences whose research interests lie in the transformative potential of AI in network analysis, particularly for entity correlation and hidden link detection within and across domains. She has designed machine learning algorithms to identify and track temporal correlations and hidden signals in large-scale networks, uncovering online influence campaigns originating from multiple countries. She has similarly demonstrated the use of graph neural networks to identify coordinated cryptocurrency accounts by analyzing financial time series data and transaction dynamics. As a Pillar Fellow, Erhardt will pursue the potential commercial applications of her work, such as detecting fraud, propaganda, money laundering, and other covert activity in the finance, energy, and national security sectors. She has had internships at Google, Facebook, and Apple and held software engineering roles at multiple tech unicorns. Erhardt earned an MEng in electrical engineering and computer science and a BS in computer science, both from MIT.

    Vineet Jagadeesan Nair

    Vineet Jagadeesan Nair is a PhD candidate in mechanical engineering whose research focuses on modeling power grids and designing electricity markets to integrate renewables, batteries, and electric vehicles. He is broadly interested in developing computational tools to tackle climate change. As a Pillar Fellow, Nair will explore the application of machine learning and data science to power systems. Specifically, he will experiment with approaches to improve the accuracy of forecasting electricity demand and supply with high spatial-temporal resolution. In collaboration with Project Tapestry @ Google X, he is also working on fusing physics-informed machine learning with conventional numerical methods to increase the speed and accuracy of high-fidelity simulations. Nair’s work could help realize future grids with high penetrations of renewables and other clean, distributed energy resources. Outside academics, Nair is active in entrepreneurship, most recently helping to organize the 2023 MIT Global Startup Workshop in Greece. He earned an MS in computational science and engineering from MIT, an MPhil in energy technologies from Cambridge University as a Gates Scholar, and a BS in mechanical engineering and a BA in economics from University of California at Berkeley.

    Mahdi Ramadan

    Mahdi Ramadan is a PhD candidate in brain and cognitive sciences whose research interests lie at the intersection of cognitive science, computational modeling, and neural technologies. His work uses novel unsupervised methods for learning and generating interpretable representations of neural dynamics, capitalizing on recent advances in AI, specifically contrastive and geometric deep learning techniques capable of uncovering the latent dynamics underlying neural processes with high fidelity. As a Pillar Fellow, he will leverage these methods to gain a better understanding of dynamical models of muscle signals for generative motor control. By supplementing current spinal prosthetics with generative AI motor models that can streamline, speed up, and correct limb muscle activations in real time, as well as potentially using multimodal vision-language models to infer the patients’ high-level intentions, Ramadan aspires to build truly scalable, accessible, and capable commercial neuroprosthetics. Ramadan’s entrepreneurial experience includes being the co-founder of UltraNeuro, a neurotechnology startup, and co-founder of Presizely, a computer vision startup. He earned a BS in neurobiology from University of Washington.

    Rui (Raymond) Zhou

    Rui (Raymond) Zhou is a PhD candidate in mechanical engineering whose research focuses on multimodal AI for engineering design. As a Pillar Fellow, he will advance models that could enable designers to translate information in any modality or combination of modalities into comprehensive 2D and 3D designs, including parametric data, component visuals, assembly graphs, and sketches. These models could also optimize existing human designs to accomplish goals such as improving ergonomics or reducing drag coefficient. Ultimately, Zhou aims to translate his work into a software-as-a-service platform that redefines product design across various sectors, from automotive to consumer electronics. His efforts have the potential to not only accelerate the design process but also reduce costs, opening the door to unprecedented levels of customization, idea generation, and rapid prototyping. Beyond his academic pursuits, Zhou founded UrsaTech, a startup that integrates AI into education and engineering design. He earned a BS in electrical engineering and computer sciences from University of California at Berkeley. More