More stories

  • Success at the intersection of technology and finance

    Citadel founder and CEO Ken Griffin had some free advice for an at-capacity crowd of MIT students at the Wong Auditorium during a campus visit in April. “If you find yourself in a career where you’re not learning,” he told them, “it’s time to change jobs. In this world, if you’re not learning, you can find yourself irrelevant in the blink of an eye.”

    During a conversation with Bryan Landman ’11, senior quantitative research lead for Citadel’s Global Quantitative Strategies business, Griffin reflected on his career and offered predictions for the impact of technology on the finance sector. Citadel, which he launched in 1990, is now one of the world’s leading investment firms. Griffin also serves as non-executive chair of Citadel Securities, a market maker that is known as a key player in the modernization of markets and market structures.

    “We are excited to hear Ken share his perspective on how technology continues to shape the future of finance, including the emerging trends of quantum computing and AI,” said David Schmittlein, the John C Head III Dean and professor of marketing at MIT Sloan School of Management, who kicked off the program. The presentation was jointly sponsored by MIT Sloan, the MIT Schwarzman College of Computing, the School of Engineering, MIT Career Advising and Professional Development, and Citadel Securities Campus Recruiting.

    The future, in Griffin’s view, “is all about the application of engineering, software, and mathematics to markets. Successful entrepreneurs are those who have the tools to solve the unsolved problems of that moment in time.” He launched Citadel only one year after graduating from college. “History so far has been kind to the vision I had back in the late ’80s,” he said.

    Griffin realized very early in his career “that you could use a personal computer and quantitative finance to price traded securities in a way that was much more advanced than you saw on your typical equity trading desk on Wall Street.” Both businesses, he told the audience, are ultimately driven by research. “That’s where we formulate the ideas, and trading is how we monetize that research.”

    It’s also why Citadel and Citadel Securities employ several hundred software engineers. “We have a huge investment today in using modern technology to power our decision-making and trading,” said Griffin.

    One example of Citadel’s application of technology and science is the firm’s hiring of a meteorological team to expand the weather analytics expertise within its commodities business. While power supply is relatively easy to map and analyze, predicting demand is much more difficult. Citadel’s weather team feeds forecast data obtained from supercomputers to its traders. “Wind and solar are huge commodities,” Griffin explained, noting that the days with highest demand in the power market are cloudy, cold days with no wind. When you can forecast those days better than the market as a whole, that’s where you can identify opportunities, he added.
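    Stripped to a toy scale, the underlying idea is a regression from weather variables to power demand. Below is a minimal sketch assuming a single temperature feature and entirely invented demand figures; a real commodities desk would use far richer features, supercomputer forecasts, and more sophisticated models:

```python
# Hypothetical sketch: a one-variable least-squares fit of daily power
# demand on temperature, illustrating the weather-to-demand relationship
# a trading desk might model. All numbers are invented for illustration.

def fit_line(xs, ys):
    """Least-squares slope and intercept for y ~ slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Colder days -> higher demand (hypothetical megawatt-hour figures).
temps = [20, 35, 50, 65, 80]          # daily low, degrees F
demand = [950, 860, 740, 640, 540]    # observed demand, MWh

slope, intercept = fit_line(temps, demand)

def predict_demand(temp_f):
    """Forecast demand for a day with the given temperature."""
    return slope * temp_f + intercept

# A cold day should forecast higher demand than a mild one.
print(predict_demand(25) > predict_demand(70))  # True
```

    The fitted slope is negative, so a cold snap pushes the forecast up, mirroring the cloudy, cold, windless days Griffin identifies as the highest-demand days in the power market.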

    Pros and cons of machine learning

    Asking about the impact of new technology on their sector, Landman noted that both Citadel and Citadel Securities are already leveraging machine learning. “In the market-making business,” Griffin said, “you see a real application for machine learning because you have so much data to parametrize the models with. But when you get into longer time horizon problems, machine learning starts to break down.”

    Griffin noted that the data obtained through machine learning is most helpful for investments with short time horizons, such as in its quantitative strategies business. “In our fundamental equities business,” he said, “machine learning is not as helpful as you would want because the underlying systems are not stationary.”

    Griffin was emphatic that “there has been a moment in time where being a really good statistician or really understanding machine-learning models was sufficient to make money. That won’t be the case for much longer.” One of the guiding principles at Citadel, he and Landman agreed, was that machine learning and other methodologies should not be used blindly. Each analyst has to cite the underlying economic theory driving their argument on investment decisions. “If you understand the problem in a different way than people who are just using the statistical models,” he said, “you have a real chance for a competitive advantage.”

    ChatGPT and a seismic shift

    Asked if ChatGPT will change history, Griffin predicted that the rise of capabilities in large language models will transform a substantial number of white collar jobs. “With open AI for most routine commercial legal documents, ChatGPT will do a better job writing a lease than a young lawyer. This is the first time we are seeing traditionally white-collar jobs at risk due to technology, and that’s a sea change.”

    Griffin urged MIT students to work with the smartest people they can find, as he did: “The magic of Citadel has been a testament to the idea that by surrounding yourself with bright, ambitious people, you can accomplish something special. I went to great lengths to hire the brightest people I could find and gave them responsibility and trust early in their careers.”

    Even more critical to success is the willingness to advocate for oneself, Griffin said, using Gerald Beeson, Citadel’s chief operating officer, as an example. Beeson, who started as an intern at the firm, “consistently sought more responsibility and had the foresight to train his own successors.” Urging students to take ownership of their careers, Griffin advised: “Make it clear that you’re willing to take on more responsibility, and think about what the roadblocks will be.”

    When microphones were handed to the audience, students inquired what changes Griffin would like to see in the hedge fund industry, how Citadel assesses the risk and reward of potential projects, and whether hedge funds should give back to the open source community. Asked about the role that Citadel — and its CEO — should play in “the wider society,” Griffin spoke enthusiastically of his belief in participatory democracy. “We need better people on both sides of the aisle,” he said. “I encourage all my colleagues to be politically active. It’s unfortunate when firms shut down political dialogue; we actually embrace it.”

    Closing on an optimistic note, Griffin urged the students in the audience to go after success, declaring, “The world is always awash in challenge and its shortcomings, but no matter what anybody says, you live at the greatest moment in the history of the planet. Make the most of it.”

  • Festival of Learning 2023 underscores importance of well-designed learning environments

    During its first in-person gathering since 2020, MIT’s Festival of Learning 2023 explored how the learning sciences can inform how the Institute best supports students. Co-sponsored by MIT Open Learning and the Office of the Vice Chancellor (OVC), this annual event celebrates teaching and learning innovations with MIT instructors, students, and staff.

    Bror Saxberg SM ’85, PhD ’89, founder of LearningForge LLC and former chief learning officer at Kaplan, Inc., was invited as keynote speaker, with opening remarks by MIT Chancellor Melissa Nobles and Vice President for Open Learning Eric Grimson, and discussion moderated by Senior Associate Dean of Open Learning Christopher Capozzola. This year’s festival focused on how creating well-designed learning environments using learning engineering can increase learning success.

    [Video: 2023 Festival of Learning: Highlights]

    Well-designed learning environments are key

    In his keynote speech “Learning Engineering: What We Know, What We Can Do,” Saxberg defined “learning engineering” as the practical application of learning sciences to real-world problems at scale. He said, “High levels can be reached by all learners, given access to well-designed instruction and motivation for enough practice opportunities.”

    Informed by decades of empirical evidence from the field of learning science, Saxberg’s own research, and insights from Kaplan, Inc., Saxberg finds that a hands-on strategy he calls “prepare, practice, perform” delivers better learning outcomes than a traditional “read, write, discuss” approach. Saxberg recommends educators devote at least 60 percent of learning time to hands-on approaches, such as producing, creating, and engaging. Only 20-30 percent of learning time should be spent in the more passive “knowledge acquisition” modes of listening and reading.

    “Here at MIT, a place that relies on data to make informed decisions, learning engineering can provide a framework for us to center in on the learner to identify the challenges associated with learning, and to apply the learning sciences in data-driven ways to improve instructional approaches,” said Nobles. During their opening remarks, Nobles and Grimson both emphasized how learning engineering at MIT is informed by the Institute’s commitment to educating the whole student, which encompasses student well-being and belonging in addition to academic rigor. “What lessons can we take away to change the way we think about education moving forward? This is a chance to iterate,” said Grimson.

    Well-designed learning environments are informed by understanding motivation, considering the connection between long-term and working memory, identifying the range of learners’ prior experience, grounding practice in authentic contexts (i.e., work environments), and using data-driven instructional approaches to iterate and improve.

    [Video: 2023 Festival of Learning: Keynote by Bror Saxberg]

    Understand learner motivation

    Saxberg asserted that before developing course structures and teaching approaches known to encourage learning, educators must first examine learner motivation. Motivation doesn’t require enjoyment of the subject or task to spur engagement. Similar to how a well-designed physical training program can change your muscle cells, if a learner starts, persists, and exerts mental effort in a well-designed learning environment, they can change their neurons — they learn. Saxberg described four main barriers to learner motivation, and solutions for each:

    The learner doesn’t see the value of the lesson. Ways to address this include helping learners find value, leveraging a learner’s expertise in another area to better understand the topic at hand, and making the activity itself enjoyable. “Finding value” could be as simple as explaining the practical applications of the knowledge in their future work in the field, or how the lesson prepares learners for advanced courses.
    The learner doesn’t think they’re capable. To build self-efficacy, educators can point to parallel goals the student has already achieved in another context, or share stories of professionals who have successfully transitioned from one area of expertise to another.
    Something is in the learner’s way, such as not having the time, space, or correct materials. This is an opportunity to demonstrate how a learner can use problem-solving skills to find a solution to their perceived problem. As with self-efficacy, educators can assure learners that they are in control of the situation by sharing stories of others who encountered the same problem and the solutions they devised.
    The learner’s emotional state. This is no small barrier to motivation. If a learner is angry, depressed, scared, or grieving, it will be challenging for them to switch their mindset into learning mode. A wide array of emotions requires a wide array of possible solutions, from structured conversation techniques to recommending professional help.

    Consider the cognitive load

    Saxberg has found that learning occurs when we use working memory to problem-solve, but our working memory can only process three to five verbal or conscious thoughts at a time. Long-term memory stores knowledge that can be accessed non-verbally and non-consciously, which is why experts appear to remember information effortlessly. Until a learner develops that expertise, extraneous information in a lesson will occupy space in their working memory, running the risk of distracting the learner from the desired learning outcome.

    To accommodate learners’ finite cognitive load, Saxberg suggested the solution of reevaluating which material is essential, then simplifying the exercise or removing unnecessary material accordingly. “That notion of, ‘what do we really need students to be able to do?’ helps you focus,” said Saxberg.

    Another solution is to leverage the knowledge, skills, and interests learners already bring to the course — these long-term memories can scaffold the new material. “What do you have in your head already, what do you love, what’s easy to draw from long-term memory? That would be the starting point for challenging new skills. It’s not the ending point because you want to use your new skills to then find out new things,” Saxberg said.

    Finally, consider how your course engages with its syllabus. Do you explain the reasoning behind the course structure? Do you show how the exercises or material will be applied to future courses or the field? Do you share best practices for engaging working memory and learning? By acknowledging and empathizing with the practical challenges that learners face, you can remove a barrier from their cognitive load.

    Ground practice in authentic contexts

    Saxberg stated that few experts read textbooks to learn new information — they discover what they need to know while working in the field, using those relevant facts in context. As such, students will have an easier time remembering facts if they’re practicing in relevant or similar environments to their future work.

    If students can practice classifying problems in real work contexts rather than theoretical practice problems, they can build a framework to classify what’s important. That helps students recognize the type of problem they’re trying to solve before trying to solve the problem itself. With enough hands-on practice and examples of how experts use processes and identify which principles are relevant, learners can holistically learn entire procedures. And that learning continues once learners graduate to the workforce: professionals often meet to exchange knowledge at conferences, charrettes, and other gatherings.

    Enhancing teaching at MIT

    The Festival of Learning furthers the Office of the Vice Chancellor’s mission to advance academic innovation that will foster the growth of MIT students. The festival also aligns with MIT Open Learning’s Residential Education team’s goal of making MIT education more effective and efficient. Throughout the year, the team offers continuous support to MIT faculty and instructors using digital technologies to augment and transform how they teach.

    “We are doubling down on our commitment to continuous growth in how we teach,” said Nobles.

  • MIT community members elected to the National Academy of Engineering for 2023

    Seven MIT researchers are among the 106 new members and 18 international members elected to the National Academy of Engineering (NAE) this week. Fourteen additional MIT alumni, including one member of the MIT Corporation, were also elected as new members.

    One of the highest professional distinctions for engineers, membership in the NAE is given to individuals who have made outstanding contributions to “engineering research, practice, or education, including, where appropriate, significant contributions to the engineering literature” and to “the pioneering of new and developing fields of technology, making major advancements in traditional fields of engineering, or developing/implementing innovative approaches to engineering education.”

    The seven MIT researchers elected this year include:

    Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health in the Department of Electrical Engineering and Computer Science, principal investigator at the Computer Science and Artificial Intelligence Laboratory, and faculty lead for the MIT Abdul Latif Jameel Clinic for Machine Learning in Health, for machine learning models that understand structures in text, molecules, and medical images.

    Markus J. Buehler, the Jerry McAfee (1940) Professor in Engineering from the Department of Civil and Environmental Engineering, for implementing the use of nanomechanics to model and design fracture-resistant bioinspired materials.

    Elfatih A.B. Eltahir SM ’93, ScD ’93, the H.M. King Bhumibol Professor in the Department of Civil and Environmental Engineering, for advancing understanding of how climate and land use impact water availability, environmental and human health, and vector-borne diseases.

    Neil Gershenfeld, director of the Center for Bits and Atoms, for eliminating boundaries between digital and physical worlds, from quantum computing to digital materials to the internet of things.

    Roger D. Kamm SM ’73, PhD ’77, the Cecil and Ida Green Distinguished Professor of Biological and Mechanical Engineering, for contributions to the understanding of mechanics in biology and medicine, and leadership in biomechanics.

    David W. Miller ’82, SM ’85, ScD ’88, the Jerome C. Hunsaker Professor in the Department of Aeronautics and Astronautics, for contributions in control technology for space-based telescope design, and leadership in cross-agency guidance of space technology.

    David Simchi-Levi, professor of civil and environmental engineering, core faculty member in the Institute for Data, Systems, and Society, and principal investigator at the Laboratory for Information and Decision Systems, for contributions using optimization and stochastic modeling to enhance supply chain management and operations.

    Fariborz Maseeh ScD ’90, life member of the MIT Corporation and member of the School of Engineering Dean’s Advisory Council, was also elected as a member for leadership and advances in efficient design, development, and manufacturing of microelectromechanical systems, and for empowering engineering talent through public service.

    Thirteen additional alumni were elected to the National Academy of Engineering this year. They are: Mark George Allen SM ’86, PhD ’89; Shorya Awtar ScD ’04; Inderjit Chopra ScD ’77; David Huang ’85, SM ’89, PhD ’93; Eva Lerner-Lam SM ’78; David F. Merrion SM ’59; Virginia Norwood ’47; Martin Gerard Plys ’80, SM ’81, ScD ’84; Mark Prausnitz PhD ’94; Anil Kumar Sachdev ScD ’77; Christopher Scholz PhD ’67; Melody Ann Swartz PhD ’98; and Elias Towe ’80, SM ’81, PhD ’87.

    “I am delighted that seven members of MIT’s faculty and many members of the wider MIT community were elected to the National Academy of Engineering this year,” says Anantha Chandrakasan, the dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “My warmest congratulations on this recognition of their many contributions to engineering research and education.”

    Including this year’s inductees, 156 members of the National Academy of Engineering are current or retired members of the MIT faculty and staff, or members of the MIT Corporation.

  • Helping companies deploy AI models more responsibly

    Companies today are incorporating artificial intelligence into every corner of their business. The trend is expected to continue until machine-learning models are incorporated into most of the products and services we interact with every day.

    As those models become a bigger part of our lives, ensuring their integrity becomes more important. That’s the mission of Verta, a startup that spun out of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL).

    Verta’s platform helps companies deploy, monitor, and manage machine-learning models safely and at scale. Data scientists and engineers can use Verta’s tools to track different versions of models, audit them for bias, test them before deployment, and monitor their performance in the real world.

    “Everything we do is to enable more products to be built with AI, and to do that safely,” Verta founder and CEO Manasi Vartak SM ’14, PhD ’18 says. “We’re already seeing with ChatGPT how AI can be used to generate data, artifacts — you name it — that look correct but aren’t correct. There needs to be more governance and control in how AI is being used, particularly for enterprises providing AI solutions.”

    Verta is currently working with large companies in health care, finance, and insurance to help them understand and audit their models’ recommendations and predictions. It’s also working with a number of high-growth tech companies looking to speed up deployment of new, AI-enabled solutions while ensuring those solutions are used appropriately.

    Vartak says the company has been able to decrease the time it takes customers to deploy AI models by orders of magnitude while ensuring those models are explainable and fair — an especially important factor for companies in highly regulated industries.

    Health care companies, for example, can use Verta to improve AI-powered patient monitoring and treatment recommendations. Such systems need to be thoroughly vetted for errors and biases before they’re used on patients.

    “Whether it’s bias or fairness or explainability, it goes back to our philosophy on model governance and management,” Vartak says. “We think of it like a preflight checklist: Before an airplane takes off, there’s a set of checks you need to do before you get your airplane off the ground. It’s similar with AI models. You need to make sure you’ve done your bias checks, you need to make sure there’s some level of explainability, you need to make sure your model is reproducible. We help with all of that.”
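    The preflight-checklist analogy can be sketched in a few lines of code. This is a hypothetical illustration, not Verta’s actual API; the check names and the `approve_for_deployment` gate are invented for the example:

```python
# Hypothetical sketch of a model "preflight checklist": a model may only
# be approved for deployment once every required governance check has
# been run and has passed. Check names are invented for illustration.

REQUIRED_CHECKS = {"bias", "explainability", "reproducibility"}

class PreflightError(RuntimeError):
    """Raised when a model fails its preflight governance checks."""

def record_check(results, name, passed):
    """Record the outcome of one governance check."""
    results[name] = passed
    return results

def approve_for_deployment(results):
    """Raise unless every required check was run and passed."""
    missing = REQUIRED_CHECKS - results.keys()
    failed = {name for name, ok in results.items() if not ok}
    if missing or failed:
        raise PreflightError(f"missing={sorted(missing)} failed={sorted(failed)}")
    return True

results = {}
record_check(results, "bias", True)
record_check(results, "explainability", True)
record_check(results, "reproducibility", True)
print(approve_for_deployment(results))  # True
```

    As with an aircraft checklist, the point is that the gate is explicit: a model with a missing or failed check never reaches production by accident.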

    From project to product

    Before coming to MIT, Vartak worked as a data scientist for a social media company. In one project, after spending weeks tuning machine-learning models that curated content to show in people’s feeds, she learned an ex-employee had already done the same thing. Unfortunately, there was no record of what they did or how it affected the models.

    For her PhD at MIT, Vartak decided to build tools to help data scientists develop, test, and iterate on machine-learning models. Working in CSAIL’s Database Group, Vartak recruited a team of graduate students and participants in MIT’s Undergraduate Research Opportunities Program (UROP).

    “Verta would not exist without my work at MIT and MIT’s ecosystem,” Vartak says. “MIT brings together people on the cutting edge of tech and helps us build the next generation of tools.”

    The team worked with data scientists in the CSAIL Alliances program to decide what features to build and iterated based on feedback from those early adopters. Vartak says the resulting project, named ModelDB, was the first open-source model management system.

    Vartak also took several business classes at the MIT Sloan School of Management during her PhD and worked with classmates on startups that recommended clothing and tracked health, spending countless hours in the Martin Trust Center for MIT Entrepreneurship and participating in the center’s delta v summer accelerator.

    “What MIT lets you do is take risks and fail in a safe environment,” Vartak says. “MIT afforded me those forays into entrepreneurship and showed me how to go about building products and finding first customers, so by the time Verta came around I had done it on a smaller scale.”

    ModelDB helped data scientists train and track models, but Vartak quickly saw the stakes were higher once models were deployed at scale. At that point, trying to improve (or accidentally breaking) models can have major implications for companies and society. That insight led Vartak to begin building Verta.

    “At Verta, we help manage models, help run models, and make sure they’re working as expected, which we call model monitoring,” Vartak explains. “All of those pieces have their roots back to MIT and my thesis work. Verta really evolved from my PhD project at MIT.”

    Verta’s platform helps companies deploy models more quickly, ensure they continue working as intended over time, and manage the models for compliance and governance. Data scientists can use Verta to track different versions of models and understand how they were built, answering questions like how data were used and which explainability or bias checks were run. They can also vet them by running them through deployment checklists and security scans.

    “Verta’s platform takes the data science model and adds half a dozen layers to it to transform it into something you can use to power, say, an entire recommendation system on your website,” Vartak says. “That includes performance optimizations, scaling, and cycle time, which is how quickly you can take a model and turn it into a valuable product, as well as governance.”

    Supporting the AI wave

    Vartak says large companies often use thousands of different models that influence nearly every part of their operations.

    “An insurance company, for example, will use models for everything from underwriting to claims, back-office processing, marketing, and sales,” Vartak says. “So, the diversity of models is really high, there’s a large volume of them, and the level of scrutiny and compliance companies need around these models is very high. They need to know things like: Did you use the data you were supposed to use? Who were the people who vetted it? Did you run explainability checks? Did you run bias checks?”

    Vartak says companies that don’t adopt AI will be left behind. The companies that ride AI to success, meanwhile, will need well-defined processes in place to manage their ever-growing list of models.

    “In the next 10 years, every device we interact with is going to have intelligence built in, whether it’s a toaster or your email programs, and it’s going to make your life much, much easier,” Vartak says. “What’s going to enable that intelligence are better models and software, like Verta, that help you integrate AI into all of these applications very quickly.”

  • New program to support translational research in AI, data science, and machine learning

    The MIT School of Engineering and Pillar VC today announced the MIT-Pillar AI Collective, a one-year pilot program funded by a gift from Pillar VC that will provide seed grants for projects in artificial intelligence, machine learning, and data science with the goal of supporting translational research. The program will support graduate students and postdocs through access to funding, mentorship, and customer discovery.

    Administered by the MIT Deshpande Center for Technological Innovation, the MIT-Pillar AI Collective will center on the market discovery process, advancing projects through market research, customer discovery, and prototyping. Graduate students and postdocs will aim to emerge from the program having built minimum viable products, with support from Pillar VC and experienced industry leaders.

    “We are grateful for this support from Pillar VC and to join forces to converge the commercialization of translational research in AI, data science, and machine learning, with an emphasis on identifying and cultivating prospective entrepreneurs,” says Anantha Chandrakasan, dean of the MIT School of Engineering and Vannevar Bush Professor of Electrical Engineering and Computer Science. “Pillar’s focus on mentorship for our graduate students and postdoctoral researchers, and centering the program within the Deshpande Center, will undoubtedly foster big ideas in AI and create an environment for prospective companies to launch and thrive.” 

    Founded by Jamie Goldstein ’89, Pillar VC is committed to growing companies and investing in personal and professional development, coaching, and community.

    “Many of the most promising companies of the future are living at MIT in the form of transformational research in the fields of data science, AI, and machine learning,” says Goldstein. “We’re honored by the chance to help unlock this potential and catalyze a new generation of founders by surrounding students and postdoctoral researchers with the resources and mentorship they need to move from the lab to industry.”

    The program will launch with the 2022-23 academic year. Grants will be open only to MIT faculty and students, with an emphasis on funding for graduate students in their final year, as well as postdocs. Applications must be submitted by MIT employees with principal investigator status. A selection committee composed of three MIT representatives will include Devavrat Shah, faculty director of the Deshpande Center, the Andrew (1956) and Erna Viterbi Professor in the Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society; the chair of the selection committee; and a representative from the MIT Schwarzman College of Computing. The committee will also include representation from Pillar VC. Funding will be provided for up to nine research teams.

    “The Deshpande Center will serve as the perfect home for the new collective, given its focus on moving innovative technologies from the lab to the marketplace in the form of breakthrough products and new companies,” adds Chandrakasan. 

    “The Deshpande Center has a 20-year history of guiding new technologies toward commercialization, where they can have a greater impact,” says Shah. “This new collective will help the center expand its own impact by helping more projects realize their market potential and providing more support to researchers in the fast-growing fields of AI, machine learning, and data science.”

  • Transforming the travel experience for the Hong Kong airport

    MIT Hong Kong Innovation Node welcomed 33 students to its flagship program, the MIT Entrepreneurship and Maker Skills Integrator (MEMSI). In this two-week hybrid bootcamp, designed to develop entrepreneurial prowess through exposure to industry-driven challenges, MIT students joined forces with Hong Kong peers to develop unique proposals for the Airport Authority of Hong Kong.

    Many airports across the world continue to be affected by the broader impact of Covid-19: reduced air travel has prompted airlines to cut capacity, creating a need for new business opportunities to propel economic development. For Hong Kong, expanding into non-aeronautical activities to boost regional consumption is therefore crucial, and is part of the blueprint to transform the city’s airport into an airport city, characterized by capacity expansion, commercial developments, air cargo leadership, an autonomous transport system, connectivity to neighboring cities in mainland China, and evolution into a smart airport guided by sustainable practices. A key focus in enhancing the customer experience is capturing business opportunities at the nexus of digital and physical interactions.

    These challenges “bring ideas and talent together to tackle real-world problems in the areas of digital service creation for the airport and engaging regional customers to experience the new airport city,” says Charles Sodini, the LeBel Professor of Electrical Engineering at MIT and faculty director at the Node. 

    The new travel standard

    Businesses are exploring new digital technologies, both to drive bookings and to facilitate safe travel. Developments such as the Hong Kong airport’s Flight Token, a biometric technology that uses facial recognition to enable contactless check-in and boarding, unlock enormous potential to speed up passengers’ departure journeys. Seamless virtual experiences are not going to disappear.

    “What we may see could be a strong rebounce especially for travelers after the travel ban lifts … an opportunity to make travel easier, flying as simple as riding the bus,” says Chris Au Young, general manager of smart airport and general manager of data analytics at the Airport Authority of Hong Kong. 

    The passenger experience of the future will be “enabled by mobile technology, internet of things, and digital platforms,” he explains, adding that in the aviation community, “international organizations have already stipulated that biometric technology will be the new standard for the future … the next question is how this can be connected across airports.”  

    The technology extends beyond travel, as Au Young illustrates: “If you go to a concert at Asia World Expo, which is the airport’s new arena in the future, you might just simply show your face rather than queue up in a long line waiting to show your tickets.”

    Accelerating the learning curve with industry support

    Working closely with industry mentors involved in the airport city’s development, students dived deep into discussions on the future of adapted travel, interviewed and surveyed travelers, and plowed through a range of airport data to uncover business insights.

    “With the large amount of data provided, my teammates and I worked hard to identify modeling opportunities that were both theoretically feasible and valuable in a business sense,” says Sean Mann, a junior at MIT studying computer science.

    Mann and his team applied geolocation data to inform machine learning predictions of a passenger’s journey once they enter the airside area. Coupled with biometric technology, this allows passengers to receive personalized recommendations with improved accuracy via the airport’s bespoke passenger app, powered by data collected through thousands of iBeacons dispersed across the vicinity. The aim is to use these insights to enhance the user experience by driving meaningful footfall to retail shops, restaurants, and other airport amenities.
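The kind of system Mann describes can be pictured with a toy sketch. Everything below is illustrative and assumed rather than the team’s actual model: a handful of beacon-observed zone transitions stand in for iBeacon geolocation data, and a most-likely-next-zone lookup stands in for the trained machine learning predictor.

```python
from collections import Counter, defaultdict

# Hypothetical beacon-observed moves: (current zone, next zone).
# Zone names and amenities are made up for illustration.
observed_moves = [
    ("security", "duty_free"), ("security", "duty_free"),
    ("security", "food_court"), ("duty_free", "gate_a"),
]

# Count transitions between zones to approximate movement patterns.
transitions = defaultdict(Counter)
for current, nxt in observed_moves:
    transitions[current][nxt] += 1

amenities = {"duty_free": ["perfume shop"], "food_court": ["noodle bar"]}

def recommend(zone):
    """Predict the most likely next zone and surface its amenities."""
    if not transitions[zone]:
        return []
    next_zone = transitions[zone].most_common(1)[0][0]
    return amenities.get(next_zone, [])

print(recommend("security"))  # ['perfume shop']
```

A production system would replace the transition counts with a sequence model trained on the full beacon stream, but the recommend-on-predicted-location shape would be the same.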

    Industry partners inspired his team “with their deep understanding of the aviation industry,” he adds. “In a short period of two weeks, we built a proof-of-concept and a rudimentary business plan — the latter of which was very new to me.”

    Collaborating across time zones, Rumen Dangovski, a PhD candidate in electrical engineering and computer science at MIT, joined MEMSI from his home in Bulgaria. For him, learning “how to continually revisit ideas to discover important problems and meaningful solutions for a large and complex real-world system” was a key takeaway. The iterative process helped his team overcome the obstacle of narrowing down the scope of their proposal, with the help of industry mentors and advisors. 

    “Without the feedback from industry partners, we would not have been able to formulate a concrete solution that is actually helpful to the airport,” says Dangovski.  

    Beyond valuable mentorship, he adds, “there was incredible energy in our team, consisting of diverse talent, grit, discipline and organization. I was positively surprised how MEMSI can form quickly and give continual support to our team. The overall experience was very fun.”

    A sustainable future

    Mrigi Munjal, a PhD candidate studying materials science and engineering at MIT, had just taken a long-haul flight from Boston to Delhi prior to the program, and “was beginning to fully appreciate the scale of carbon emissions from aviation.” For her, “that one journey basically overshadowed all of my conscious pro-sustainability lifestyle changes,” she says.

    Knowing that international flights constitute the largest part of an individual’s carbon footprint, Munjal and her team wanted “to make flying more sustainable with an idea that is economically viable for all of the stakeholders involved.” 

    They proposed a carbon offset API that integrates into an airline’s ticket payment system, empowering individuals to offset their carbon footprint, track their personal carbon history, and pick and monitor green projects. The advocacy extends to a digital display of interactive art featured in physical installations across the airport city, intended to raise community awareness of one’s environmental impact and make carbon offsetting accessible.
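As a rough illustration of how such an offset API might plug into a payment flow, the sketch below estimates a flight’s footprint and records it against a chosen green project. The emissions factor, field names, and example route are assumptions for illustration, not details of the team’s prototype.

```python
from dataclasses import dataclass, field

# Assumed rough long-haul economy emissions factor, kg CO2 per passenger-km.
KG_CO2_PER_PAX_KM = 0.15

@dataclass
class OffsetAccount:
    """Tracks one traveler's offset purchases across bookings."""
    history: list = field(default_factory=list)

    def offset_flight(self, distance_km: float, project: str) -> float:
        """Estimate a flight's footprint and record an offset against a project."""
        kg_co2 = distance_km * KG_CO2_PER_PAX_KM
        self.history.append({"project": project, "kg_co2": kg_co2})
        return kg_co2

    def total_offset_kg(self) -> float:
        """Sum the traveler's personal carbon-offset history."""
        return sum(entry["kg_co2"] for entry in self.history)

account = OffsetAccount()
account.offset_flight(9_600, project="mangrove-restoration")  # e.g., a ~9,600 km flight
print(round(account.total_offset_kg()))  # 1440
```

In the proposed design this logic would sit behind the airline’s payment endpoint, so the offset purchase and the personal carbon history update happen in the same transaction as the ticket.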

    Shaping the travel narrative

    Six teams of students created innovative solutions for the Hong Kong airport, which they presented in hybrid format to a panel of judges on Showcase Day. The diverse ideas included app-based airport retail recommendations supported by iBeacons; a platform that empowers customers to offset their carbon footprint; an app that connects fellow travelers for social and incentive-driven retail experiences; a travel membership exchange platform offering added flexibility to earn and redeem loyalty rewards; an interactive and gamified location-based retail experience using augmented reality; and a digital companion avatar to increase adoption of the airport’s Flight Token and improve the airside passenger experience.

    Among the judges was Julian Lee ’97, former president of the MIT Club of Hong Kong and current executive director of finance at the Airport Authority of Hong Kong, who commended the students for demonstrably having “worked very thoroughly and thinking through the specific challenges,” addressing the real pain points that the airport is experiencing.

    “The ideas were very thoughtful and very unique to us. Some of you defined transit passengers as a sub-segment of the market that works. It only happens at the airport and you’ve been able to leverage this transit time in between,” remarked Lee. 

    Strong solutions include an implementation plan that charts a path to execution and a viable future. Among the proposals, Au Young was impressed by teams “paying a lot of attention to the business model … a very important aspect in all the ideas generated.”

    Addressing the students, Au Young says, “What we love is the way you reinvent the airport business and partnerships, presenting a new way of attracting people to engage more in new services and experiences — not just returning for a flight or just shopping with us, but innovating beyond the airport and using emerging technologies, using location data, using the retailer’s capability and adding some social activities in your solutions.”

    Despite today’s rapidly evolving travel industry, what remains unchanged is a focus on the customer. In the end, “it’s still about the passengers,” added Au Young.


    Deep-learning technique predicts clinical treatment outcomes

    When it comes to treatment strategies for critically ill patients, clinicians want to be able to consider all their options, including the timing of administration, and make the optimal decision for their patients. While experience and study have helped clinicians succeed in this effort, not all patients are the same, and treatment decisions at this crucial time could mean the difference between patient improvement and rapid deterioration. It would therefore help if doctors could take a patient’s known health status and prior treatments and use them to predict that patient’s outcome under different treatment scenarios, in order to pick the best path.

    Now, a deep-learning technique called G-Net, from researchers at MIT and IBM, provides a window into causal counterfactual prediction, affording physicians the opportunity to explore how a patient might fare under different treatment plans. The foundation of G-Net is the g-computation algorithm, a causal inference method that estimates the effect of dynamic exposures in the presence of measured confounding variables — ones that may influence both treatments and outcomes. Unlike previous implementations of the g-computation framework, which have used linear modeling approaches, G-Net uses recurrent neural networks (RNNs), whose node connections allow them to better model temporal sequences with complex and nonlinear dynamics, like those found in physiological and clinical time-series data. In this way, physicians can develop alternative plans based on patient history and test them before making a decision.
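The g-computation idea underlying G-Net can be sketched in a few lines: a learned transition model (an RNN in G-Net; a toy linear map with noise here) predicts the next patient state given the current state and treatment, and simulating forward under a chosen dynamic treatment rule yields counterfactual trajectories whose average estimates the outcome under that strategy. All coefficients and thresholds below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_next(covariate, treatment):
    """Stand-in for the trained RNN: next covariate given state and treatment."""
    return 0.9 * covariate + 0.5 * treatment + rng.normal(scale=0.1)

def treatment_rule(covariate):
    """A dynamic strategy: treat whenever the covariate falls below a threshold."""
    return 1.0 if covariate < 0.0 else 0.0

def simulate(initial, steps=33, n_draws=100):
    """Monte Carlo g-computation: average many simulated counterfactual paths."""
    outcomes = []
    for _ in range(n_draws):
        x = initial
        for _ in range(steps):
            x = predict_next(x, treatment_rule(x))
        outcomes.append(x)
    return float(np.mean(outcomes))

print(simulate(initial=-1.0))
```

Swapping in a different `treatment_rule` and re-running `simulate` is exactly the “what if” comparison the method enables; the linear `predict_next` is what G-Net replaces with RNNs.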

    “Our ultimate goal is to develop a machine learning technique that would allow doctors to explore various ‘What if’ scenarios and treatment options,” says Li-wei Lehman, MIT research scientist in the MIT Institute for Medical Engineering and Science and an MIT-IBM Watson AI Lab project lead. “A lot of work has been done in terms of deep learning for counterfactual prediction but [it’s] been focusing on a point exposure setting,” or a static, time-varying treatment strategy, which doesn’t allow treatments to be adjusted as patient history changes. Her team’s new approach, however, provides flexibility in the treatment plan and the chance to alter treatment over time as patient covariate history and past treatments change. “G-Net is the first deep-learning approach based on g-computation that can predict both the population-level and individual-level treatment effects under dynamic and time-varying treatment strategies.”

    The research, which was recently published in the Proceedings of Machine Learning Research, was co-authored by Rui Li MEng ’20, Stephanie Hu MEng ’21, former MIT postdoc Mingyu Lu MD, graduate student Yuria Utsumi, IBM research staff member Prithwish Chakraborty, IBM Research director of Hybrid Cloud Services Daby Sow, IBM data scientist Piyush Madan, IBM research scientist Mohamed Ghalwash, and IBM research scientist Zach Shahn.

    Tracking disease progression

    To build, validate, and test G-Net’s predictive abilities, the researchers considered the circulatory system of septic patients in the ICU. During critical care, doctors need to make trade-offs and judgment calls, such as ensuring that the organs receive adequate blood supply without overworking the heart. For this, they could give patients intravenous fluids to increase blood pressure; however, too much can cause edema. Alternatively, physicians can administer vasopressors, which contract blood vessels and raise blood pressure.

    To mimic this and demonstrate G-Net’s proof of concept, the team used CVSim, a mechanistic model of the human cardiovascular system governed by 28 input variables characterizing the system’s current state, such as arterial pressure, central venous pressure, total blood volume, and total peripheral resistance, and modified it to simulate various disease processes (e.g., sepsis or blood loss) and effects of interventions (e.g., fluids and vasopressors). The researchers used CVSim to generate observational patient data for training and for “ground truth” comparison against counterfactual prediction. In their G-Net architecture, the researchers ran two RNNs: one to handle and predict continuous variables, which can take on a range of values, like blood pressure, and one for categorical variables, which have discrete values, like the presence or absence of pulmonary edema. The researchers simulated the health trajectories of thousands of “patients” exhibiting symptoms under one treatment regime, call it A, for 66 timesteps, and used them to train and validate their model.
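The split between the two RNNs’ prediction targets can be illustrated by the losses each would train on: a squared-error objective for continuous covariates and a cross-entropy objective for categorical ones. The toy numpy functions and numbers below are illustrative stand-ins for the actual RNN training objectives.

```python
import numpy as np

def mse_loss(pred, target):
    """Squared-error loss for continuous covariates (e.g., blood pressure)."""
    return float(np.mean((pred - target) ** 2))

def bce_loss(prob, label, eps=1e-9):
    """Binary cross-entropy for categorical covariates (e.g., edema present)."""
    prob = np.clip(prob, eps, 1 - eps)
    return float(-np.mean(label * np.log(prob) + (1 - label) * np.log(1 - prob)))

# Made-up predictions vs. ground truth for two simulated timesteps.
bp_pred, bp_true = np.array([118.0, 92.0]), np.array([120.0, 90.0])
edema_prob, edema_label = np.array([0.8, 0.1]), np.array([1.0, 0.0])

total = mse_loss(bp_pred, bp_true) + bce_loss(edema_prob, edema_label)
print(round(total, 3))  # 4.164
```

Training the two heads on their natural losses, rather than forcing a single regression objective, is what lets one architecture predict both kinds of physiological variable.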

    To test G-Net’s prediction capability, the team generated two counterfactual datasets. Each contained roughly 1,000 known patient health trajectories, created from CVSim using the same “patient” condition as the starting point under treatment A. At timestep 33, the treatment changed to plan B or C, depending on the dataset. The team then generated 100 predicted trajectories for each of these 1,000 patients, whose treatment and medical history were known up until timestep 33, when the new treatment was administered. In these cases, the predictions agreed well with the “ground truth” observations for individual patients and for averaged population-level trajectories.

    A cut above the rest

    Since the g-computation framework is flexible, the researchers wanted to examine G-Net’s predictions using different nonlinear models — in this case, long short-term memory (LSTM) models, a type of RNN that can learn from previous data patterns or sequences — against more classical linear models and a multilayer perceptron (MLP), a type of neural network that can make predictions using a nonlinear approach. Following a similar setup as before, the team found that the error between the known and predicted cases was smallest for the LSTM models. Because G-Net can model the temporal patterns of a patient’s ICU history and past treatment, whereas a linear model and an MLP cannot, it was better able to predict the patient’s outcome.

    The team also compared G-Net’s predictions in a static, time-varying treatment setting against two state-of-the-art deep-learning-based counterfactual prediction approaches, a recurrent marginal structural network (rMSN) and a counterfactual recurrent neural network (CRN), as well as a linear model and an MLP. For this, they investigated a model of tumor growth under four scenarios: no treatment, radiation, chemotherapy, and both radiation and chemotherapy. “Imagine a scenario where there’s a patient with cancer, and an example of a static regime would be if you only give a fixed dosage of chemotherapy, radiation, or any kind of drug, and wait until the end of your trajectory,” comments Lu. For these investigations, the researchers generated simulated observational data using tumor volume as the primary influence dictating treatment plans, and demonstrated that G-Net outperformed the other models. One potential reason is that g-computation is known to be more statistically efficient than rMSN and CRN when models are correctly specified.

    While G-Net has done well with simulated data, more needs to be done before it can be applied to real patients. Since neural networks can be thought of as “black boxes” for prediction, the researchers are beginning to investigate the model’s uncertainty to help ensure safety. In contrast to approaches that recommend an “optimal” treatment plan without any clinician involvement, “as a decision support tool, I believe that G-Net would be more interpretable, since the clinicians would input treatment strategies themselves,” says Lehman, and “G-Net will allow them to be able to explore different hypotheses.” Further, the team has moved on to using real data from ICU patients with sepsis, bringing it one step closer to implementation in hospitals.

    “I think it is pretty important and exciting for real-world applications,” says Hu. “It’d be helpful to have some way to predict whether or not a treatment might work or what the effects might be — a quicker iteration process for developing these hypotheses for what to try, before actually trying to implement them in a years-long, potentially very involved and very invasive type of clinical trial.”

    This research was funded by the MIT-IBM Watson AI Lab.


    The downside of machine learning in health care

    While working toward her dissertation in computer science at MIT, Marzyeh Ghassemi wrote several papers on how machine-learning techniques from artificial intelligence could be applied to clinical data in order to predict patient outcomes. “It wasn’t until the end of my PhD work that one of my committee members asked: ‘Did you ever check to see how well your model worked across different groups of people?’”

    That question was eye-opening for Ghassemi, who had previously assessed the performance of models in aggregate, across all patients. Upon a closer look, she saw that models often worked differently — specifically worse — for populations including Black women, a revelation that took her by surprise. “I hadn’t made the connection beforehand that health disparities would translate directly to model disparities,” she says. “And given that I am a visible minority woman-identifying computer scientist at MIT, I am reasonably certain that many others weren’t aware of this either.”

    In a paper published Jan. 14 in the journal Patterns, Ghassemi — who earned her doctorate in 2017 and is now an assistant professor in the Department of Electrical Engineering and Computer Science and the MIT Institute for Medical Engineering and Science (IMES) — and her coauthor, Elaine Okanyene Nsoesie of Boston University, offer a cautionary note about the prospects for AI in medicine. “If used carefully, this technology could improve performance in health care and potentially reduce inequities,” Ghassemi says. “But if we’re not actually careful, technology could worsen care.”

    It all comes down to data, given that the AI tools in question train themselves by processing and analyzing vast quantities of data. But the data they are given are produced by humans, who are fallible and whose judgments may be clouded by the fact that they interact differently with patients depending on their age, gender, and race, without even knowing it.

    Furthermore, there is still great uncertainty about medical conditions themselves. “Doctors trained at the same medical school for 10 years can, and often do, disagree about a patient’s diagnosis,” Ghassemi says. That’s different from the applications where existing machine-learning algorithms excel — like object-recognition tasks — because practically everyone in the world will agree that a dog is, in fact, a dog.

    Machine-learning algorithms have also fared well in mastering games like chess and Go, where both the rules and the “win conditions” are clearly defined. Physicians, however, don’t always concur on the rules for treating patients, and even the win condition of being “healthy” is not widely agreed upon. “Doctors know what it means to be sick,” Ghassemi explains, “and we have the most data for people when they are sickest. But we don’t get much data from people when they are healthy because they’re less likely to see doctors then.”

    Even mechanical devices can contribute to flawed data and disparities in treatment. Pulse oximeters, for example, which have been calibrated predominantly on light-skinned individuals, do not accurately measure blood oxygen levels for people with darker skin. And these deficiencies are most acute when oxygen levels are low — precisely when accurate readings are most urgent. Similarly, women face increased risks during “metal-on-metal” hip replacements, Ghassemi and Nsoesie write, “due in part to anatomic differences that aren’t taken into account in implant design.” Facts like these could be buried within the data fed to computer models whose output will be undermined as a result.

    Coming from computers, the product of machine-learning algorithms offers “the sheen of objectivity,” according to Ghassemi. But that can be deceptive and dangerous, because it’s harder to ferret out the faulty data supplied en masse to a computer than it is to discount the recommendations of a single possibly inept (and maybe even racist) doctor. “The problem is not machine learning itself,” she insists. “It’s people. Human caregivers generate bad data sometimes because they are not perfect.”

    Nevertheless, she still believes that machine learning can offer benefits in health care in terms of more efficient and fairer recommendations and practices. One key to realizing the promise of machine learning in health care is to improve the quality of data, which is no easy task. “Imagine if we could take data from doctors that have the best performance and share that with other doctors that have less training and experience,” Ghassemi says. “We really need to collect this data and audit it.”

    The challenge here is that the collection of data is not incentivized or rewarded, she notes. “It’s not easy to get a grant for that, or ask students to spend time on it. And data providers might say, ‘Why should I give my data out for free when I can sell it to a company for millions?’ But researchers should be able to access data without having to deal with questions like: ‘What paper will I get my name on in exchange for giving you access to data that sits at my institution?’

    “The only way to get better health care is to get better data,” Ghassemi says, “and the only way to get better data is to incentivize its release.”

    It’s not only a question of collecting data. There’s also the matter of who will collect it and vet it. Ghassemi recommends assembling diverse groups of researchers — clinicians, statisticians, medical ethicists, and computer scientists — to first gather diverse patient data and then “focus on developing fair and equitable improvements in health care that can be deployed in not just one advanced medical setting, but in a wide range of medical settings.”

    The objective of the Patterns paper is not to discourage technologists from bringing their expertise in machine learning to the medical world, she says. “They just need to be cognizant of the gaps that appear in treatment and other complexities that ought to be considered before giving their stamp of approval to a particular computer model.”