
Don’t trust ChatGPT Search and definitely verify anything it tells you

кирилл поляшенко/Getty Images

In October, OpenAI integrated ChatGPT Search into ChatGPT, promising an experience in which users could browse the web and access the latest news from its news partners and from sites that have not blocked OpenAI’s web crawler. A new review by Columbia’s Tow Center for Digital Journalism suggests that the feature may not be as reliable as it sounds.

The Tow Center ran a test to determine how well publisher content is represented in ChatGPT. It selected 10 articles from each of 20 randomly chosen publishers: some that partner with OpenAI, some that are involved in lawsuits against OpenAI, and some unaffiliated publishers that either allow or block the web crawler.


The researchers then extracted 200 quotes that, when run through search engines like Google or Bing, pointed back to the source in the top three results. Finally, they asked ChatGPT to identify the quotes’ sources. The goal was to see whether the AI accurately represents publications and gives them credit for their work. If ChatGPT Search worked as advertised, it should be able to attribute the sources just as well as a traditional search engine.

The results varied in accuracy: some answers were entirely correct, some entirely incorrect, and some partially correct. Yet nearly all answers were presented confidently, without the AI acknowledging that it couldn’t produce an answer, even for quotes from publishers that had blocked its web crawler. In only seven of the outputs did ChatGPT use words or phrases signaling that it was uncertain.

“Beyond misleading users, ChatGPT’s false confidence could risk causing reputational damage to publishers,” the article stated. 


That statement was backed up by an example in which ChatGPT inaccurately attributed a quote from the Orlando Sentinel to a Time article. More than a third of ChatGPT’s responses with incorrect citations were of that nature. In addition to costing publishers traffic, misattribution can harm a publication’s brand and its audience’s trust.

Other problematic findings from the experiment include ChatGPT citing a website that had plagiarized an article from The New York Times, which has blocked the crawler, rather than the Times itself. It also cited a syndicated version of a piece from MIT Tech Review instead of the original article, even though MIT Tech Review allows crawling.


Ultimately, this research points to a larger question: whether partnering with these AI companies offers publishers more control, and whether new AI search engines truly benefit publishers or hurt their businesses in the long run. The data behind the study is shared publicly on GitHub.

Consumers should always verify the source by clicking on the footnote the AI provides or by doing a quick search on an established search engine, such as Google. These extra steps will help catch hallucinations before they are taken as fact.
