Depending on chatbots to relay news from actual sources may not be the best choice. The BBC recently investigated the news capabilities of OpenAI's ChatGPT, Google Gemini, Perplexity, and Microsoft Copilot.
The investigation revealed that a staggering 51% of all AI responses concerning news topics contained significant issues of some form. The study involved posing 100 news-related questions to each bot, with instructions to draw on BBC sources whenever possible.
The responses were then evaluated by BBC journalists with expertise in the subject of each article, and several notable inaccuracies surfaced in the chatbots' answers.
Gemini incorrectly stated that the UK's National Health Service (NHS) does not endorse vaping as a smoking cessation method. Both ChatGPT and Copilot mistakenly claimed that certain politicians who had stepped down were still in office.
Perplexity generated even more troubling results: it misinterpreted a BBC article concerning Iran and Israel, attributing opinions to the author and sources that were never expressed.
When it came to its own articles, the BBC reported that 19% of the AI-generated summaries contained factual inaccuracies, such as incorrect figures and data or outright fabricated statements. Moreover, 13% of direct quotes were either altered from the source or entirely absent from the cited article.
The inaccuracies were not evenly spread among the bots, though this may offer little solace, as none of them performed especially well. According to the BBC, "Microsoft's Copilot and Google's Gemini exhibited more significant issues than OpenAI's ChatGPT and Perplexity."
Yet both Perplexity and ChatGPT still had issues with over 40% of their responses. In a blog post, BBC News CEO Deborah Turness sharply criticized the companies involved, arguing that current applications of AI are playing with fire.
Turness remarked, “We live in troubled times. How long will it be before an AI-distorted headline leads to serious real-world consequences?”