
AI Accuracy Study
A study conducted by the European Broadcasting Union (EBU) and the BBC, published on October 22, 2025, evaluated the accuracy of more than 2,700 responses from AI assistants, including OpenAI’s ChatGPT, Google’s Gemini, Microsoft’s Copilot, and Perplexity.
Study Methodology
Twenty-two public service media outlets across 18 countries, working in 14 languages, participated in the study. A consistent set of questions was posed to the AI assistants between late May and early June 2025.
Findings
- 45% of responses contained at least one significant issue.
- Sourcing was the most frequent problem, affecting 31% of responses, with issues such as unsupported information and incorrect or unverifiable attribution.
- Lack of accuracy affected 20% of responses, while the absence of appropriate context impacted 14% of responses.
- Gemini performed worst, with significant issues in 76% of its responses, driven largely by sourcing problems.
- All AI assistants examined made basic factual errors. Notable examples include Perplexity incorrectly stating the legality of surrogacy in Czechia and ChatGPT misidentifying the current pope.
Response from Technology Firms
OpenAI, Google, Microsoft, and Perplexity did not immediately comment on the study’s findings.
Recommendations
In the report’s foreword, Jean Philip De Tender, the EBU’s deputy director general, and Pete Archer, the BBC’s programme director for generative AI, emphasized the need for technology firms to improve the accuracy of their AI products. They urged these firms to prioritize fixing errors and to regularly publish accuracy results broken down by language and market.