Chatbots such as ChatGPT can be a useful source of information, but how accurate is AI-generated content?
The arrival of DeepSeek, the new AI platform from China, offers another example of why we need to be careful when using the output from AI chatbots. When you ask chatbots questions or use them to generate content, their output is not always accurate: it is often incomplete, sometimes wrong and occasionally made up entirely!
This point is illustrated quite clearly when you ask DeepSeek about a subject sensitive to the Chinese Government, such as the events around Tiananmen Square. In such circumstances it gives a generic reply: “Sorry, that’s beyond my scope. Let’s talk about something else.” A reply like this makes it clear that the AI is filtering its responses. So how can you believe any response from DeepSeek, or indeed any other AI, when you are trying to get factual information?
Well, the answer is, you can’t!
You really should not be using AI as your definitive or only research tool. If you do, you need to be knowledgeable in the subject you are asking about, and you should use alternative sources to fact-check the response, not only for accuracy but to make sure it hasn’t missed key information. One useful habit when searching with chatbots is to ask them to provide references for the information used in their responses. Be warned, however: they have a reputation for making up some of those references!
To further illustrate the point, the BBC has assessed the ability of leading AI chatbots to summarise the news correctly. In the study, the BBC asked ChatGPT, Copilot, Gemini and Perplexity to summarise 100 news articles, with experts in the subject of each article rating the accuracy of the answers. 51% of the answers were found to have significant issues, and 19% introduced factual errors. The study found that Microsoft’s Copilot and Google’s Gemini produced summaries with more significant issues than the other chatbots.