We’re already trusting AI with too much – I just hope AI hallucinations disappear before it’s too late

I was talking to an old friend about AI – as one often does whenever engaging in casual conversation with anyone these days – and he was describing how he’d been using AI to help him analyze insurance documents. Basically, he was feeding almost a dozen documents into the system to summarize, or a pair of lengthy policies to compare for changes. This was work that could take him hours, but in the hands of AI (perhaps ChatGPT or Gemini, though he didn’t specify), just minutes.
What fascinated me is that my friend has no illusions about generative AI’s accuracy. He fully expected one out of 10 facts to be inaccurate or perhaps hallucinated, and he made it clear that his very human hands are still part of the quality-control process. For now.
The next thing he said surprised me – not because it isn’t true, but because he acknowledged it. Eventually, AI won’t hallucinate; it won’t make a mistake. That’s the trajectory, and we should prepare for it.
The future is perfect
I agreed with him because this has long been my thinking. The speed of development essentially guarantees it.
While I grew up with Moore’s Law, which posits a doubling of the number of transistors on a microchip roughly every two years, AI’s Law is, roughly put, a doubling of intelligence every three to six months. That pace is why everyone is so convinced we’ll achieve Artificial General Intelligence (AGI, or human-like intelligence) sooner than originally thought.
I believe that, too, but I want to circle back to hallucinations, because even as consumers and non-techies like my friend embrace AI for everyday work, hallucinations remain a very real part of working with AI large language models (LLMs).
In a recent anecdotal test of multiple AI chatbots, I was chagrined to find that most of them could not accurately recount my work history, even though it is spelled out in exquisite detail on LinkedIn and Wikipedia.
These were minor errors and not of any real importance, because who cares about my background except me? Still, ChatGPT’s o3-mini model, which uses deeper reasoning and can therefore take longer to formulate an answer, said I worked at TechRepublic. That’s close to “TechRadar,” but no cigar.
DeepSeek, the Chinese AI chatbot wunderkind, had me working at Mashable years after I left. It also confused my PCMag history.
Google Gemini smartly kept the details scant, but it got all of them right. ChatGPT’s 4o model took a similar pared-down approach and achieved 100% accuracy.
Claude AI lost the thread of my timeline and still had me working at Mashable. It warns that its data is out of date, but I did not think it was eight years out of date.
I ran some polls on social media about the level of hallucination most people expect to see on today’s AI platforms. On Threads, 25% of respondents think AI hallucinates 25% of the time. On X, 40% think it’s 30% of the time.
However, I also received comments reminding me that accuracy depends on the quality of the prompt and the topic area. Information that doesn’t have much of an online footprint is sure to lead to hallucinations, one person warned me.
Still, research shows that models are not only getting larger, they’re getting smarter, too. A year ago, one study found ChatGPT hallucinating 40% of the time in some tests.
According to the Hughes Hallucination Evaluation Model (HHEM) leaderboard, some of the leading models’ hallucination rates are down to under 2%. With older models, like Meta’s Llama 3.2, you head back into double-digit hallucination rates.
Cleaning up the mess
What this shows us, though, is that these models are quickly heading in the direction my friend predicts, and that at some point in the not-too-distant future they will be large enough, and trained on fresh enough real-time data, to push the hallucination rate well below 1%.
My concern is that in the meantime, people without technical expertise or even an understanding of how to compose a useful prompt are relying on large language models for real work.
Hallucination-driven errors are likely creeping into all sectors of home life and industry and infecting our systems with misinformation. They may not be big errors, but they will accumulate. I don’t have a solution for this, but it’s worth thinking about and maybe even worrying about a little bit.
Perhaps future LLMs will also include error sweeping, where you send them out onto the web and through your files to cull all the AI-hallucination-generated mistakes.
After all, why should we have to clean up AI’s messes?