Like a Kid Guessing Answers on an Exam, AI Is Prone to Making Stuff Up

"Like students facing a hard exam question, large language models sometimes guess when its uncertain, producing plausible yet incorrect statements instead of admitting uncertainty,” according to OpenAI. The researchers acknowledged that LLMs will always produce hallucinations due to fundamental mathematical constraints that cannot be solved through better engineering. They observed that “Such ‘hallucinations’ persist even in state-of-the-art systems and undermine trust.”

My reaction? Well, duh. Anyone who has worked with AI knows it hallucinates. My image prompt for this story was “young student holding a pencil daydreaming while taking a test in an old classroom from the 50s.” Midjourney got the pencil right but dressed my 8-year-old like he was off to prom after guessing his way through his math test.

And it is not just pictures. Over the weekend, I spent an entire day trying to get ChatGPT to build a pro forma. I thought I could take a shortcut. I could not. The results were so inconsistent that I gave up and built the spreadsheet from scratch.

“Unlike human intelligence, it lacks the humility to acknowledge uncertainty,” said Neil Shah, VP for research at Counterpoint Technologies. “When unsure, it doesn’t defer to deeper research or human oversight. Instead, it often presents estimates as facts.”

So, can you train AI to stop guessing and admit when it does not know the answer? Based on this research, the answer is no. You still need an adult to grade the test and explain what went wrong.

That said, there is a path forward. While public models like ChatGPT can be unreliable, private GPTs built on your organization’s own data and guardrails can reduce hallucinations and stay grounded in reality. They do not guess outside what they know. If you want to explore how that works in a secure and meaningful way, let’s talk. Contact Michelle Fink at Hudson Technology.
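For the technically curious, here is a minimal sketch of that grounding idea, assuming the OpenAI Python client: the model is instructed to answer only from retrieved documents and to say “I don’t know” otherwise. The knowledge base, the toy keyword retriever, the model choice, and the prompt wording are all illustrative stand-ins, not Hudson Technology’s actual implementation (a real system would use embeddings or a search index for retrieval).

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Stand-in for a real document store; in practice this would be
# your organization's vetted knowledge base behind a retriever.
KNOWLEDGE_BASE = [
    "Hudson Technology offers managed IT and security services.",
    "Support hours are Monday through Friday, 8am to 6pm Eastern.",
]

def retrieve_documents(question: str) -> list[str]:
    """Toy keyword retrieval: return passages that share a word with
    the question. A real system would use embeddings or search."""
    words = set(question.lower().split())
    matches = [doc for doc in KNOWLEDGE_BASE
               if words & set(doc.lower().split())]
    return matches or ["(no matching documents)"]

def grounded_answer(question: str) -> str:
    context = "\n\n".join(retrieve_documents(question))
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": ("Answer ONLY from the context below. If the context "
                         "does not contain the answer, reply exactly "
                         "\"I don't know.\" Do not guess.\n\n"
                         f"Context:\n{context}")},
            {"role": "user", "content": question},
        ],
        temperature=0,  # discourage creative guessing
    )
    return response.choices[0].message.content

print(grounded_answer("What are Hudson Technology's support hours?"))
print(grounded_answer("What is the company's stock price?"))  # expect "I don't know."
```

The design choice is the point: by pinning the model to vetted context and giving it an explicit escape hatch, you trade a confident wrong answer for an honest “I don’t know.”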

#MSP #MSSP #IT #AI
