OpenAI’s o3, o4-mini reasoning AI models hallucinate more

OpenAI found o3 to hallucinate while responding to 33% of questions on PersonQA

OpenAI's recently released o3 and o4-mini artificial intelligence (AI) models have been found to hallucinate more than the company's older models.

Hallucinations remain one of the hardest problems to solve in AI, affecting even today's best-performing systems.

OpenAI's internal tests indicated that o3 and o4-mini hallucinate more than its previous reasoning models, including o1, o1-mini, and o3-mini, as well as its "non-reasoning" models.

OpenAI stated, “Specifically, o3 tends to make more claims overall, leading to more accurate claims as well as more inaccurate/hallucinated claims,” as reported by TechCrunch.

The ChatGPT maker found that o3 hallucinated while responding to 33% of questions on PersonQA, while o4-mini hallucinated 48% of the time.

The o3 rate is roughly double that of the company's older reasoning models, such as o1 and o3-mini.

One effective strategy for boosting the accuracy of models is to give them web search capabilities.

OpenAI's GPT-4o with web search achieves 90% accuracy on SimpleQA, and search could potentially lower the hallucination rates of reasoning models as well.

Over the past year, the AI industry has shifted toward reasoning models, which improve performance with less data and computing; this shift, however, may be increasing hallucinations, posing a significant challenge.

While hallucinations can help models be creative, they also make them less suitable for businesses that need high accuracy.
