When a 2022 Hallucination Burn Came Back: Comparing GPT-4.1 and GPT-5 After Gemini 2.0 Flash’s 0.7% Claim
https://wiki-velo.win/index.php/What_if_everything_you_knew_about_OpenAI_o3-mini_accuracy_change,_Vectara_benchmark_versions,_and_document-length_impact_was_wrong%3F
How a biased vendor claim forced a product team to retest summarization models

In 2022 our team implemented GPT-3.5 for summarization tasks inside a legal-document intake pipeline