
Deloitte to Refund Australian Government Over Flawed AI Report

Deloitte will provide a partial refund to the Australian government after a $440,000 report it produced using AI was found to contain fabricated references.

By Isabella Rossi

Isabella Rossi is a legal affairs correspondent for Neurozzio, specializing in technology regulation, labor law, and the impact of artificial intelligence on corporate compliance. She reports on the evolving legal landscape facing modern businesses.


Consulting firm Deloitte will issue a partial refund to the Australian federal government for a $440,000 report that contained significant errors. The firm later admitted to using a generative artificial intelligence tool to help produce the document, which included fabricated references and incorrect citations.

The report was an independent review of the Department of Employment and Workplace Relations' (DEWR) welfare compliance system. Despite the corrections, the department has stated that the core findings and recommendations of the review remain unchanged.

Key Takeaways

  • Deloitte is providing a partial refund to the Australian government for a $440,000 report on its welfare compliance framework.
  • The original report contained numerous errors, including nonexistent references, which were identified as AI "hallucinations."
  • Deloitte confirmed it used an Azure OpenAI GPT-4o large language model in the report's preparation.
  • While the report was corrected, the government department states the substantive findings and recommendations were not altered.
  • The incident has prompted criticism from officials regarding the oversight of work produced by major consulting firms.

Details of the Flawed Government Report

In late 2023, the Department of Employment and Workplace Relations commissioned Deloitte to conduct an independent review of its targeted compliance framework and the associated IT system, which automates penalties for jobseekers who do not meet their obligations.

The contract, valued at $440,000, resulted in a report that identified serious problems within the welfare system. According to the review, the IT system operated on "punitive assumptions" about welfare recipients and lacked clear traceability between its rules and the governing legislation.

However, after its initial publication on July 4, significant inaccuracies were discovered in the document. The report was taken down and later re-uploaded with corrections.

AI Hallucinations and Fabricated Sources

The errors were first brought to public attention by Dr. Christopher Rudge, an academic from the University of Sydney. He described the inaccuracies as classic examples of AI "hallucinations," a phenomenon in which AI models generate plausible but false information when they cannot find the correct data.

"Instead of just substituting one hallucinated fake reference for a new ‘real’ reference, they’ve substituted the fake hallucinated references and in the new version, there’s like five, six or seven or eight in their place," Dr. Rudge stated.

He argued this suggests the original claims made in the report were not based on a single, solid piece of evidence. The Australian Financial Review later reported specific examples of these fabrications.

Examples of Fabricated Information

  • A nonexistent report attributed to professors at the University of Sydney.
  • A fabricated reference to a study from Lund University in Sweden.
  • An incorrect summary of a significant court decision related to the 'robodebt' scandal, Deanna Amato v Commonwealth.

These findings raised serious questions about the research and verification processes used in the creation of a high-cost government report.

Deloitte's Response and Admission of AI Use

In the revised version of the report, Deloitte acknowledged the errors and made corrections. The firm also added a disclosure in the appendix about its methodology.

The updated text reads: "included the use of a generative artificial intelligence (AI) large language model (Azure OpenAI GPT-4o) based tool chain licensed by DEWR and hosted on DEWR's Azure tenancy."

Despite this admission, Deloitte did not explicitly state that the use of AI was the direct cause of the errors. In a statement within the amended document, the firm maintained that the core integrity of its work was sound.

"The updates made in no way impact or affect the substantive content, findings and recommendations in the report," Deloitte stated. A spokesperson confirmed that "the matter has been resolved directly with the client."

Understanding AI Hallucinations

AI hallucinations occur when a large language model (LLM) generates text that is nonsensical, factually incorrect, or disconnected from the provided source material. This happens because the AI's primary goal is to predict the next most likely word in a sequence to form coherent sentences, not necessarily to ensure factual accuracy. It may invent sources, studies, or legal cases that sound plausible but do not exist.
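A toy sketch makes the mechanism concrete. The probabilities below are invented for illustration: a greedy decoder simply returns the most probable continuation, so a fluent, plausible-sounding citation can outrank an honest admission that no source exists.

```python
# Illustration only: invented probabilities for continuations of the
# prompt "According to a study by ...". A language model scores text by
# plausibility, not truth, so a fabricated source can win.
next_token_probs = {
    "researchers at Lund University (2019)": 0.41,  # fluent but fabricated
    "the department's own audit data": 0.33,        # real but scored lower
    "[no verifiable source found]": 0.26,           # honest answer ranks last
}

def greedy_decode(probs: dict[str, float]) -> str:
    """Pick the highest-probability continuation: fluency, not accuracy."""
    return max(probs, key=probs.get)

print(greedy_decode(next_token_probs))
# -> "researchers at Lund University (2019)": coherent prose, invented citation
```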

Government Reaction and Calls for Oversight

The incident has sparked criticism from government officials. The Department of Employment and Workplace Relations confirmed Deloitte would repay the final installment of its contract fee, with the exact amount to be made public once the transaction is complete.

Labor senator Deborah O’Neill, a member of a senate inquiry into consulting firms, was highly critical of the firm's performance.

"Deloitte has a human intelligence problem. This would be laughable if it wasn’t so lamentable. A partial refund looks like a partial apology for substandard work," she said.

Senator O'Neill stressed the need for greater accountability when engaging external consultants. "Anyone looking to contract these firms should be asking exactly who is doing the work they are paying for, and having that expertise and no AI use verified," she added. Her comments reflect a growing concern about the transparency and reliability of work outsourced to large consulting agencies, especially with the increasing integration of AI tools.

While Dr. Rudge was critical of the report's methodology, he hesitated to dismiss its conclusions entirely, noting that they align with other widespread evidence about the welfare system's flaws. The DEWR also stands by the review's recommendations, indicating that while the process was flawed, the outcome may still inform future policy.