
AI Plus Humans or AI Without Humans: Where Does the Deloitte Model Fail? – Part II.

The Deloitte incident (see Part I.) shows that the real risk of generative AI in professional environments is not the technology itself, but the institutional conditions under which it is deployed. AI does not eliminate responsibility—it merely reshapes where responsibility lies. Without governance, training, and enforceable accountability, organizations will continue to mistake automation for expertise.

The dilemmas described above are often driven by very mundane organizational incentives. Consulting firms and corporations are under strong pressure to deliver complex analyses as quickly and as cost-effectively as possible. Generative AI presents a strong temptation to outsource at least part of the tedious work of literature review, data collection and drafting to a machine. The line between innovation and laziness, however, often becomes visible only in hindsight: the very same tool can amount to smart automation or to intellectual corner-cutting. The Deloitte cases suggest that when tight deadlines and cost-cutting override traditional professional standards, we may quickly end up in a situation where, under the label of “innovation,” we are in fact outsourcing our basic duty of care to a probabilistic model.

In corporate practice, this tendency is reinforced by the phenomenon often referred to as shadow AI. International surveys indicate that a significant share of employees use some form of generative AI in their daily work without formal authorization and without IT or compliance being aware of it. Many draft internal emails or prepare analyses in personal ChatGPT or other language model accounts, and some even rely on public AI services to support financial decision-making, all below the organization’s radar. It is therefore no surprise that some forecasts suggest that by 2030 around 40 percent of companies will have experienced a security or compliance incident directly linked to shadow AI. The real issue is not that employees break the rules, but that institutions often refuse to provide sanctioned tools; shadow AI is thus less “rogue behavior” and more a symptom of organizational design failures.

Beyond concerns related to organizational culture, the shadow-AI phenomenon creates overlapping data protection, cybersecurity and reputational risks, since sensitive information may be fed into external systems and decision-making may come to rely on content whose origin and reliability are completely opaque. This is also where the notion of “responsibility gaps,” well established in the AI ethics literature, becomes relevant: the Deloitte case illustrates that these gaps do not arise from the technology itself but from organizational choices that fail to assign, monitor and enforce responsibility. Moreover, consultancy–government relationships often externalize the risks of AI-assisted analysis onto public clients while internalizing the revenue, creating an asymmetric incentive structure that favors experimentation with AI even in high-stakes contexts.

The counterpoint is provided by controlled, organization-wide AI solutions. One example is Retrieval-Augmented Generation (RAG), where the model does not answer “from memory,” but relies on documents from the organization’s own vetted knowledge base or from approved public sources. In practice, this means that when the system receives a question, it first retrieves relevant documents from internal repositories or authorized external databases, and only then combines their content with the generative capabilities of the language model. There is now strong evidence in multiple fields that such systems can significantly reduce the frequency of hallucinations, especially when the question concerns facts that are well covered in the underlying knowledge base.
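To make the mechanism concrete, the following minimal Python sketch mirrors this retrieve-then-generate loop. Everything in it is a hypothetical illustration of the pattern, not any specific vendor’s API: the toy knowledge base, the word-overlap retriever and the `generate_answer` placeholder all stand in for production components such as a vector index and an approved LLM endpoint.

```python
# Minimal sketch of the retrieve-then-generate (RAG) loop described above.
# All names (KNOWLEDGE_BASE, generate_answer, ...) are illustrative placeholders.

# Vetted internal knowledge base; in production this would be a document
# repository behind a proper search index, not an in-memory dict.
KNOWLEDGE_BASE = {
    "policy-001": "AI-generated text must be reviewed by a qualified expert.",
    "policy-002": "Client data may not be entered into unapproved external tools.",
}


def retrieve(question: str, top_k: int = 2) -> list[tuple[str, str]]:
    """Rank documents by naive word overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        ((len(q_words & set(text.lower().split())), doc_id, text)
         for doc_id, text in KNOWLEDGE_BASE.items()),
        reverse=True,
    )
    return [(doc_id, text) for score, doc_id, text in scored[:top_k] if score > 0]


def generate_answer(question: str, context: str) -> str:
    # Placeholder for the call to the organization's approved LLM; the key
    # point is that the prompt is grounded in retrieved, citable documents
    # rather than the model's parametric "memory".
    return f"Answer to {question!r}, grounded in:\n{context}"


def answer(question: str) -> str:
    docs = retrieve(question)
    if not docs:
        # Refusing is safer than letting the model answer unsupported.
        return "No supporting documents found; escalate to a human expert."
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in docs)
    return generate_answer(question, context)


print(answer("Who must review AI-generated text?"))
```

The design choice worth noticing is the refusal branch: when retrieval finds nothing relevant, the system declines to answer instead of falling back on the model’s unsupported generation.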

It is important to stay realistic here as well. Neither RAG nor any other technique will make fabricated information disappear entirely from LLM outputs. Poor retrieval quality (that is, failure to find adequate information), incomplete repositories or weak ranking still produce wrong answers, and sometimes more persuasive ones, because they are backed by sources that appear, at least on the surface, to fit. Foundational research in machine learning shows that no retrieval method can fully remove the statistical uncertainty inherent to large language models; external knowledge can reduce errors but cannot eliminate them. From this angle, the Deloitte case is a reminder that the mere fact that a model can produce a “source” and a “footnote” tells us nothing about whether that source exists or whether the model has interpreted it correctly.
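One practical consequence is that AI-produced citations should be checked mechanically before anything else. The sketch below, again with hypothetical names and a made-up citation format, shows the simplest possible control: flagging any answer that cites a document absent from the vetted repository. It catches fabricated identifiers only; whether an existing source actually supports the claim still requires human judgement.

```python
import re

# Hypothetical vetted repository: only these document IDs really exist.
VETTED_DOCS = {"policy-001", "policy-002"}

# Assumes citations are written inline as [doc-id].
CITATION_PATTERN = re.compile(r"\[([\w-]+)\]")


def unverifiable_citations(answer_text: str) -> list[str]:
    """Return cited document IDs that are absent from the vetted repository."""
    return [doc_id for doc_id in CITATION_PATTERN.findall(answer_text)
            if doc_id not in VETTED_DOCS]


draft = "Experts must sign off on AI output [policy-001], per [ruling-999]."
missing = unverifiable_citations(draft)
if missing:
    print(f"Flag for human review: unverifiable citations {missing}")
```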

In practice, the middle ground is neither to ban AI from the organization nor to let it loose and rely on “common sense.” Experience shows that if there are no official, secure and controlled AI tools, employees will find workarounds anyway. A much more promising strategy is for the organization to build its own audited AI infrastructure, with clear rules and training programmes attached. In concrete terms, this means spelling out which workflows and risk categories allow AI use, which types of data must never be fed into the system, and where human review is mandatory. For low-risk tasks such as internal communication or formatting, spot checks may be enough. For medium-risk use, for example when preparing background analysis, experts may need to review the model’s suggested sources and arguments in detail. For high-risk documents with legal or regulatory implications, full, documented human validation is required, and in some cases an independent second opinion as well.
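Such a tiered policy can be made explicit and auditable in configuration rather than left to individual judgement. The sketch below is one hypothetical way to encode it; the tier names, the workflow classification and the review rules are invented for illustration and would have to be defined and maintained by the organization itself.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"        # e.g. internal communication, formatting
    MEDIUM = "medium"  # e.g. background analysis
    HIGH = "high"      # e.g. documents with legal or regulatory implications


# Hypothetical policy table mirroring the tiers described above.
REVIEW_POLICY = {
    Risk.LOW: "spot checks",
    Risk.MEDIUM: "expert review of suggested sources and arguments",
    Risk.HIGH: "full documented human validation, plus an independent "
               "second opinion where required",
}

# Hypothetical mapping from workflows to risk tiers.
WORKFLOW_RISK = {
    "internal_email": Risk.LOW,
    "background_analysis": Risk.MEDIUM,
    "regulatory_report": Risk.HIGH,
}


def required_review(workflow: str) -> str:
    # Unknown workflows default to the strictest tier: fail safe, not open.
    tier = WORKFLOW_RISK.get(workflow, Risk.HIGH)
    return REVIEW_POLICY[tier]


print(required_review("regulatory_report"))  # full documented human validation
print(required_review("unclassified_task"))  # defaults to the HIGH tier
```

The fallback to the HIGH tier reflects the point above: where the risk category of a task is unclear, full human validation should be the rule, not the exception.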

Another, equally important layer of the Deloitte case is the question of accountability. Whose work is a report that is partly generated by AI? Does it belong to the model, to the consultant, to the engagement partner, to quality assurance, or ultimately to the government body that approves the document? Current legal and ethical frameworks are largely clear in treating AI as a tool rather than an independent subject of rights and duties. In practical terms, this means that if a report contains an invented judicial quotation, it is not “the model” that has done this, but the firm and the team that chose to include the model’s output in the final document without proper checks. In principle, responsibility runs up through the company’s governance structure all the way to the board, especially where the use of AI has already become strategically significant. It is no coincidence that an increasing number of guidelines and academic proposals stress that AI expertise should be explicitly represented on boards and supervisory bodies, and that these bodies should actively oversee the introduction and operation of such systems.

From the perspective of the public sector, accountability has even more tangible consequences. In the Canadian case, local politicians and union leaders pointed out that if an ordinary civil servant had produced a document with this level of error, it would likely have led to disciplinary action, whereas for a large international consulting firm the consequence was only a partial refund. This double standard gradually undermines trust in public institutions and reinforces the narrative that governments outsource their work to expensive consultancies that in practice “just run the text through an AI.” If citizens feel that the decisions affecting them are not based on genuine consultation and expertise but on AI-generated “research,” the democratic legitimacy of those decisions is weakened.

This brings us back to the starting question: what is the role of generative AI in research and analytical work today? It is helpful to stick to the view that AI is, and should remain, a productivity tool. It belongs in the same category as electronic legal databases, spreadsheets or search engines. When these tools appeared, no one seriously believed that they would remove the responsibility of lawyers, economists or researchers for the quality of their own work. We should not treat AI any differently. Its role is not to replace humans, but to increase human capacity on certain routine tasks and free up time for genuine thinking, analysis and judgement. The Deloitte case shows what happens when this balance is lost and AI stops being a tool and becomes a convenient way to escape the burden of responsibility.

The question is no longer whether AI will enter professional workflows—it already has. The real question is whether organizations can design governance structures that harness AI’s advantages while preventing its predictable failures from becoming institutional failures. The Deloitte case teaches us that AI does not diminish professional responsibility; it magnifies the consequences of neglecting it.

For the coming years, the key question is therefore not whether AI “may” be used in professional content creation, but under what institutional and organizational conditions it can be embedded responsibly into everyday practice. This requires organizations to recognize that hallucinations are not a temporary glitch but a consequence, unavoidable in principle, of how probabilistic models generate text. Consequently, they need to build governance systems that are explicitly designed so that the inevitable errors of AI are caught by human and technical control mechanisms. Only then can we get closer to the vision, often invoked but rarely taken seriously, in which AI and humans do not stand in opposition but work together to achieve a higher level of professional performance that is both transparent and accountable.


István ÜVEGES, PhD, is a computational linguist, researcher and developer at MONTANA Knowledge Management Ltd. and a researcher at the HUN-REN Centre for Social Sciences. His main interests include the social impacts of Artificial Intelligence (Machine Learning), the nature of legal language (legalese), the Plain Language Movement, and sentiment and emotion analysis.