
Managing AI Agents: Risk, Compliance, and Responsible Deployment (Part II)

In the first part, we explored how the OpenAI Agents SDK simplifies the development and deployment of intelligent agents, and what key legal and data protection considerations arise when implementing them. Now we take a deeper dive: we will look at how new research findings can help us not only monitor the operation of AI agents but also regulate them at the level of their innermost processes.

Software agents of this kind could be ideal for organizations seeking to automate repetitive tasks such as contract review, risk analysis, or certain customer service operations. However, ensuring legal compliance, proper design, and successful deployment involves a range of considerations that can only be addressed professionally if we also understand how the underlying large language models (LLMs) work.

According to the latest research, these models do not “just” predict the next word; they also employ complex, sometimes almost human-like internal logic. This discovery is not only of scientific interest: it also has serious practical significance. If we can understand and regulate these internal mechanisms, we can integrate LLMs into corporate processes far more efficiently and lawfully.

To truly take advantage of these new opportunities, however, it is worth staying up to date with what we currently know about how LLMs operate. Large language models (such as OpenAI’s GPT systems or Anthropic’s Claude models) traditionally work by learning linguistic patterns from vast amounts of textual data. The apparent “magic” is the result of many years of research and development: the models are increasingly able to continue texts, interpret context, and, if necessary, generate code. Looking beneath the surface, however, it turns out that these networks do not simply operate as a flat statistical lookup: the latest results – for example, circuit tracing (mapping internal connections) and dictionary learning (discovering internal “features”) – show that the models’ internal mechanisms are complex and sometimes show signs of a multi-step “thinking” process.

But what does all this have to do with legal use and enterprise adoption? One possible answer is the Agents SDK. This toolkit gives developers more control over what agents are allowed to do – for example, by setting up guardrails around their inputs and outputs.

Imagine a corporate legal department reviewing hundreds of contracts a month, where privacy or regulatory concerns about the AI being used can surface at various points. If an AI agent is given too much freedom in the process, it can easily misinterpret data or give false risk signals, with serious consequences. The Agents SDK, for instance, offers the ability to define exactly what data the model can use, when human review is required, and what types of content should be automatically blocked.
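As a rough illustration, the sketch below follows the input guardrail pattern described in the Agents SDK’s public documentation: a lightweight checker agent screens each incoming request, and a tripwire stops the run before the main contract-review agent ever processes it. The agent names, instructions, and the ScopeCheck model are invented for this example.

```python
# A minimal sketch of an input guardrail, following the pattern in the
# OpenAI Agents SDK documentation. Names and instructions are illustrative.
from pydantic import BaseModel
from agents import (
    Agent,
    GuardrailFunctionOutput,
    InputGuardrailTripwireTriggered,
    RunContextWrapper,
    Runner,
    input_guardrail,
)


class ScopeCheck(BaseModel):
    out_of_scope: bool  # e.g. the request is not a contract-review task
    reasoning: str


# A small, cheap agent whose only job is to classify the incoming request.
scope_checker = Agent(
    name="Scope checker",
    instructions="Decide whether the request is a contract-review task. "
                 "Flag anything else as out of scope.",
    output_type=ScopeCheck,
)


@input_guardrail
async def scope_guardrail(
    ctx: RunContextWrapper[None], agent: Agent, input
) -> GuardrailFunctionOutput:
    result = await Runner.run(scope_checker, input, context=ctx.context)
    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.out_of_scope,
    )


contract_reviewer = Agent(
    name="Contract reviewer",
    instructions="Review the supplied contract clause and list legal risks.",
    input_guardrails=[scope_guardrail],
)


async def review(clause: str) -> str:
    try:
        result = await Runner.run(contract_reviewer, clause)
        return result.final_output
    except InputGuardrailTripwireTriggered:
        # The run is stopped before the main agent processes the request.
        return "Request rejected: outside the contract-review scope."
```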

The research results already mentioned, such as the circuit tracing method, have also shed light on how a model’s internal processes are structured while it generates an output. When analyzing a legal document, an LLM may run several seemingly unrelated subprocesses – it may activate a hidden mathematical approximation module, a multilingual reference search, or even an internal feature that handles security-related information.

This may seem like a technical detail at first, but from a legal perspective it is extremely important. If we know that certain modules of the model tend to fill the text with incorrect data (so-called hallucinations), then we can – in theory – intervene in the process and eliminate these sidetracks. With a toolkit like the Agents SDK, for example, we could set filters that only allow the agent to make its final proposal if the internal modules have not indicated uncertainty or contradiction. It should be noted that OpenAI’s current documentation does not indicate that this level of control is available, but it would be a logical next direction for development.
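Since the SDK does not expose the model’s internal features, the closest documented mechanism today is an output guardrail, which inspects the finished draft rather than the internal computation. The sketch below is such a proxy: it withholds a draft whenever simple surface markers of uncertainty appear. The marker list and agent setup are illustrative assumptions, not a recommended production check.

```python
# A sketch of an output guardrail used as a proxy for "uncertainty" checks.
# It only inspects the finished draft, not the model's internal features.
from agents import (
    Agent,
    GuardrailFunctionOutput,
    OutputGuardrailTripwireTriggered,
    RunContextWrapper,
    Runner,
    output_guardrail,
)

# Illustrative surface markers of hedging or uncertainty in a draft.
UNCERTAINTY_MARKERS = ("i am not sure", "cannot verify", "unclear", "may or may not")


@output_guardrail
async def uncertainty_guardrail(
    ctx: RunContextWrapper, agent: Agent, output
) -> GuardrailFunctionOutput:
    draft = str(output).lower()
    flagged = any(marker in draft for marker in UNCERTAINTY_MARKERS)
    return GuardrailFunctionOutput(
        output_info={"flagged": flagged},
        tripwire_triggered=flagged,  # block drafts that signal uncertainty
    )


risk_analyst = Agent(
    name="Risk analyst",
    instructions="Summarise the legal risks in the supplied clause.",
    output_guardrails=[uncertainty_guardrail],
)


async def analyse(clause: str) -> str:
    try:
        return (await Runner.run(risk_analyst, clause)).final_output
    except OutputGuardrailTripwireTriggered:
        return "Draft withheld: the output signalled uncertainty; route it to a lawyer."
```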

Studies using dictionary learning methods show that the internal workings of language models are much more systematic than previously thought. This technique has shown that the recurring activation patterns of artificial neurons – which we call “features” – are often linked to well-interpretable human concepts. The essence of dictionary learning is that it builds a dictionary from these recurring patterns: just as words are made up of letters and sentences are made up of words, each feature arises from the cooperation of several neurons, and the model’s current “thinking state” is described by the set of active features. This matters because we are no longer dealing with the (seemingly) chaotic operation of billions of neurons, but with more transparent, recurring units, many of which can be linked to specific topics – for example, contractual clauses or data protection rules in a legal environment.
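As a purely didactic illustration of the underlying idea – not Anthropic’s actual interpretability pipeline – the toy sketch below uses classical sparse dictionary learning (here via scikit-learn) on synthetic “activation” vectors, decomposing them into a small number of recurring components, the analogue of features.

```python
# Toy illustration of the dictionary-learning idea: decompose "activation"
# vectors into a small set of recurring, sparsely used components.
# Didactic sketch only; the data here is random, not real model activations.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Pretend these are hidden-layer activations collected from many prompts:
# 200 samples, each a 64-dimensional activation vector.
activations = rng.normal(size=(200, 64))

# Learn 16 dictionary atoms ("features"); each activation is approximated
# as a sparse combination of a few atoms.
dict_learner = DictionaryLearning(
    n_components=16,
    transform_algorithm="lasso_lars",
    transform_alpha=0.1,
    random_state=0,
)
codes = dict_learner.fit_transform(activations)  # shape: (200, 16)
atoms = dict_learner.components_                 # shape: (16, 64)

# For one prompt, only a handful of features are active - that sparse
# pattern is what interpretability work then tries to label with human
# concepts (e.g. "contract clause", "personal data").
active = np.flatnonzero(codes[0])
print(f"Sample 0 uses {active.size} of 16 features: {active.tolist()}")
```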

This also means that with the right prompts and tools, the agent (in theory) could highlight legally relevant details, cross-reference them with internal databases, or automatically block sensitive data when appropriate.

From a compliance perspective, it is particularly important that LLMs operate in line with the requirements of GDPR and other data protection regulations. We already have some control over this in the Agents SDK. For example, we can prohibit the model from processing certain types of personal data at all. This can reduce the risk of AI violating data protection or other industry compliance regulations.
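One simple, deterministic way to approximate this is to redact obvious personal data before the text ever reaches the agent. The sketch below uses illustrative regular expressions only; a production system would rely on a dedicated PII detection service and a documented legal basis for any processing that remains.

```python
# A minimal pre-processing step that redacts obvious personal data before the
# text is passed to an agent. The regex patterns are illustrative only;
# production systems should use a dedicated PII detector.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\+?\d[\d\s/().-]{7,}\d"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def redact_pii(text: str) -> str:
    """Replace matches with a typed placeholder so the agent never sees them."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


clause = "Contact Ms. Kovács at anna.kovacs@example.com or +36 30 123 4567."
print(redact_pii(clause))
# -> Contact Ms. Kovács at [EMAIL REDACTED] or [PHONE REDACTED].
```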

During implementation, it is also worth considering the advantages of multi-agent architectures. Several existing frameworks allow multiple specialized agents within a system to work together on a complex task. For example, there could be a separate agent for reviewing data protection contracts, another for managing financial risks, and a third for researching legal precedents. If an agent gets stuck – for example, because the issue is beyond its scope – the task can be automatically handed over to another, more competent agent. This modular approach increases the reliability of the system, and legal teams can more clearly track which agent is responsible for which step. It is also an important step towards the interpretability of decisions.
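A minimal sketch of this routing pattern, using the handoff mechanism described in the Agents SDK documentation, might look like the following; the agent names, instructions, and the use of the run result’s last_agent attribute for attribution are assumptions for the example and should be checked against the current SDK version.

```python
# Sketch of a triage-and-handoff setup with the Agents SDK.
# Agent names and instructions are invented for the example.
from agents import Agent, Runner

privacy_agent = Agent(
    name="Data protection reviewer",
    instructions="Review contract clauses for GDPR and data-protection issues.",
)

finance_agent = Agent(
    name="Financial risk analyst",
    instructions="Assess financial and liability risks in contract clauses.",
)

precedent_agent = Agent(
    name="Precedent researcher",
    instructions="Find and summarise relevant case law for the question.",
)

# The triage agent answers nothing itself; it only routes the task to the
# specialist whose scope fits, which keeps responsibilities traceable.
triage_agent = Agent(
    name="Triage",
    instructions="Decide which specialist should handle the request and hand it off.",
    handoffs=[privacy_agent, finance_agent, precedent_agent],
)


async def dispatch(question: str) -> str:
    result = await Runner.run(triage_agent, question)
    # Recording which specialist produced the final answer helps legal teams
    # attribute each step to a responsible agent.
    return f"{result.last_agent.name}: {result.final_output}"
```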

Even with the best automated oversight, human control is still needed, as the outputs of LLMs can contain unknown and risky elements that automated filters cannot always reliably catch. These can range from harmless templates to unauthorized data processing mechanisms that pose legal risks. It is therefore essential to keep humans in the loop at critical points, such as when finalizing contracts or preparing legal proposals.
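A hypothetical human-review gate can be implemented around any agent framework: drafts for critical steps are parked in a queue and only released once a named reviewer approves them. The sketch below is one possible shape for such a gate, not a feature of the Agents SDK.

```python
# A hypothetical human-review gate: AI drafts for critical steps are queued
# for approval instead of being finalised automatically.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ReviewItem:
    task: str
    draft: str
    status: ReviewStatus = ReviewStatus.PENDING
    reviewer: str | None = None
    decided_at: datetime | None = None


@dataclass
class ReviewQueue:
    items: list[ReviewItem] = field(default_factory=list)

    def submit(self, task: str, draft: str) -> ReviewItem:
        # Park the AI-generated draft until a human decides.
        item = ReviewItem(task=task, draft=draft)
        self.items.append(item)
        return item

    def decide(self, item: ReviewItem, reviewer: str, approved: bool) -> None:
        # Record who approved or rejected the draft, and when.
        item.reviewer = reviewer
        item.status = ReviewStatus.APPROVED if approved else ReviewStatus.REJECTED
        item.decided_at = datetime.now(timezone.utc)


# Usage: instead of sending the agent's contract draft directly onward,
# it waits in the queue until a lawyer signs it off.
queue = ReviewQueue()
item = queue.submit("Final contract proposal", "AI-generated draft text ...")
queue.decide(item, reviewer="lawyer@example.com", approved=True)
```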

Legal responsibility ultimately lies with the organization operating the system. That is why it is necessary to log all agent operations, record which agent made which changes, and allow them to be undone if necessary.
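A minimal, framework-independent way to meet this requirement is an append-only audit log that records which agent did what, to which document, and when. The sketch below is a hypothetical example; the field names and identifiers are illustrative.

```python
# A minimal append-only audit log for agent actions. Each entry records which
# agent did what, to which document, and when, so changes can be traced and,
# if necessary, reverted by replaying or reversing the logged steps.
import json
from datetime import datetime, timezone
from pathlib import Path

AUDIT_LOG = Path("agent_audit.jsonl")


def log_action(agent_name: str, action: str, document_id: str, detail: str) -> None:
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent_name,
        "action": action,  # e.g. "clause_rewritten", "risk_flagged"
        "document_id": document_id,
        "detail": detail,
    }
    with AUDIT_LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")


def history(document_id: str) -> list[dict]:
    """Return every logged action for a document, oldest first."""
    if not AUDIT_LOG.exists():
        return []
    entries = (json.loads(line) for line in AUDIT_LOG.read_text("utf-8").splitlines())
    return [e for e in entries if e["document_id"] == document_id]


# Illustrative document identifier and action.
log_action("Contract reviewer", "risk_flagged", "CTR-2024-017", "Unlimited liability clause")
print(history("CTR-2024-017"))
```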

If these aspects can be consistently integrated, companies using agents can gain a real competitive advantage. By automating monotonous, repetitive tasks, decision-making becomes faster, while employees can focus on higher-value, strategic activities. However, this cannot come at the expense of security or legal compliance. In theory, methods such as circuit tracing and dictionary learning could give developers and legal departments new opportunities to gain precise insight into what is happening under the model-level “hood.” This way, instead of blindly entrusting key processes to a black-box AI, the organization can proactively control which internal functions are active, what data the system handles, and where human intervention is required.

The OpenAI Agents SDK and similar solutions make the implementation of AI agents more accessible to most companies. However, implementation must consider both technological and legal aspects. Recent research has shown that language models are much more complex than previously thought – but this complexity can also offer new opportunities for regulated use, industry-specific fine-tuning, and automation. Organizations that consciously build on the latest tools can create AI systems that are faster, more cost-effective, and compliant with even the most stringent legal requirements.


István ÜVEGES, PhD is a Computational Linguist researcher and developer at MONTANA Knowledge Management Ltd. and a researcher at the HUN-REN Centre for Social Sciences. His main interests include the social impacts of Artificial Intelligence (Machine Learning), the nature of Legal Language (legalese), the Plain Language Movement, and sentiment- and emotion analysis.