Plain Language and Automation—Possibilities, Drawbacks, and the Reality of Large Language Models (Part I.)

2024.03.07.


Few words come up more often in public discourse about legal language than complexity and its near-synonym, difficulty of understanding. The question of how (and whether it is even necessary) to make legal texts more comprehensible to the average person has been debated for decades. It is less clear, however, what role modern technologies can play here, and to what extent they can be used for such purposes.

This is especially pronounced in the case of the Large Language Models (LLMs) we know today, which for many users represent the only direct link to “applied” Artificial Intelligence (AI). In other words, they are the only way the average user can put AI to their own use directly. That is why in this post we will discuss, on the one hand, why clear communication, for example from public authorities, is important from a legal point of view. On the other hand, we examine how today’s state-of-the-art technologies can be used to rewrite texts in a comprehensible way, and what they generally know about the guidelines of the Plain Language Movement. For the latter, we will be assisted by OpenAI’s ChatGPT chatbot and Google’s Gemini web interface.

Seen from a bird’s eye view, the benefits of plain language can be approached from two perspectives. The first is a practical reading of the issue, for instance the financial benefits of communicating in plain language. The other is the moral dimension of the problem, i.e. what benefits it brings and what principles it promotes in the public sphere.

From a practical point of view, perhaps the most important thing is that easy-to-understand communication not only helps to reduce the number of complaints but also strengthens trust in companies and offices. It also has a positive effect on the general perception of the institution or company that uses it. When people do not understand published written materials, complaints follow almost naturally. The time spent dealing with them can easily be saved if information and communication materials are written so that nothing hinders their understanding. In the case of state actors, it should also be mentioned that the easiest way to promote law-abiding behavior is to avoid misunderstandings and problems arising from a lack of information. Easy-to-understand language is an effective tool for this as well.

The moral aspects of the issue are inextricably linked, for example, to the right to a fair trial, a requirement of the rule of law. But, in addition, the expectation of transparency and the promotion of access to justice are aspects on which communication in plain language can have a significant impact.

The above-mentioned requirement of the right to a fair trial is found in Article 6(1) of the European Convention on Human Rights and reads as follows:

“In the determination of his civil rights and obligations or of any criminal charge against him, everyone is entitled to a fair and public hearing within a reasonable time by an independent and impartial tribunal established by law.”

This can also be linked to the linguistic dimension of individual procedures. If we treat comprehensibility as a basic requirement of the right to a fair procedure, it is easy to see that no procedure can be fair if its language is not easily understandable to the (lay) target audience. In fact, plain language is one of the means of reducing vulnerability in this case, and perhaps the most effective one. From a legal point of view, a procedure cannot be fair if it is not also fair from a linguistic and communication point of view. If problems, deficiencies, and distortions arise in the interactions between lay people and legal professionals, they have a significant impact on the fact-finding process and, ultimately, on the outcome of the procedure, the judgment.

Indirectly linked to this is the problem of access to justice. In its original use, the term was intended to draw attention to the unequal distribution of social resources (e.g. time, money, education), the absence of which can be an obstacle to invoking the law. This original meaning can be extended: where texts are not formulated in a (commonly) understandable way, the ability of individuals to enforce the law may also be seriously impaired.

In addition, clear official texts not only help to strengthen legal certainty but also increase confidence in administrative processes and democratic advocacy. It is no coincidence, for example, that the use of language in the communications of public offices has been regulated by national legislation in the US since the enactment of the Plain Writing Act in 2010, and in Sweden since the Language Act of 2009. This is also closely linked to the fact that in these countries (perhaps more than in any other), citizens regard it as part of their democratic rights that documents from public bodies are easy for them to understand.

Perhaps one of the most important findings in relation to transparency is that, according to the jurisprudential interpretation, part of the transparent public process is good communication. This includes consistent, understandable, and open communication about the functioning of the public authority and its organizational structure to inform society and stakeholders. Transparency is understood here as the principle that includes the client side of democratic rule of law requirements (e.g. due process), and therefore its link with accessibility is inescapable.

The above is a brief overview of the arguments that can be made in favor of using plain language as widely as possible. There are at least two main ways in which this widespread use can be facilitated. One is education, training, and the shaping of attitudes, all of which strengthen awareness of the importance of the topic. There is now a long tradition of this in many EU countries, with specific organizations dedicated to the issue, but there are still some “laggards” where the issue is barely mentioned. The second is to provide tools that can help enforce the principles of comprehensibility in the drafting of official or individual legal documents. While we can perhaps agree that the first option is the more effective in the long term, it may be worthwhile to turn to automation in the short and medium term.

English, as the de facto lingua franca of the Euro-Atlantic area, is in a particularly privileged position in this respect. Just think of the tools that are easily accessible online, such as Grammarly, readable.com, or the Hemingway Editor. These tools vary in sophistication but are equally easy to use, which makes checking the wording of documents for clarity straightforward.
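To give a feel for the kind of check such tools can perform, here is a minimal sketch of one classic English readability measure, the Flesch Reading Ease score. It is not the method of any of the tools named above; the function names and the vowel-group syllable counter are assumptions made for illustration, and the syllable heuristic is deliberately crude.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per group of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier English text."""
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    plain = "The cat sat on the mat. It was warm."
    legal = (
        "Notwithstanding the aforementioned provisions, the undersigned "
        "party hereby irrevocably relinquishes all entitlements."
    )
    print(flesch_reading_ease(plain) > flesch_reading_ease(legal))
```

The formula was calibrated for English, so a sketch like this says nothing reliable about texts in other languages.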

“Small languages” are in a much less fortunate position. To take an example close to home, there is no freely available software tool for Hungarian that can perform a similar task. There are, of course, several ways to overcome this. One of them is to train custom machine-learning models. Such a model could work, for example, by trying to classify sentences or texts according to whether they are intelligible or not.
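As a rough illustration of what "classifying sentences by intelligibility" could mean in code, the sketch below trains a nearest-centroid classifier on two surface features: sentence length in words and mean word length. Everything here is an assumption made for the example: the features, the labels "plain" and "complex", and the four invented English training sentences. A usable model would need far more data and richer, language-specific features.

```python
import re
from statistics import mean


def features(sentence):
    """Two surface features: word count and mean word length."""
    words = re.findall(r"\w+", sentence)
    if not words:
        raise ValueError("sentence contains no words")
    return (float(len(words)), mean(len(w) for w in words))


def train_centroids(examples):
    """Average the feature vectors of each label into one centroid."""
    by_label = {}
    for sentence, label in examples:
        by_label.setdefault(label, []).append(features(sentence))
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def classify(sentence, centroids):
    """Return the label whose centroid is closest (squared Euclidean)."""
    x = features(sentence)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Invented toy data, for illustration only.
TOY_EXAMPLES = [
    ("Please send the form by Friday.", "plain"),
    ("You can pay online or by post.", "plain"),
    ("The applicant shall, notwithstanding the foregoing provisions, "
     "furnish documentation substantiating eligibility.", "complex"),
    ("Failure to comply with the aforementioned obligations shall result "
     "in the imposition of administrative penalties.", "complex"),
]

if __name__ == "__main__":
    centroids = train_centroids(TOY_EXAMPLES)
    print(classify("We will call you next week.", centroids))
```

The features are left unscaled to keep the sketch short. With real data one would normalize them and evaluate the model on held-out sentences.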

Here we are faced with two important problems. The first is the issue of training data: we obviously need a large quantity of it, and it must be of good quality. In most cases, this is a bottleneck. The second is that even if such a model is produced, it will only be able to draw attention to the problematic points of a text. Of course, this alone can make drafting work faster and more efficient, but it does not provide any suggestions as to what changes a given text might need. Producing such suggestions requires a separate set of handwritten rules, which is a very complex linguistic and IT task.
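The difference between flagging a problem and suggesting a fix can be sketched with a tiny handwritten rule set. The four English phrase-to-replacement rules and the function name below are illustrative assumptions; a real rule set would be far larger, and for a morphologically rich language such as Hungarian simple pattern matching of this kind would not be enough.

```python
import re

# Hypothetical handwritten rules: wordy phrase -> plainer alternative.
RULES = {
    r"\bin the event that\b": "if",
    r"\bprior to\b": "before",
    r"\bpursuant to\b": "under",
    r"\bin order to\b": "to",
}


def suggest(text):
    """Return (position, matched phrase, suggestion) for each rule hit."""
    findings = []
    for pattern, replacement in RULES.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            findings.append((match.start(), match.group(0), replacement))
    # Sort by position so suggestions follow the order of the text.
    return sorted(findings)


if __name__ == "__main__":
    sample = "In the event that you move, notify us prior to the deadline."
    for position, phrase, replacement in suggest(sample):
        print(f"{position}: consider replacing '{phrase}' with '{replacement}'")
```

Even this toy version shows why the task scales badly: every suggestion has to be anticipated and written down by hand.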

LLMs, as we know, have been trained on very large corpora. They therefore hold a great deal of knowledge about language that can be useful in solving such a complex problem. So the question rightly arises: what can today’s state-of-the-art tools contribute to the cause of plain language? In the second part of this article, we will look for the answer to this question by means of some practical experiments.

István ÜVEGES is a researcher in Computational Linguistics at MONTANA Knowledge Management Ltd. and a researcher at the HUN-REN Centre for Social Sciences, Political and Legal Text Mining and Artificial Intelligence Laboratory (poltextLAB). His main interests include practical applications of Automation, Artificial Intelligence (Machine Learning), Legal Language (legalese) studies and the Plain Language Movement.

General, Tech & AI
