István ÜVEGES: Ethical Machines – the Problem of Right Decision and Morally Right Decision in Machine Learning

As AI-powered systems permeate deeper into our lives, the specter of ethics looms larger than ever before. The technology’s ability to make decisions has opened Pandora’s box of moral questions. Can machines truly discern right from wrong? From trolley problems to cultural nuances, the journey towards ethical AI is riddled with challenges.

The moral implications of AI decisions have only recently been investigated in a meaningful way, and the results in some cases are contradictory.

Although the concept of robots walking among humans and interacting with them has been around since ancient mythology, it is only today that artificial intelligence capable of intervening in the functioning of the social order has become a reality. As we rely more and more on technology in our everyday lives, the question arises: will future AI-based solutions be able to make not only correct but also ethical decisions? In this post, we therefore briefly review the questions that arise in the context of ethical AI, the main problems currently known, and the possible answers to them.

By 2023, AI-based solutions are present in more and more areas of our lives, some obvious, others more hidden. AI algorithms can be found in completely innocuous environments, such as spam filters in email clients, content recommendation systems of media providers, or chatbots in companies’ customer service departments. While it is generally true that they help us make decisions that make our lives easier or more convenient, none of these has a significant impact on our future. However, not all the applications we already know are so ‘harmless’. For example, social media platforms employ AI algorithms that can trap us in opinion bubbles by strongly prioritizing content based on our interests, and weaponized AI can be used to support activities such as election manipulation.

Based on current trends, the capabilities of machine learning-based systems are expected to continue to evolve, and their prevalence is expected to increase soon. Although the emergence of Artificial General Intelligence (AGI) is still decades away according to the most optimistic estimates, even existing solutions can pose social, financial, and security risks in irresponsible hands. In the (probably not too distant) future, AI will have to make decisions in situations where the outcome will have a significant impact on our daily lives. This raises the question of whether it is possible to encode information/knowledge into these systems that in some way maps the values associated with morally correct or ethical choices from a human perspective.

Current machine learning systems are based on (training) data, from which an algorithm learns patterns and then makes ‘decisions’ in new situations. In its simplest form, the creation of ethical AI can be imagined along roughly the same lines: given a sufficiently large amount of training data, we can assume that the algorithm can learn the patterns that characterize human morality (without, of course, the self-reflection and empathy that remain uniquely human capabilities). To better understand the issue, it is worth starting from an existing practical example.
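The train-then-decide pattern described above can be sketched in miniature. The toy "classifier" below picks the label of the most word-overlapping training example; the tiny labelled corpus is entirely invented for illustration, and real systems use vastly more data and far richer models.

```python
# A miniature train-then-decide loop: the "training data" is a handful of
# labelled sentences, and "deciding" means reusing the label of the most
# similar known example. All examples and labels are invented.

def bag_of_words(text):
    """Count each word in the text (a crude numeric representation)."""
    counts = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return counts

def overlap(a, b):
    """How many words two bag-of-words vectors share."""
    return sum(min(a.get(t, 0), b.get(t, 0)) for t in a)

def classify(text, training_data):
    """Decide a new case by the label of the most similar training example."""
    vec = bag_of_words(text)
    best = max(training_data, key=lambda item: overlap(vec, bag_of_words(item[0])))
    return best[1]

training_data = [
    ("helping a stranger carry heavy bags", "acceptable"),
    ("taking credit for a colleague's work", "unacceptable"),
]
print(classify("helping a lost stranger find the station", training_data))
# -> acceptable
```

The sketch also hints at the post's central worry: the "decision" is entirely determined by whatever patterns happen to be in the labelled data.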

The Moral Machine Experiment was a 2018 project in which philosophers and data scientists collected around 40 million responses from more than 200 countries to the ‘trolley problem’, perhaps the most famous of all thought experiments. Here, the decision-maker is tasked with making an ‘impossible’ decision in accordance with their own moral principles. In the basic situation, the respondent is driving a trolley when they notice that the vehicle’s brakes have failed. Approaching a switch, they realize that if they continue on the current track, they will collide with five people standing on it, who will die in the collision. By changing the switch, they can divert the vehicle to another track, but someone is standing on that track too, and the decision causes that person’s death. The question, in simple terms, is therefore: does the respondent not act, thus passively causing the death of five people, or do they act, thus ‘intentionally’ killing one person?[1]

The experiment highlighted, among other things, the difference in preferences between collectivist and individualist societies. It also revealed significant differences between respondents from Western and Eastern cultures regarding their tendency to be passive (not to act). The diversity of the responses and their country- and culture-specificity raise an important question: to what extent is it possible to create a stable basis for programming machine learning models capable of making ethical decisions by democratizing these decisions?

The situation is further complicated by the basic architecture of machine learning solutions. As mentioned above, they all learn the knowledge needed to solve their task from the data they receive. The example of the Moral Machine is a clear case where (if the answers were used directly as training data, for example in the case of self-driving cars judging traffic situations) a certain morality is already encoded in the training data from the start.

Current machine learning models, moreover, are inherently task-oriented, i.e., they are typically designed with a single goal in mind and are only capable of solving that one task (e.g., ChatGPT, which operates as a chatbot, is ‘just’ a state-of-the-art question answering system). The problem is that in most cases we have no insight into whether any moral value is present in the data used by the algorithm to generate the model that makes the decisions[2]. There are several examples of a model accumulating racist or sexist biases during training that were only discovered after deployment and use. A theoretical solution could be to further train existing models with data specifically representing the preferred moral values, although in practice this is often not feasible due to the technical characteristics of machine learning models. As described above, machine learning algorithms store the acquired knowledge in a model, which is responsible for making decisions once the learning phase has ended. Unfortunately, if we present new examples to an already existing, well-functioning model, the new training does not add to the ‘knowledge’ already present in it but overwrites it (catastrophic forgetting). As a result, the model, while absorbing the new (ethical) information, does not keep its original function but forgets its original purpose.
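Catastrophic forgetting can be demonstrated numerically. The minimal sketch below trains a classic perceptron on a toy task A, then continues training it on a deliberately conflicting task B; performance on task A collapses. The tasks and data points are invented purely for illustration.

```python
# Minimal illustration of catastrophic forgetting with a perceptron.
# Task A and task B use conflicting labelling conventions, so further
# training on B overwrites what the model learned for A.

def train(weights, data, epochs=10, lr=0.1):
    """Classic perceptron updates: nudge the weights on every mistake."""
    for _ in range(epochs):
        for x, label in data:
            pred = 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0
            error = label - pred
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    return weights

def accuracy(weights, data):
    correct = sum(
        (1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0) == label
        for x, label in data
    )
    return correct / len(data)

# Task A: label is 1 when the first feature is positive.
task_a = [((1.0, 0.5), 1), ((2.0, -1.0), 1), ((-1.0, 0.5), 0), ((-2.0, -0.5), 0)]
# Task B: the opposite convention -- its labels conflict with task A.
task_b = [((1.5, 0.0), 0), ((2.5, 1.0), 0), ((-1.5, 1.0), 1), ((-0.5, -1.0), 1)]

w = train([0.0, 0.0], task_a)
print(accuracy(w, task_a))   # -> 1.0 : the model masters task A

w = train(w, task_b)         # further training on task B only
print(accuracy(w, task_a))   # -> 0.0 : task A has been "forgotten"
```

The toy tasks conflict maximally, which exaggerates the effect, but the mechanism is the same one that makes naive "ethics top-up" training on a finished model unreliable.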

The solution, then, is clearly to filter and validate the data before the training phase, to ensure that the model’s output conforms to the expected ethical standards. Let us add that this is not a trivial task, given the sheer amount of training data, especially since the potential biases present in it are in many cases neither obvious nor easily traceable.
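As a rough illustration of what pre-training screening might look like, the sketch below flags examples containing terms from a blocklist for human review before they reach the training set. The blocklist entries are placeholders, not a real lexicon, and real bias auditing is far harder, since many biases are statistical rather than keyword-visible.

```python
# A deliberately naive sketch of pre-training data screening: route
# examples containing blocklisted terms to a human-review queue.
# The blocklist entries below are placeholders for illustration only.

REVIEW_TERMS = {"slur_1", "slur_2"}  # hypothetical entries, not a real lexicon

def screen(examples):
    """Split a corpus into approved examples and examples needing review."""
    approved, flagged = [], []
    for text in examples:
        tokens = set(text.lower().split())
        (flagged if tokens & REVIEW_TERMS else approved).append(text)
    return approved, flagged

corpus = ["a harmless sentence", "contains slur_1 somewhere"]
approved, flagged = screen(corpus)
print(len(approved), len(flagged))  # -> 1 1
```

Even this trivial filter shows why the task does not scale gracefully: every flagged example still needs human judgment, and nothing here catches biases that leave no lexical trace.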

The Moral Machine experiment provides important information for the development of future machines that interact with and move among people, but it also leaves at least as many questions unanswered. Perhaps the most important of these, as already touched upon, is whether the majority decision in determining moral values always correlates with the ethical decision. One need only think of WWII Germany to question this. We must also remember that the values of the modern Western world have been shaped for centuries by Christian culture, which is strongly reflected in our views on ethics today, but this cannot be seen as a cross-cultural universal.

Moreover, due to the data-driven nature of machine learning, AI researchers would need to express ethical considerations as highly explicit, measurable, and quantifiable values in order to encode them into the machine learning models of today and the near future. This would require at least the full agreement of human respondents (e.g., during data collection), which is not impossible, but again problematic. There is also no guarantee that a theoretical decision taken in a fictitious situation would be taken the same way in reality. Would a respondent who, in the trolley problem, would divert the trolley to save five people at the cost of one, do so in reality, knowing that their own active decision would cause a person’s death? It is also the case that the responses collected from respondents tend to reflect very specific situations, whereas in real life similar decisions may be required in a myriad of situations. Just as in law, in the world of machine learning it is not possible to describe every conceivable situation concretely in advance; generalization must therefore be the task of the model. The challenge in this context is to ensure this capability of the model while describing only a few concrete situations and specifying the expected result.

The increasing awareness and popularity of the problem is well illustrated by the fact that in recent years a dedicated research direction, AI alignment, has emerged. It aims to align the output of machine learning models with general human expectations, thus reducing the uncertainty that is still often present in the operation of ML solutions. Proponents of this school of thought generally argue that human decision-making is influenced by several factors arising from the actual context of the decision, such as individual experience, cultural norms, empathy, and personal values, and that it is this additional knowledge that allows us to distinguish between right and wrong. The aim is to ensure that machine learning models can take this knowledge into account in each situation.

It should be noted, however, that this approach assigns the solution mainly to developers, since they can exert the greatest influence on the output of the models. According to its opponents, this raises several concerns, the most important being that only a small group of people would then determine what is ethical, in sharp contrast to the democratized approach mentioned earlier. This is mainly a consequence of the fact that supporters of AI alignment treat the encoding of ethical principles into machine-learned solutions primarily as a technical problem, rather than a social, legal, or even philosophical one.

Another counterargument is that AI alignment seeks to address problems that can only be addressed in a meaningful way in the context of AGI. This implies that the time and money spent on AI alignment will divert valuable resources from solving societal problems that already exist today. These problems are mainly caused by the irresponsible use of currently applied AI solutions. It is also worth noting that the concept of AI alignment is, in most cases, very vaguely defined. This vagueness may be a consequence of the fact that it is still a very young research area, which is only at the stage of finding its way.

As AI becomes more deeply entwined with our future, ethical decision-making is the key to its success, and the review of current solutions is becoming increasingly urgent, even though the problem is only now beginning to receive attention. The current state of AI research suggests that the best way to ensure that the expected moral principles are met is careful selection of the training data and continuous, targeted quality assurance. Democratizing AI and increasing the transparency of its development are also essential parts of the solution. To make AI a truly life-enhancing and human-empowering technology, it is essential to prevent misuse and limit irresponsible applications, in which the legislator will have a key role to play.

[1] There are several variations of the problem, in which the age and health status of the people on the track, and even whether the respondent knows or is related to one of them, may vary. These are all factors that have been investigated in a number of experiments to see how they affect the respondent’s reaction to the problem.

[2] More on the process and main components of machine learning in our previous post.

István ÜVEGES is a researcher in Computer Linguistics at MONTANA Knowledge Management Ltd. and a researcher at the Centre for Social Sciences, Political and Legal Text Mining and Artificial Intelligence Laboratory (poltextLAB). His main interests include practical applications of Automation, Artificial Intelligence (Machine Learning), Legal Language (legalese) studies and the Plain Language Movement.

Gellért MAGONY: In Too Deep(fake) – Suggestions on How to Avoid Harmful Social Influences of Deepfakes

In March 2023, footage was shared on the Russian social networking site VKontakte of President Zelensky announcing that Ukraine would surrender to the Russian invasion. The video also started to spread on Facebook and YouTube. Shortly after it went viral, the real President denied it on his social media page, and it was quickly removed by the social media sites. The video was a low-quality deepfake: the President’s likeness looked unnatural, his face did not match his body, and his voice differed from that of the person the video was based on (the target). Thanks to this and to the President’s quick reaction, the fake footage did not cause much damage. This case is just one alarming example of the many instances of artificial intelligence-generated or manipulated content that have been released, often causing widespread panic.

But what is a deepfake? It is media content in which a person’s features are altered by artificial intelligence to make them look like someone else. The term is a combination of “deep learning” and “fake”. The creator, in other words, deliberately sets out to deceive the recipient of the content and does so using machine learning and artificial intelligence; not all audiovisual content manipulated by AI therefore falls into this category. AI-based manipulation is also used legitimately, for example in the film industry, where large amounts of voice recordings of an actor, living or deceased, are used to read out texts in their voice.

The technology has developed at a remarkable, arguably excessive, pace in recent years. Although its theoretical foundations were laid in an academic paper in the 1990s, it entered the public domain in 2017, when an anonymous user built a deepfake tool from existing algorithms and made it freely available on the internet. The initial videos were astonishing, but it was obvious to anyone that something was wrong with the footage, that it had been manipulated. The production time, which initially meant a week of post-production, was reduced to a few days, and soon software appeared that allowed real-time conversion. One of the best-known and most convincing deepfakes is the TikTok channel Deep Tom Cruise, created with an actor who looks a lot like Tom Cruise. The channel produces videos for entertainment, not deception. However, it draws our attention to the fact that a completely deceptive result can be created from the combination of an actor who resembles the target person to begin with and a great deal of sophisticated post-production work.

The following sections present what must be achieved, first and foremost, to ensure that the use of deepfake technologies does not lead to practices that adversely affect society and violate fundamental rights.

A major upgrade of deepfake-detecting technology is inevitable

The most “primitive” deepfake detection technique is naked-eye inspection. Low-quality video content can be spotted after close and thorough examination; the most common clues are unnatural eye and mouth movements, out-of-sync speech, unnatural facial twitches, uneven skin tone, and so on. However, these signs are only easy to notice if the viewer knows the technology exists. Someone with no knowledge of deepfakes will not even think to look for these telltale signs. In response to deepfake videos, artificial intelligence technologies have also emerged to detect them. Several paid and free services available on the internet can tell in a matter of seconds whether an uploaded video is a deepfake. Most of them are highly accurate and aggregate the results of several detection algorithms. The problem is therefore not so much the effectiveness of this software as the fact that the average user has no such tools at their disposal, or is not even aware of their existence. Filtering should therefore be implemented first and foremost by incorporating it into the screening mechanisms of social networking sites, addressing the problem at its root. In 2020, for example, Facebook announced a competition for detection algorithms in which competitors had to work with a database of over 100,000 samples. The most accurate algorithm was able to filter out fakes with 82% accuracy. Unfortunately, that figure is not reassuringly high, leaving a large margin for higher-quality fakes. Several other well-known social networking sites have also announced changes to their community guidelines: TikTok banned “synthetic media” in April this year, and fake videos may only be uploaded to the platform if they are clearly labeled “synthetic”, “fake”, “not real”, or “altered”.
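The aggregation step such services perform can be sketched as simple score averaging over several detectors. The detector names and scores below are invented for illustration; production systems use more sophisticated weighting and calibration.

```python
# Sketch of how a detection service might combine several detectors'
# outputs into a single verdict, using a plain average of per-detector
# fake-probabilities. All names and numbers are hypothetical.

def aggregate(scores, threshold=0.5):
    """Average the detectors' fake-probabilities; flag if above threshold."""
    mean = sum(scores.values()) / len(scores)
    return mean, mean >= threshold

detector_scores = {            # hypothetical outputs, 1.0 = certainly fake
    "artifact_model": 0.91,    # e.g., blending-artifact detector
    "lip_sync_model": 0.74,    # e.g., audio/visual sync detector
    "frequency_model": 0.62,   # e.g., spectral-fingerprint detector
}
mean, is_fake = aggregate(detector_scores)
print(round(mean, 2), is_fake)  # -> 0.76 True
```

Averaging several imperfect detectors tends to be more robust than trusting any single one, which is presumably why the services described above aggregate multiple algorithms.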

Deepfake-specific regulation is needed

The widespread use and increasing realism of deepfake technology require urgent and thorough legal regulation. It is not possible, nor would it be reasonable, to ban the technology outright, as it has many useful and creative uses: it is widely applied in the film industry and can also be used to produce entertaining, humorous videos that do not infringe the law. However, technology-specific regulation is needed. In the US, three states have passed legislation: Virginia focuses on pornographic deepfakes, while Texas and California focus on disinformation aimed at influencing election results. Several federal initiatives have also been introduced but have not yet been adopted. The United Kingdom has indicated its intention to introduce a law banning the unauthorized use of pornographic deepfakes. China has already enacted deepfake-specific regulation, currently the most detailed and stringent deepfake legislation in the world. The law prohibits anyone from creating deepfake content without the subject’s permission, and from depicting or saying anything that could be considered contrary to national interests. Anything contrary to socialist values is included, as is any form of “illegal and harmful information” or the use of AI-generated human imagery to deceive or defame. The law is part of China’s strategy to become a world leader in comprehensive technology regulation.

Special rules are needed for electoral procedures

Deepfake content can be particularly dangerous during election campaigns. The use of a deepfake video in the run-up to election day can bring in many votes for the party abusing it and can be particularly helpful in swaying undecided voters. Deepfakes are usually identified with video, but fake audio recordings enable an even more dangerous use, because it is harder to prove where and when a so-called ‘leaked’ audio recording was made. One can find several parodies made for entertainment that feature dialogue between current and former US presidents, and they are very convincing fakes. On this basis, more than enough audio recordings are available even of a politician who is less significant globally but crucial to the outcome of an election in a smaller country. And since the election campaigns of recent years, both at home and abroad, have shown that politicians are willing to use any unscrupulous means to win, there is no reason why they could not exploit the persuasive power of deepfake technology.

The three main changes that should be implemented – enhanced use of deepfake-detecting technology, deepfake-specific regulation, and special rules for electoral procedures – are essential to ensure that deepfakes do not have an irreversibly damaging impact on our society. There are risks involved in the use of this technology in several areas, such as data protection and privacy rights, the potential for manipulated content to have a detrimental impact on the political process and public opinion, and even the difficulty of proving the credibility of evidence used in legal proceedings. Appropriate legislation should therefore aim to set the framework for ethical use. For deepfakes, as for any emerging technology, it is important that the legal framework keeps pace with the rapid evolution of the technology.

Gellért MAGONY is a student at the Faculty of Law and Political Sciences of the University of Szeged, Hungary, and a scholarship student of the Aurum Foundation. His main area of interest is the relationship between the digital world and law. His previous research has focused on the relationship between social networking sites and freedom of expression and the statehood of metaverses. He is currently researching social influence through deepfakes.

István ÜVEGES: The Democratization of AI? Advancing Digital Sovereignty and Fostering Innovation

The EU’s strategic autonomy depends on technology, but most technology giants are not European-owned. This is a serious handicap that the EU is seeking to overcome by strengthening digital sovereignty. Democratizing AI and creating a regulatory environment that can support competitiveness is an important step in this direction. If the EU does not want to be left behind in the global race for AI, it is time to act now.

The emergence of Foundational Models and their subset, Large Language Models (LLMs), has revolutionized the world of Artificial Intelligence, along with the research projects and industrial applications that use AI, in the last few years. During this period, the development of neural network-based solutions has progressed at a pace that is in many respects exponential. This is testing both the adaptability of the technology’s users and the readiness of policymakers to deal with the legal and regulatory challenges posed by these new methods. From the EU perspective, the fact that the development of the latest solutions is largely taking place outside the EU is a major strategic disadvantage, as it significantly increases the EU’s exposure to market players operating outside its jurisdiction. In this post, we therefore briefly review:

  • the latest developments in Generative AI (GAI), the trend that most dominates the development of AI today;
  • its relation to the EU’s objectives of digital sovereignty;
  • and the role that the democratization of AI can play in this.

In Pierre Bellanger’s original definition, the concept of digital sovereignty refers to the ability of an actor to freely control its own digital data. This includes control over their entire digital environment, both the software and data they use and the hardware used for operational tasks. The concept is generally divided into two connected smaller units, the first being ‘data sovereignty’ and the second ‘technological sovereignty’. For the former, the critical points are the location of data storage and processing, the range of people who have access to the data, and the laws governing the storage and use of the data. For the latter, the place where the technology is deployed, the identity of the creator and operator, and the lawful use, or prohibitions on use, are important considerations.

These components are also particularly problematic for the EU because, based on current trends, a significant part of the modern Western world’s data is stored in the US. Additionally, the vast majority of AI innovation is also born there. The EU’s concept of digital sovereignty aims to create a viable and sustainable alternative to this kind of inequality. One means of doing this can be the strengthening of regulatory autonomy (one of the most iconic examples of which is the GDPR, in force since 2018). The EU wants to move towards digital sovereignty mainly by keeping data generated in Europe within the continent. This will be facilitated by the development of a single EU regulation and the promotion of the idea that data should be stored and processed primarily through European IT companies.

The primary objective of the above is to overcome the strategic, geopolitical, and cybersecurity risks and disadvantages caused by significant technological dependence on non-EU actors. At a time when the rapid spread of AI-based applications is also key to competitiveness, such a unilateral dependency could easily put the EU on the defensive, eroding its role from initiator to merely passive player, with a potentially negative impact on its global influence and decision-making autonomy. In addition, the lag in the development of Artificial Intelligence (even within the established Atlanticist perspective) represents a serious vulnerability and exposure.

There are several solutions to prevent such and similar disadvantages. One solution is a significant change of attitude towards the creation of an ‘entrepreneurial state’. Another solution is the democratization of AI. Both could benefit EU market players, even at the level of small and medium-sized enterprises, in the short term.

The concept of the ‘entrepreneurial state’ first came to prominence with the publication of Mariana Mazzucato’s The Entrepreneurial State: Debunking Public vs. Private Sector Myths in 2013. The main claim of the book is that the traditional view that the private sector is usually the driving force behind innovation and thus the most important source of experimental investment in successful economies is fundamentally wrong, or at least outdated. 

The author argues, for instance, that the most important factor behind the success of the US economy has been public and state-funded investment in innovation and technology. This contrasts with the view that the basis of a successful economy is minimal state involvement and the enhancement of the free market. This, of course, requires a move away from the perception of the state as a mere bureaucratic machine. Simultaneously, it necessitates a shift towards a policy where the state takes on a leading role as a risk-taker in investing in innovation. As a key conclusion, the author outlines a trend. This trend indicates that industrial actors have, in many cases, become involved in the development of a technology only after the state has started investing in it. An iconic example is the development of Google’s search algorithm, initially funded by the US National Science Foundation.

A similar change of perspective at the EU level could also act as an incentive: with the right planning, it could spark innovative technological and other initiatives at the EU or Member State level. Given that one of the EU’s current priorities is to create the most comprehensive regulatory framework possible for the development and use of AI-related technologies, the impact of regulation on actors operating in the EU will be a crucial issue for the future. More specifically, the key question may be whether this essentially forward-looking initiative will balloon into a kind of opaque, over-regulated market environment, or become a breeding ground for a whole range of ‘entrepreneurial’ EU states. It may give cause for concern that, in the context of digital sovereignty, related organizations (e.g., the European AI Alliance) regularly identify as priorities the provision of the necessary funding and significant hardware resources for AI development, as well as the attraction and retention of the necessary expertise within the EU.

The concept of digital sovereignty can also draw on another trend that has gained increasing international emphasis in recent years, namely the democratization of AI. In a previous post, we briefly discussed the areas where the ‘black box’ phenomenon poses a risk, for example in the case of Language Models. There we also noted that the three main components of virtually any machine learning project are the algorithm, the data used for training, and the resulting model. If any of these is not public, i.e., not freely available to anyone, then the solution should be considered (at least partially) a black box.

The counterpoint to this phenomenon is the initiative known as the democratization of AI, whose main goal is to make all the components mentioned above fully public and open-source. The advantage of open-source solutions is that they allow companies with limited expertise and resources to develop their own AI solutions; without open source, these companies would not be able to innovate in the same way.

A prime example is the development of Foundational Models, which currently require hardware resources that only a few technology giants in the world possess (Meta, Alphabet, OpenAI, etc.). The strength of these models is that, thanks to the huge amount of training data and the algorithms used in the training process, they can be used for very general and diverse tasks out of the box. However, for such a general model to be applicable to a more specialized domain (e.g., as a legal chatbot, or for extracting specific content, such as a summary, from legal texts), the model must be shown concrete, domain-specific examples. The number of these specialized examples is significantly smaller than the amount of data needed for pre-training; in many cases, even a few thousand hand-crafted training examples may be sufficient to fine-tune the base model for a more specific task. Thanks to this significantly reduced data requirement and the associated lower hardware requirements, such specialized models can be created even by small companies, but only if open-source Foundational Models are available in the first place.
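The economics of fine-tuning can be illustrated with a toy numerical sketch: start from "pre-trained" parameters (here a single slope/intercept pair standing in for billions of weights) and adapt them with a few gradient steps on a small domain dataset. All numbers below are invented for illustration; real fine-tuning applies the same continue-the-gradient-descent idea to a full neural network.

```python
# Toy sketch of fine-tuning: continue gradient descent from pre-trained
# parameters on a small, domain-specific dataset. The "model" is just a
# line y = a*x + b; the data and numbers are invented.

def mse(params, data):
    """Mean squared error of the linear model on a dataset."""
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in data) / len(data)

def fine_tune(params, data, steps=200, lr=0.05):
    """A few gradient-descent steps starting from the given parameters."""
    a, b = params
    for _ in range(steps):
        ga = sum(2 * (a * x + b - y) * x for x, y in data) / len(data)
        gb = sum(2 * (a * x + b - y) for x, y in data) / len(data)
        a, b = a - lr * ga, b - lr * gb
    return a, b

pretrained = (1.0, 0.0)                              # generic "foundation" behaviour
domain_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # specialist task: y = 2x + 1

before = mse(pretrained, domain_data)
after = mse(fine_tune(pretrained, domain_data), domain_data)
print(before > after)  # -> True: a handful of examples adapts the model
```

The point of the sketch is proportion: adapting existing parameters on three data points is cheap, whereas learning them from scratch at foundation-model scale is what requires the giants' hardware.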

A shift in the AI industry towards community development and open-source models could benefit not just the EU, but all businesses around the world that use or plan to use AI at some point in their business operations. This shift has the potential to significantly improve their competitiveness and productivity.

Against this background, perhaps the most important issue for the EU in the coming years is how to respond to the challenges posed by rapidly evolving technologies. There is a shift in the global market towards transparency and open-sourcing of AI (an excellent example is the initiative of Meta, which has recently released several fully open-source large language models, e.g., OPT-175B). However, such initiatives are meaningless if the regulation in place does not set proportionate limits on the use of new technologies. The protection of personal data, the prevention of electoral manipulation, and the prohibition of data collection without consent must inevitably be tightened in order to respect the right to privacy (all of which are also highlighted in the AI Act under negotiation). However, it should not be forgotten that new technologies must always have room for development. Companies need the freedom to experiment with them, as without such experimentation competitiveness will be lost, and there will be a risk of marginalization in the global economy.

On the path toward digital sovereignty, democratizing AI emerges as a powerful catalyst for progress. By embracing open-source solutions and striking a balance between regulation and innovation, the EU can foster strategic autonomy and unleash its potential as a leading global player in AI technology. The EU’s pursuit of digital sovereignty gains momentum as it focuses on self-reliance and competitiveness without monopolistic ownership. Empowered by a single market approach, the EU stands poised to become the driving force behind consumer-centric AI systems. The time to act is now, as the democratization of AI promises to transform challenges into opportunities, ensuring a brighter and more innovative future for the EU and beyond.

István ÜVEGES is a researcher in Computer Linguistics at MONTANA Knowledge Management Ltd. and a researcher at the Centre for Social Sciences, Political and Legal Text Mining and Artificial Intelligence Laboratory (poltextLAB). His main interests include practical applications of Automation, Artificial Intelligence (Machine Learning), Legal Language (legalese) studies and the Plain Language Movement.

Mónika MERCZ: Is it “I” or “AI”? – The legal questions of personality profiling by Artificial Intelligence

When we say the word “I”, it entails certain aspects of the self: our name, age, outlook on the world, as well as several other factors that shape who we are as a person. Oftentimes, we do not know the full extent of what this one syllable comprises. Despite not knowing ourselves fully, we must take into account several factors, such as the fact that we have a blind spot: a part of ourselves that we do not know, but others can easily see. Various dangers come with being blind to the perception of others, and with the introduction of artificial intelligence (AI), the discrepancy between what we see as “I” and what companies, the government, or any private person capable of using an AI can see has drifted apart significantly. This has devastating implications for privacy, as we could quite literally lose control over who knows what about us and to what extent. This is why I believe that talking about the dangers AI poses through the possibility of creating a personality profile is crucial in today’s day and age. Are there legal safeguards in place to help keep the individual opaque and the state transparent? Where is the line which separates us as individuals from us as simple pools of data, easily used and known?

The mortifying ordeal of being known is in the present case truly mortifying in the real sense of the word, as AI can build a personality profile from basically anything we do on the internet: tracking cookies collect information about us, our fingerprints can help certain software recreate our facial features, facial recognition has become widely used thanks to its role in unlocking our phones, and even at-home DNA testing has its dangers when it comes to our privacy. AI can also copy a person’s voice from just a few seconds of audio. Add to this the existence of deep fakes, and we have a technology which could look, sound and think like any person whose personality the user wishes to emulate. In order to ensure that horrible misuse of collected data does not come to pass, we must take action in regulation and enforcement alike.

Firstly, I would like to stress that the Hungarian Constitutional Court considered it imperative to prevent the introduction of a universal personal identification number as early as 1991. In Decision 15/1991 (IV.13.), the right to the protection of personal data first appeared; the Constitutional Court interpreted it not as a traditional protective right, but as an informational right to self-determination, with regard to its active aspect. Even in 1991, this meant that everyone has the right to decide about the disclosure and use of his or her personal data, and that approval by the person concerned is generally required to register and use personal data; the entire route of data processing and handling shall be made accessible to everyone, i.e., everyone has the right to know who uses his or her data, and when, where, and for what purpose. The principle of divided information systems and the prohibition of a single identifier were introduced to protect the citizen against the creation of a single “identity profile”. All of this comes full circle when we look at the GDPR, with its principles woven throughout our interpretation of what data protection is and why it is necessary. This is considered a landmark case in Hungary not just as a forerunner of the GDPR’s principles, but because it still shapes decisions made by the Hungarian Data Protection Authority. I would venture to say that its importance will come up again in the context of the new draft AI legislation, which would prohibit quite a few of the technologies that could be used for profiling.

The banning of personality profiling is a noble goal indeed. However, enforcement might pose a problem, as it would be extremely lucrative for private companies as well as useful for governments to know the ins and outs of people, and the possibility of paying a fine might not prove to be of much protection for us.

So, what are our options? As the possible consequences of personality profiling by AI are horrendous, stricter and stronger regulations are imperative. A profile made of individuals could cause serious societal problems, with individuals drifting apart from each other, losing their free will and identity. The Chinese system has shown us what some of the outcomes of a digital dictatorship are, and a more nefarious, subconscious approach, called “nudging” (behavioral change induced by outside influences), could be used to influence humans across the globe. The constant stream of content on platforms, the introduction of virtual reality and other technological advances all point towards a future where combining databases of information about certain individuals is not only possible, but also not very difficult. There have already been instances of AI’s influence culminating in unspeakable tragedy. For example, a man committed suicide because of his close relationship with and emotional reliance on a chatbot, Japan has workplaces where emotion recognition tools are already in use, and the Metaverse is full of possibilities to commit crimes.

In order to stop the negative effects of AI from spilling over into our everyday lives and, most importantly, to make profiling as scarce as possible, certain steps must be taken. The protection of the individual must be given priority through legislation at the level of the European Union and the Member States alike, and these pieces of law should be enforced to the highest possible degree. The fact that the draft AI legislation, in connection with prohibited Artificial Intelligence practices, only requires a “guarantee that natural persons are properly informed and have free choice not to be subject to profiling or other practices that might affect their behaviour” is – in my opinion – not enough to respond to this level of risk. Prohibition is only effective if there is significant power behind its enforcement. This is why the European framework for AI should also have a body working strictly not just on AI governance, but on profiling in particular. Dangerous and prohibited AI technologies deserve the highest level of attention that the EU can give.

Social programmes are also desperately needed to strengthen communities and families, so that negative societal effects might be mitigated. Additionally, education on the dangers and proper usage of AI and raising user awareness are key components of a successful transition into the next phase of our lives, where artificial intelligence lives with us. Strengthening data protection across the globe and having legislation put in place as a safeguard should be the goal of all countries. However, the perception of data protection itself varies from culture to culture and from legal system to legal system. Significant players such as India are not planning to adopt AI regulation, which also poses a problem for possible cooperation.

Because AI is a worldwide phenomenon and all countries could have a role to play in its development, a broader approach is needed than the European Union’s attempt at rectifying a frightening situation. Of course, it is a start, and we must celebrate all victories. However, the path towards reliable and safe AI is quite long. Personality profiles in particular deserve our attention, so that the “I”, the self of an individual, is at least protected from outside forces which would aim to influence it, use it, commercialise it or otherwise reveal it to the world.

What does the future hold for AI profiling? We will probably only get an answer once the draft regulation in the EU has been in place for a few years, or when we acknowledge that we have let the genie out of the bottle with the unleashing of AI (which operates as a black box, so we know little about its inner mechanisms). We must dare to look at the intricacies of our new world with an interdisciplinary approach, ask questions and hold companies accountable – or accept that hope is all we have left, which could usher in a less than favorable lack of the right to self-determination in the coming decades.

Mónika Mercz, JD, specialized in English legal translation, Professional Coordinator at the Public Law Center of Mathias Corvinus Collegium Foundation while completing a PhD in Law and Political Sciences at the Károli Gáspár University of the Reformed Church in Budapest, Hungary. Mónika’s past and present research focuses on constitutional identity in EU member states, data protection aspects of DNA testing, environment protection, children’s rights and Artificial Intelligence. Email:

István ÜVEGES: Europe’s AI Legislation Sparks ‘Black Box’ Debate: Unraveling Tech’s Gifts and Risks

As the European Parliament tackles AI regulation, the mysterious ‘black box’ phenomenon demands attention, raising concerns over AI’s opaque nature and potential misuse. Unraveling this enigma becomes crucial as it impacts powerful algorithms like Large Language Models, shedding light on transparency, accountability, and ethical use in our ever-changing digital landscape. How does European legislation address the ‘black box’ enigma? 

In the world of Artificial Intelligence, “black box” is a metaphorical term that refers to the fact that the way a system works is not necessarily transparent or fully understood by humans. To understand this phenomenon, let’s see the processes behind today’s most powerful algorithms (e.g. ChatGPT developed by OpenAI), whereby AI acquires some kind of “knowledge”.

Artificial Intelligence research intertwines various disciplines such as computer science, psychology, neuroscience, and more, encompassing diverse sub-fields like robotics, natural language processing, expert systems, and computer vision, making it a profoundly interdisciplinary and multifaceted endeavor.

The world of machine learning is, today, probably the most important of these, both in research and in related industrial development. Machine learning, viewed from a high level, consists of three vital elements: an algorithm, sample training data, and the resulting model. The algorithm’s prowess lies in learning new ‘knowledge’ from vast numbers of examples, uncovering imperceptible patterns in the data. As this ‘knowledge’ takes shape, it finds its home within the model, paving the way for AI’s transformative capabilities.
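These three elements can be illustrated with a deliberately tiny, hypothetical sketch: a gradient-descent algorithm, a handful of made-up training pairs, and a resulting “model” consisting of just two learned parameters.

```python
# A toy illustration of machine learning's three elements:
# an algorithm (gradient descent), training data (x, y pairs),
# and the resulting model (the learned parameters w and b).

training_data = [(1, 3), (2, 5), (3, 7), (4, 9)]  # hidden pattern: y = 2x + 1

w, b = 0.0, 0.0          # the model starts with no "knowledge"
learning_rate = 0.02

for _ in range(2000):    # the algorithm extracts the pattern from examples
    for x, y in training_data:
        error = (w * x + b) - y
        w -= learning_rate * error * x
        b -= learning_rate * error

# the "knowledge" now lives in the model's parameters
print(round(w, 2), round(b, 2))   # close to 2.0 and 1.0
print(round(w * 10 + b, 1))       # prediction for an unseen input, x = 10
```

The pattern `y = 2x + 1` was never stated anywhere; the algorithm recovered it purely from examples, which is exactly where the “knowledge” of any trained model comes from.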

A textbook example is how Large Language Models (LLMs) work. The impact of LLMs on the modern world is clearly illustrated by the GPT-4 language model on which ChatGPT is based, but also by the LLM behind Google’s recently announced (and intended to be market-leading) conversational AI tool, Bard. The most common task of LLMs is to collect, interpret and store information about human language and/or information that is present in human language in written form. Such information could be, for instance, the grammatical rules of the language, the parts of speech of individual words, the meaning of the words that make up the language, or the set of correct answers to questions, if the model is expected to work like a chatbot. The training data are huge text databases, and the result is a model (mentioned above) that can be further trained for a number of more specific tasks (such as sentiment analysis, question answering, etc.).

The opacity of black box models can stem from three sources. The first is the algorithm itself: imagine that it is not in the public domain, making its exact mechanism of operation unknowable to anyone except the developer. Even if this is the least likely scenario, it remains troubling. In the second case, the nature or source of the data used to train the model is unknown to the general public. An iconic example is the well-known ChatGPT, which, according to some sources, was trained on about 570 GB of text data, but the exact nature of this data is not public; the developer has not yet provided any precise information about it, nor is the data itself available on any public platform. In the third case, it is the model itself, the result of the process, that is not accessible. This, of course, makes it impossible to analyze or interpret the information stored in it.

In most cases, these decisions are based on some legitimate business interest of the developing company, such as maintaining a competitive advantage or the expectation of a return on investment. If a complete machine learning process can be easily reconstructed by other market players (a basic requirement for research projects, for example), such competitors can gain a significant advantage, for example by no longer having to develop their own solutions from scratch, saving considerable time and money.

In addition to the above, the term black box can also refer to another feature of machine learning, which is particularly important in the world of neural networks and deep learning. The three possibilities mentioned so far stem essentially from human intention, and so their solution is trivial in principle (even if difficult to implement and enforce in practice): making the individual components public. However, in many cases, the internal workings of the models themselves can become so complex that even if they are made public, it is questionable how the individual results arising from their use can be interpreted.

Again, taking the example of language models, and the neural networks that underpin most of them, the inner workings of such models are in many cases opaque even to experts. Models are generally given some input to which they produce an output in response. In image processing, for example, the input might be an image and the output a decision about what is shown in the image. In language models, the input may be a sentence and the output a decision about the emotional content of that sentence (sentiment analysis). A similar example in the field of law is the automatic generation of summaries of court decisions, or the automatic identification of their structural elements (case history, court decision, etc.). In many cases, the link between input and output is provided by hundreds of millions or even billions of parameters inside the model. These are connections between the neurons in the network, each with a numerical value, that the model uses to encode the information it learns and that help it make decisions. To understand how input is transformed into output, we would need to understand the relationship between these connections, the values they store, and the input-output pairs: an extremely difficult task (if only because of the number of parameters involved), and one intractable for human reason alone.
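A toy sketch (with made-up layer sizes and random weights) shows how even a miniature network routes input to output through parameters whose individual values carry no obvious meaning; real LLMs scale this same structure to billions of parameters.

```python
import random

random.seed(0)

# A miniature neural network: even at this scale, the "meaning" of any
# single weight is opaque; LLMs have billions of such parameters.
n_in, n_hidden, n_out = 4, 8, 2
w1 = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_in)]
w2 = [[random.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_hidden)]

def forward(x):
    # hidden layer with a ReLU non-linearity
    hidden = [max(0.0, sum(x[i] * w1[i][j] for i in range(n_in)))
              for j in range(n_hidden)]
    # output layer: a plain weighted sum
    return [sum(hidden[j] * w2[j][k] for j in range(n_hidden))
            for k in range(n_out)]

n_params = n_in * n_hidden + n_hidden * n_out
print(n_params)                        # 48 parameters already link input to output
print(forward([1.0, 0.5, -0.3, 0.8])) # the output: hard to trace back by hand
```

Even here, explaining *why* the two output numbers take their particular values requires reasoning over all 48 weights jointly; no single weight answers the question.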

If the background to the decisions of a machine learning model cannot be known due to deliberate decisions or simply due to some inherent characteristics of the technology, this can lead to a number of adverse consequences. 

The bias of AI refers to the phenomenon whereby, due to conscious development decisions, unnoticed errors, or even biases introduced into the data with malicious intent, the decisions given by the model tend to be disadvantageous for a particular group (e.g., minorities). To put it simply: if a bank uses a machine learning model to rate loan applications, the model may easily conclude that the return on loans extended in a particular area is low and that lending there is risky and should be avoided. If the majority of the population in the area belongs to some minority (e.g., ethnic) group, the model may generalize so that members of that group never receive a positive credit rating, because of an (even accidental) correlation in the training data. Real-life problems arising from AI bias are in many cases more nuanced than the above and very difficult to detect; if not handled effectively, they can cause significant harm, and cases such as the above may also violate the prohibition of discrimination.
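The credit-rating example can be sketched with entirely fictitious data: a naive model that scores applicants by their postcode’s historical repayment rate ends up rejecting everyone from one area, regardless of individual merit.

```python
# Hypothetical illustration of AI bias: historical repayment data in which
# postcode happens to correlate with membership of a minority group.
# (postcode, loan_repaid) pairs -- all data invented for the sketch.
history = [
    ("A", True), ("A", True), ("A", True), ("A", False),   # area A: 75% repaid
    ("B", False), ("B", False), ("B", True), ("B", False), # area B: 25% repaid
]

repaid = {}
for postcode, ok in history:
    repaid.setdefault(postcode, []).append(ok)

def approve(postcode):
    # A naive per-area rule learned from the data: approve only if the
    # area's historical repayment rate is at least 50%.
    outcomes = repaid[postcode]
    return sum(outcomes) / len(outcomes) >= 0.5

print(approve("A"))  # True  - every applicant from A is approved
print(approve("B"))  # False - every applicant from B is rejected,
                     # regardless of individual creditworthiness
```

The model never sees ethnicity at all; the postcode acts as a proxy, which is exactly why such bias is hard to detect by simply inspecting the input features.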

The lack of transparency and accountability is a particular problem for neural models, based on the unanalyzable relationship between input and output. For medical applications, it is of particular importance that, for example, in the case of a diagnosis, not only the outcome but also the chain of causes and conclusions leading to it are made available to experts. If the reasons for the model’s decisions are not known, the question of responsibility for erroneous decisions remains unclear.

Compliance with legal and regulatory requirements can be difficult when using black box AI systems. Some regulations, such as the GDPR, guarantee individuals the right to an explanation and require transparency in automated decision-making. However, if the inner workings of an AI system cannot be interpreted or explained, compliance with similar regulatory requirements may be simply impossible.

Targeting the challenges of this rapidly evolving technology, the forthcoming European regulation focuses on the classification of AI-based solutions into risk categories and the introduction of restrictions on the use of these classes. In an era of a data-driven economy, where the main value is the data collected from users, a looser regulatory environment is a competitive advantage, even if the associated moral and ethical concerns and the social harms of irresponsible use of technology may negate these initial benefits in the long run. The key issue is to find a balance where the regulatory environment makes large companies much more accountable than at present for the consequences of the technologies they use and the data they collect for them, but supports responsibly deployed AI-based solutions to increase competitiveness and harness the real benefits of such technologies. Given that AI is perhaps the fastest evolving field today, this distinction may become more difficult to make with each new technical advance.

István ÜVEGES is a researcher in Computer Linguistics at MONTANA Knowledge Management Ltd. and a researcher at the Centre for Social Sciences, Political and Legal Text Mining and Artificial Intelligence Laboratory (poltextLAB). His main interests include practical applications of Automation, Artificial Intelligence (Machine Learning), Legal Language (legalese) studies and the Plain Language Movement.

István ÜVEGES: Social and political implications of the use of artificial intelligence in social media

Artificial intelligence-based algorithms are now of inescapable importance in many fields. Their applications include automatic content recommendation systems for streaming providers, chatbots (e.g. ChatGPT), Google’s search interface, etc. The applications listed above are designed to help users make decisions, find information, or organize the vast amount of information available online to make it easier to find what they are looking for. In fact, many of the most popular online services are nowadays unthinkable without the use of artificial intelligence, since it makes navigating the vast amount of data in the online space efficient and accessible to all.

In addition to the above, however, other uses of digitized data can be envisaged, which are less obvious and are not necessarily aimed at satisfying the needs of the average user, but rather at serving market or political interests, even at the cost of a (conscious or unintentional) invasion of privacy.

In the world of artificial intelligence, and specifically in its subfield of machine learning, the quantity and quality of training data is a key factor. In the traditional sense, privacy in the online / digital space can be defined as private conversations, social media posts and information related to the individual. However, in addition to these, users leave behind several online footprints that are either not protected at all or are protected by inadequate means by the legal rules on privacy.

Examples include data sets such as browsing history, content viewed or ‘liked’, individual contact networks, geolocation data, etc. Until the last decade, this information existed mostly in isolation, on separate servers, under the ‘authority’ of different data controllers or collectors. However, from the point at which these data sources became interoperable (whether through the activities of data brokers or otherwise), they have given rise to a mass of data (mostly referred to as ‘big data’) which nowadays offers the possibility of psychological profiling of the source individual, micro-targeting of ads and content, or even the use of psychometric methods.

Unlike traditional information that people are basically aware of sharing (for example, uploading a photo), this data is often generated in ways that the user is not necessarily aware of. Nevertheless, by using it, machine learning algorithms can profile an individual far more effectively than before, (automatically) recognizing and attributing to a person values, party preferences, or other interests. Mapping groups thus formed (e.g., by unsupervised machine learning algorithms) back to the individual is the key to developing effective and automated opinion-forming techniques.
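As a rough, hypothetical illustration of such unsupervised grouping, a minimal k-means pass over invented ‘footprint’ vectors recovers behavioral groups without any labels; the resulting group-to-member mapping is exactly what targeting then exploits.

```python
import math

# Invented behavioral footprints: (political clicks, sports clicks) per user.
users = {"u1": (9, 1), "u2": (8, 2), "u3": (1, 9), "u4": (2, 8), "u5": (9, 2)}

# A minimal k-means pass with two clusters; starting centroids chosen by hand.
centroids = [(9, 1), (1, 9)]
for _ in range(5):
    # assignment step: each user joins the nearest centroid's cluster
    clusters = {0: [], 1: []}
    for name, v in users.items():
        c = min((0, 1), key=lambda i: math.dist(v, centroids[i]))
        clusters[c].append(name)
    # update step: move each centroid to the mean of its cluster
    centroids = [tuple(sum(users[n][d] for n in clusters[c]) / len(clusters[c])
                       for d in (0, 1)) for c in (0, 1)]

print(clusters)  # groups recovered purely from behavior, with no labels;
                 # mapping a group back to its members enables targeting
```

No user ever declared an interest; the structure emerged from the footprint data alone, which is precisely the step that turns raw logs into targetable audiences.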

The process by which data is “turned into gold” in the right hands[1], and the ways in which it can be used to serve business or policy interests is a multi-stakeholder process that involves a range of technological innovations, emerging trends, regulatory challenges, and perspectives.

In response to the insatiable demand for data from machine learning algorithms, there is now an entire industry dedicated to collecting and selling user data in the most efficient and detailed way possible. Given the rapid progress in both IT and artificial intelligence research, it is reasonable to assume that the problems we are already seeing (data leaks, manipulation, micro-targeting, psychometric profiling, etc.) will only get worse in the future without the right regulatory environment or may be replaced by new challenges that are not yet foreseen.

Among the (already existing) uses of artificial intelligence that are of concern, this paper presents some of the ways in which it can be used to influence election outcomes. The issue of political polarization in social media is also discussed in more detail.

Electoral manipulation

In modern democracies, weaponized / manipulative AI poses a serious threat to the fairness of elections, but also to democratic institutions more generally. In the case of elections, the outcome can be influenced in several ways, in line with the interests of a third party.

These attacks, carried out with artificial intelligence in the service of malicious, economic, or political interests, can take the form of “physical” attacks (such as the paralysis of critical infrastructures or data theft), or of psychological operations that poison voters’ trust in the electoral system or discredit certain public actors[2].

In the present context, micro-targeting refers to personalized messaging that has been fine-tuned based on previously collected data about a given user, such as an identified psychological profile. Messages targeted in this way are much more likely to influence or even manipulate opinion than traditional advertising techniques.

This is exemplified by the suspicious cases of abuse uncovered by the Mueller report[3] in the US in connection with the 2016 presidential election, one of the main arenas of which were social media platforms.

The heightened concern about such activities is illustrated by the fact that, following the introduction of the GDPR[4], several EU Member States have initiated investigations against companies involved in data collection. For example, the Irish Council for Civil Liberties (ICCL) report[5] raises serious concerns about the activities of Google and other large-scale operators whereby data collection companies auction information about users, linked to their real-time geolocation, to potential advertisers and then transmit the data packets to the ‘winning’ bidder (Real Time Bidding – RTB). In several of the cases studied, the data transmitted in this way included sensitive health characteristics such as diabetes, HIV status, brain tumors, sleep disorders and depression[6].

The report found that in some cases, Google’s RTB system forwarded users’ data packets (which may have included the above-mentioned sensitive data without filtering) hundreds of times a day. The value of the data, and the seriousness of the leak, is illustrated by the fact that (also according to the report) it was used by some market/political actors to influence the outcome of the 2019 Polish parliamentary elections.

According to the report, OnAudience used data from around 1.4 million Polish citizens to help target people with specific interests when displaying election-related ads. Although the company claims the data packets were processed and transmitted anonymously, they could still be uniquely linked to specific, real individuals. Moreover, these identifiers can be linked to the databases of other companies and thus merged into a single profile[7]. This implies market behavior that is threatening not only in terms of compliance with the GDPR, but also in terms of the violation of privacy rights.

Opinion bubbles and political polarization

In addition to the above, it is also significant that social media platforms, to maximize users’ time on the platform, typically present content that best matches the personality of the user, i.e., that is most likely to be of interest to them.

This kind of (AI-enabled) content pre-screening has highlighted two new and important problems in recent years. The first is the problem of the often false positive feedback generated by the homogeneity of the ranked content, and the second is the issue of political polarization often associated with it.

The former is driven by the phenomenon that social media platforms are making it possible for people to connect with others who share a similar worldview to their own on an unprecedented scale. This kind of social selectivity, coupled with the content filtering technologies[8] of the platforms, results in the creation of psychosocial bubbles that essentially limit the extent of possible social connections and interactions, as well as exposure to novel, even relevant information[9].

This phenomenon has been studied since the 2010s, mainly based on informatics and structural measures of online behavior and social networks[10]. Among later research, the Identity Bubble Reinforcement Model (IBRM)[11] stands out, with the dedicated aim of integrating the social psychological aspects of the problem and human motivation into the earlier results. According to this model, the expanded opportunities for communication and social networking in social media allow individuals to seek social interactions (mainly) with people who share and value their identity. This identity-driven use of social media platforms can ultimately lead to the creation of identity bubbles, which can manifest themselves in three main ways for the individual:

  • identification with online social networks (social identification),
  • a tendency to interact with like-minded people (homophily),
  • and a primary reliance on information from like-minded people on social media (information bias).

Within social media, these three elements are closely correlated and together reflect the process of reinforcing the identity bubble.


The data generated online can also be used to make predictions about users’ personality traits. One of the priority areas for these is psychometric use. This is closely related to the use of the online footprint (and its connection with the right to privacy and confidentiality) and is now also known as a possible technique for influencing voter opinion.

Psychometrics (also known as psychometry) is the field of psychology that deals with testing, measurement, and evaluation. More specifically, the field deals with the theory and techniques of psychological measurement, i.e., the quantification of knowledge, skills, attitudes, and personality traits. Its classical tests aim to measure, for instance, the general attitude of employees in a work environment, their emotional adaptability, and their key motivations, but the field also includes aptitude tests assessing success in mastering specific skills, as well as classical IQ tests[12].

In the context of social media, and big data in general, the concept came to the fore mainly in the context of the 2016 US presidential election, along with another technique, micro-targeting.

On this topic, the name of Cambridge Analytica is inescapable. The firm first received significant media attention in July 2015, shortly after it was hired by Republican presidential candidate Ted Cruz’s team to support his campaign[13]. Although the campaign was unsuccessful, Cambridge Analytica’s CEO claimed that the candidate’s popularity had increased dramatically thanks to the company’s use of aggregated voter data, personality profiles and personalized messaging / micro-targeting techniques. The firm may also have played a role, following a familiar scenario, in shaping the outcome of the Brexit campaign[14]. In 2016, it was also suspected that US President Donald Trump had hired the company to support his campaign against Hillary Clinton. In this context, there are reports that Cambridge Analytica employed data scientists who enabled the campaign team to identify nearly 20 million swing voters in states where the outcome of the election could have been influenced[15]. Winning voters in these states could ultimately and significantly boost Trump’s chances in key states, as well as in the general election[16].

The company also claims that one of the keys to its success has been the combination of traditional psychometric methods with the potential of big data. Its free personality tests, distributed on social media platforms, promised users more information about their own personality traits at no cost[17]. The data submitted could then be linked by Cambridge Analytica to the name of the submitter and a link to their profile[18].

The resulting data set (supplemented by other public and private user data) allowed the company to classify some 220 million US voters into 32 different personality types, which could then be targeted by the ads that most appealed to them[19].

Given the right amount of data, the method can also be run in reverse: by collecting, for users who never took the survey, the same kinds of data as for those who did, this data can be used as input for machine learning models that classify the previously unprofiled users into the personality groups mentioned above. Although the real success of Cambridge Analytica’s methods has never been clearly established, the moral, political and security concerns surrounding the company undoubtedly highlight both the potential of online footprint data and the ways in which it can be used that are legally unregulated or morally and ethically questionable.
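The “reverse” step described above can be sketched with a hypothetical nearest-neighbour classifier: the footprints of surveyed users serve as labeled examples, and unsurveyed users are then assigned to a personality group from their footprints alone (all names and numbers below are invented).

```python
import math

# Invented footprint vectors for users who took the personality survey,
# labeled with the personality group the survey assigned them to.
surveyed = {
    (0.9, 0.1, 0.2): "group_openness",
    (0.8, 0.2, 0.1): "group_openness",
    (0.1, 0.9, 0.8): "group_conscientious",
    (0.2, 0.8, 0.9): "group_conscientious",
}

def classify(footprint):
    # 1-nearest-neighbour: an unsurveyed user inherits the label of the
    # most similar surveyed user's footprint.
    nearest = min(surveyed, key=lambda v: math.dist(v, footprint))
    return surveyed[nearest]

# An unsurveyed user's footprint alone is enough to place them in a group:
print(classify((0.85, 0.15, 0.2)))   # group_openness
```

The survey thus only needs to reach a fraction of the population; the model extrapolates the labels to everyone else whose footprint data is available.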

Taken together, the above illustrates the potential of the ever-increasing amount of data available on the internet. However, the so-called ‘data-driven economic model’ (in which the primary source of profit is not industrial production but people’s attention) is not yet fully developed; the ethical and legal concerns already raised highlight the risks of the further proliferation and refinement of AI-based technologies, while leaving many questions unanswered.

Initiatives to tackle these problems are already under way. For example, the European Union’s efforts to achieve digital sovereignty[20] seek to respond to the uneven global distribution of artificial intelligence capacities (research, infrastructure), which currently works to the Union’s detriment. The adoption of the GDPR marked significant progress on the processing and use of personal data, but (as the above-mentioned report of the Irish Council for Civil Liberties reveals) it is far from clear what, in practice, is an effective and appropriate way forward on issues that are not currently regulated, or how abuses can be detected.

Given that the function of law is primarily to respond to social and technological changes that have already occurred by fine-tuning the regulatory environment, a comprehensive study of the problems related to AI from a legal perspective is also essential.

Another issue, not discussed in detail in this article but of particular importance, is the tension inherent in AI-based capacities being concentrated in the hands of the state. Such capacities can be used both to defend liberal democracies and to build authoritarian (and/or surveillance) states, as the People’s Republic of China has done, for instance, by introducing a ‘social credit system’.[21]

Having examined the issues involved, perhaps the most important conclusion is the need to improve the regulation of artificial intelligence, to update it to meet the challenges of the times, and to develop cyber defense procedures that can detect, predict and, where possible, prevent manipulative techniques that rely on artificial intelligence.

[1] The quote refers to a common saying, especially in the United States, which emphasises the data-based dimension of economic growth: ‘Data is the new gold.’ (E.g., Rachel Nyswander Thomas: Data is the New Gold: Marketing and Innovation in the New Economy. Accessed: 12. 22. 2022.)

[2] In addition, artificial intelligence can be used to amplify the effects of efforts to distort election results, such as gerrymandering, which are not directly relevant to the topic of this paper. Cf. Manheim, Karl – Kaplan, Lyric: Artificial intelligence: Risks to privacy and democracy. Yale JL & Tech. 21, 2019, p. 133-135.

[3] Robert S. Mueller, III: Report on the Investigation Into Russian Interference in the 2016 Presidential Election. (Accessed: 12. 19. 2022.)

[4] (EU) 2016/679

[5] Ryan, Johnny: Two years of DPC inaction on the ongoing RTB data breach – Submission to the Irish Data Protection Commission (21 September 2020).

[6] Ibid. 6-7.

[7] Ibid. 5.

[8] For example, ranking content in the newsfeed according to relevance and interests.

[9] Kaakinen, Markus – Sirola, Anu – Savolainen, Iina – Oksanen, Atte: Shared identity and shared information in social media: development and validation of the identity bubble reinforcement scale. Media Psychology, 23:1, 25-51, 2020, p. 25-26.

[10] Pariser, Eli: The filter bubble: What the Internet is hiding from you. London, England: Penguin, 2011

[11] Zollo, Fabiana – Bessi, Alessandro – Del Vicario, Michela – Scala, Antonio – Caldarelli, Guido – Shekhtman, Louis – Quattrociocchi, Walter: Debunking in a world of tribes. PloS ONE, 12(7), 2017

[12] Krysten Godfrey Maddocks: What is Psychometrics? How Assessments Help Make Hiring Decisions. (Accessed: 12. 22. 2022.)

[13] Vogel, Kenneth P. – Parti, Tarini: Cruz partners with donor’s ‘psychographic’ firm. (Accessed: 12. 22. 2022.)

[14] Doward, Jamie – Gibbs, Alice: Did Cambridge Analytica influence the Brexit vote and the US election? (Accessed: 12. 22. 2022.)

[15] Blakely, Rhys: Data scientists target 20 million new voters for Trump. (Accessed: 12. 22. 2022.)

[16] González, Roberto J.: Hacking the citizenry?: Personality profiling, ‘big data’ and the election of Donald Trump. Anthropology Today 33.3, 2017, p. 9-12.

[17] The results could be evaluated according to the Big Five personality model, a long-established, fundamental concept in personality psychology research about the classification of an individual’s personality traits into factor groups. These main traits are extraversion, friendliness, conscientiousness, emotional stability, and culture/intellect.

[18] Harry Davies: Ted Cruz using firm that harvested data on millions of unwitting Facebook users. (Accessed: 12. 22. 2022.)

[19] Confessore, Nicholas – Hakim, Danny: Data Firm Says ‘Secret Sauce’ Aided Trump; Many Scoff. (Accessed: 12. 22. 2022.)

[20] EPRS Ideas Paper – Towards a more resilient EU: Digital sovereignty for Europe. (Accessed: 12. 23. 2022.)

[21] Nicholas Wright: How Artificial Intelligence Will Reshape the Global Order – The Coming Competition Between Digital Authoritarianism and Liberal Democracy. (Accessed: 12. 23. 2022.)

István Üveges is a researcher in Computer Linguistics at MONTANA Knowledge Management Ltd. and a researcher at the Centre for Social Sciences, Political and Legal Text Mining and Artificial Intelligence Laboratory (poltextLAB). His main interests include practical applications of Automation, Artificial Intelligence (Machine Learning), Legal Language (legalese) studies and the Plain Language Movement.